DeLFT Documentation

DeLFT (Deep Learning Framework for Text) is a Keras and TensorFlow framework for text processing, focusing on sequence labelling (named-entity tagging, information extraction, document-structure tagging) and text classification (e.g. comment classification, citation classification). It re-implements standard state-of-the-art deep-learning architectures — both classical RNN/CNN models and transformer-based models loaded via HuggingFace — under a single API.

DeLFT is designed around three goals: covering rich text (tokens with layout / structural features, not just plain sentences), reproducibility and benchmarking under comparable evaluation criteria, and production-level performance and integration. A native Java integration of the library is available in GROBID via JEP.

The current release line is 0.4.x, tested with Python 3.10/3.11 and TensorFlow 2.17. See Introduction for the full feature overview.

The full navigation is available in the sidebar on the left.