DeLFT Documentation

DeLFT (Deep Learning Framework for Text) is a Keras and TensorFlow framework for text processing, focusing on sequence labelling (named-entity tagging, information extraction, document-structure tagging) and text classification (e.g. comment classification, citation classification). It re-implements standard state-of-the-art deep-learning architectures — both classical RNN/CNN models and transformer-based models loaded via HuggingFace — under a single API.

DeLFT is designed around three goals: support for rich text (tokens with layout and structural features, not just plain sentences), reproducibility and benchmarking under comparable evaluation criteria, and production-level performance and integration. GROBID integrates the library natively from Java via JEP (Java Embedded Python).

The current release line is 0.4.x, tested with Python 3.10/3.11 and TensorFlow 2.17. See Introduction for the full feature overview, or jump straight to:

Install DeLFT — get a working environment in a few commands.
Embeddings — how DeLFT manages static word embeddings via LMDB.
NER, GROBID models, Snippet classification — ready-to-use applications and reproducibility tables.
Sequence Labeling and Text Classification — supported architectures and how to add your own.

The full navigation is available in the sidebar on the left.