Release 0.13.0 · flairNLP/flair

This release adds several major new features: (1) faster and more memory-efficient transformer training, (2) a new plugin system for custom logging and training, (3) new online API docs (still in beta), and (4) various new models, datasets, bug fixes and enhancements. This release also increases the minimum requirement to Python 3.8!

New Feature: Faster and more memory-efficient transformer training

This release integrates @helpmefindaname's transformer-smaller-training-vocab into the ModelTrainer. It temporarily reduces a transformer's vocabulary to only the tokens that occur in the training dataset and restores the full vocabulary after training. Depending on the dataset, this can save a large amount of GPU memory and speed up fine-tuning considerably.
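To make the idea concrete, here is a minimal, library-free sketch of vocabulary reduction on a toy embedding matrix. It is an illustration under stated assumptions, not the implementation used by transformer-smaller-training-vocab: the helper names `reduce_vocab` and `restore_vocab` are hypothetical, and plain Python lists stand in for the real embedding tensors.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def reduce_vocab(
    embeddings: Matrix,
    corpus_token_ids: List[List[int]],
    special_ids: List[int],
) -> Tuple[Matrix, Dict[int, int]]:
    """Keep only the embedding rows of tokens seen in the corpus (plus special tokens)."""
    used = sorted(set(special_ids) | {i for sentence in corpus_token_ids for i in sentence})
    # Map every surviving old token id to its position in the smaller matrix.
    old_to_new = {old: new for new, old in enumerate(used)}
    reduced = [embeddings[old][:] for old in used]
    return reduced, old_to_new


def restore_vocab(full: Matrix, reduced: Matrix, old_to_new: Dict[int, int]) -> Matrix:
    """Copy the (trained) reduced rows back into a copy of the full matrix."""
    restored = [row[:] for row in full]
    for old, new in old_to_new.items():
        restored[old] = reduced[new][:]
    return restored


# Toy example: a 6-token vocabulary with 2-dimensional embeddings (hypothetical data).
full = [[float(i), float(i)] for i in range(6)]
corpus = [[2, 4], [4, 5]]  # token ids that actually occur in the training data
reduced, mapping = reduce_vocab(full, corpus, special_ids=[0])
```

Only the rows in `reduced` would take part in training, which is where the memory saving comes from; `restore_vocab` then writes them back so the model keeps its full vocabulary afterwards.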

To use this feature, simply add the flag reduce_transformer_vocab=True to the fine_tune method. For example, to fine-tune a distilbert model on TREC_6, run this code (step 7 has the flag to reduce the vocabulary):

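from flair.data import Corpus
from flair.datasets import TREC_6
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer

# 1. get the corpus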
corpus: Corpus = TREC_6()

# 2. what label do we want to predict?
label_type = "question_class"

# 3. create the label dictionary
label_dict = corpus.make_label_dictionary(label_type=label_type)

# 4. initialize transformer document embeddings (many models are available)
document_embeddings = TransformerDocumentEmbeddings("distilbert-base-uncased", fine_tune=True)

# 5. create the text classifier
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict, label_type=label_type)

# 6. initialize trainer
trainer = ModelTrainer(classifier, corpus)

# 7. fine-tune the model, but **reduce the vocabulary** for faster training
trainer.fine_tune(
    "resources/taggers/question-classification-with-transformer",
    reduce_transformer_vocab=True,  # set this to False for slow version
)

Involved PR: add reduce transformer vocab plugin by @helpmefindaname in #3217

New Feature: Trainer Plugins

A new plugin system was added to the ModelTrainer. It offers many more ways to customize the training cycle and slims down the ModelTrainer code somewhat. In particular, logging can now be customized to a far greater degree, and third-party logging tools can be integrated.

For example, to integrate ClearML logging into the above script, instantiate the plugin and attach it to the trainer:

[...]

# 6. initialize trainer
trainer = ModelTrainer(classifier, corpus)

# NEW: instantiate a special logger and attach it to the trainer before the training run
ClearmlLoggerPlugin(clearml.Task.init(project_name="test", task_name="test")).attach_to(trainer)

# 7. fine-tune the model, but **reduce the vocabulary** for faster training
trainer.fine_tune(
    "resources/taggers/question-classification-with-transformer",
    reduce_transformer_vocab=True,  # set this to False for slow version
)
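The attach-then-train pattern above can be illustrated with a small self-contained sketch of a hook-based plugin system. This is not Flair's implementation: `ToyTrainer`, `LossLoggerPlugin`, and the event name `after_epoch` are hypothetical stand-ins chosen for illustration, and only the `attach_to` call shape is borrowed from the example above.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List, Tuple


class ToyTrainer:
    """Hypothetical trainer that dispatches named events to registered callbacks."""

    def __init__(self) -> None:
        self._hooks: DefaultDict[str, List[Callable[..., None]]] = defaultdict(list)

    def register(self, event: str, callback: Callable[..., None]) -> None:
        self._hooks[event].append(callback)

    def dispatch(self, event: str, **kwargs: Any) -> None:
        for callback in self._hooks[event]:
            callback(**kwargs)

    def train(self, max_epochs: int) -> None:
        for epoch in range(1, max_epochs + 1):
            loss = 1.0 / epoch  # placeholder for a real training step
            self.dispatch("after_epoch", epoch=epoch, loss=loss)


class LossLoggerPlugin:
    """Hypothetical plugin that records the loss after every epoch."""

    def __init__(self) -> None:
        self.records: List[Tuple[int, float]] = []

    def attach_to(self, trainer: ToyTrainer) -> None:
        trainer.register("after_epoch", self.after_epoch)

    def after_epoch(self, epoch: int, loss: float) -> None:
        self.records.append((epoch, loss))


toy_trainer = ToyTrainer()
loss_logger = LossLoggerPlugin()
loss_logger.attach_to(toy_trainer)
toy_trainer.train(max_epochs=3)
```

The point of such a design is that the training loop only emits events; logging, scheduling, or checkpointing logic lives in separate objects that can be added or left out without touching the loop.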

Involved PRs:

Proposal: Pluggable ModelTrainer train function by @plonerma in #3084
Major refactoring of ModelTrainer by @alanakbik in #3182
Allow users to use no scheduler and use a custom scheduling plugin by @plonerma in #3200
Don't pickle classes & plugins in modelcard by @helpmefindaname in #3325
Clearml logger by @helpmefindaname in #3259
Add a convenience conversion for flair.device by @alanakbik in #3350

API Docs and other documentation

We are working towards improving our documentation. A first step was the release of our tutorial page. Now, we are adding (in beta) online API docs to make navigating the code and options offered by Flair easier. To enable it, we changed all docstrings to Google docstrings. However, this process is still ongoing, so expect the API docs to improve in coming versions of Flair.

You can find the API docs here: https://flairnlp.github.io/flair/master/api/index.html

Involved PRs:

Creating a doc page with autodocs by @helpmefindaname in #3273
Google doc strings by @helpmefindaname in #3164
Add redirects to old tutorials by @alanakbik in #3211
Add some more documentation and (rather empty) glossary page by @helpmefindaname in #3339
Update README.md by @eltociear in #3241
Fix embedding finetuning tutorial by @helpmefindaname in #3301
Fix build doc page action trigger by @helpmefindaname in #3319
Reduce gh-actions diskspace by @helpmefindaname in #3327
Orange secondary color by @helpmefindaname in #3321
Bump Flair and Python versions by @alanakbik in #3355

Model Refactorings

In an effort to unify class names, we now offer models that inherit from DefaultClassifier for each label type we predict, i.e.:

TokenClassifier for predicting Token labels
TextPairClassifier for predicting TextPair labels
RelationClassifier for predicting Relation labels
SpanClassifier for predicting Span labels
TextClassifier for predicting Sentence labels

An advantage of this structure is that most functionality (such as new decoders) needs to be implemented only once in DefaultClassifier and is then immediately usable in all model classes.

To enable this, we renamed and extended WordTagger as TokenClassifier, and renamed EntityLinker to SpanClassifier. This is not a breaking change yet, as the old names are still available. In a future release, however, WordTagger and EntityLinker will be removed.

Involved PRs:

TokenClassifier model by @alanakbik in #3203
Rename EntityLinker and remove some legacy embeddings by @alanakbik in #3295

New Models

We also add two new model classes: (1) a TextPairRegressor for regression tasks on pairs of sentences (such as STS-B), and (2) an experimental Label Encoder method for few-shot classification.

Involved PRs:

Add TextPair regression model by @plonerma in #3202
Add dual encoder by @whoisjones in #3208
Adapt LabelVerbalizer so that it also works for non-BIOES span labels by @alanakbik in #3231

New Datasets

Integrate BigBio NER data sets into HunFlair by @mariosaenger in #3146
Add datasets STS-B and SST-2 to flair by @plonerma in #3201
Extend German LER Dataset by @stefan-it in #3288
Add support for MasakhaPOS Dataset by @stefan-it in #3247
Gh3275: sample_missing_splits in SST-2 by @plonerma in #3276
Add German MobIE NER Dataset by @stefan-it in #3351

Build Process

Use ruff instead of flake8 and isort by @Lingepumpe in #3213
Update mypy by @Lingepumpe in #3210
Use poetry instead of pipenv for developer/testing by @Lingepumpe in #3214
Remove poetry by @helpmefindaname in #3258

Bug Fixes

Fix serialization of config in transformers by @helpmefindaname in #3178
Add stacklevel to log_line in order to display correct file and line number (backwards compatible) by @plonerma in #3175
Fix tars loading by @helpmefindaname in #3212
Fix best epoch score update by @lephong in #3220
Fix loading of (not so) old models by @helpmefindaname in #3229
Fix false warning for "An empty Sentence was created!" by @AbdiHaryadi in #3268
Fix bug with sentences that do not contain a single valid transformer token by @helpmefindaname in #3230
Fix loading of old models by @helpmefindaname in #3228
Fix multiple arguments destination by @helpmefindaname in #3272
Support transformers 4310 by @helpmefindaname in #3289
Fix import error by @helpmefindaname in #3336

Enhancements

Bump min version to 3.8 by @helpmefindaname in #3297
Use torch native amp by @helpmefindaname in #3128
Unpin gdown dependency by @helpmefindaname in #3176
get_spans_from_bio: Start new span for previous S- if class also changed by @Lingepumpe in #3195
Include flair/py.typed and requirements.txt in source distribution by @dobbersc in #3206
Better tars inference by @helpmefindaname in #3222
prevent fasttext embeddings to be stored separately by @helpmefindaname in #3293
recreate to_dict and add relations by @helpmefindaname in #3271
github: bug report description should be textarea by @stefan-it in #3181
Making gradient clipping optional & max gradient norm variable by @plonerma in #3240
Save final model only if save_final_model is True (even if the training is interrupted) by @plonerma in #3251
Fix inconsistency between best path and scores in ViterbiDecoder by @mauryaland in #3189
Add action to remove Awaiting Response label when a response was made by @helpmefindaname in #3300
Add onnx session config by @helpmefindaname in #3302
Feature jsonldataset metadata by @helpmefindaname in #3349

Breaking Changes

Removing the following legacy embeddings, as their support was dropped long ago:
- XLNetEmbeddings
- XLMEmbeddings
- OpenAIGPTEmbeddings
- OpenAIGPT2Embeddings
- RoBERTaEmbeddings
- CamembertEmbeddings
- XLMRobertaEmbeddings
- BertEmbeddings
  You can use TransformerWordEmbeddings or TransformerDocumentEmbeddings instead.
Removing ELMoTransformerEmbeddings as allennlp is no longer maintained.
Removal of the flair.hyperparameter module: We recommend using the hyperparameter optimizer of your choice as an external module. For example, flair models can be fine-tuned with the Hugging Face AutoTrain SpaceRunner.
Removal of the trainer.resume(...) functionality. Similarly to the flair.hyperparameter module, this functionality was dropped due to the trainer rework.
Changes to the trainer.train(...) and trainer.fine_tune(...) parameters:
- monitor_train: bool was replaced by monitor_train_sample: float. This allows you to specify the fraction of training data points used for monitoring (setting monitor_train_sample=1.0 is equivalent to the previous behaviour of monitor_train=True).
- eval_on_train_fraction is removed in favour of monitor_train_sample (see monitor_train above).
- eval_on_train_shuffle is removed.
- anneal_with_prestarts and batch_growth_annealing have been removed.
- num_workers has been removed. A single worker is now always used for data loading, as it is the fastest option for in-memory datasets.
- checkpoint has been removed as parameter. You can use the CheckpointPlugin for the same behaviour.
- cycle_momentum has been removed, as schedulers have been moved to Plugins.
- param_selection_mode has been removed, like the hyperparameter optimization module.
- optimizer_state_dict and scheduler_state_dict were removed as part of the resume functionality.
- anneal_against_dev_loss has been dropped, as annealing now always goes against the metric specified by main_evaluation_metric.
- use_swa has been removed
- use_tensorboard, tensorboard_comment, tensorboard_log_dir and metrics_for_tensorboard are removed in favour of the TensorboardLogger plugin.
- amp_opt_level is removed, as we moved to the torch native AMP integration.
WordTagger has been deprecated, as it was renamed to TokenClassifier.
EntityLinker has been deprecated, as it was renamed to SpanClassifier.
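
When migrating an existing training script, the trainer.train(...) and trainer.fine_tune(...) parameter changes listed above can be applied mechanically. The following is a minimal sketch of that, not part of Flair: the helper `migrate_train_kwargs` is hypothetical, and it assumes that only a positive float `eval_on_train_fraction` carries over to `monitor_train_sample`.

```python
from typing import Any, Dict, List, Tuple

# Parameters that the notes above list as removed without a direct replacement argument.
REMOVED = {
    "eval_on_train_shuffle", "anneal_with_prestarts", "batch_growth_annealing",
    "num_workers", "checkpoint", "cycle_momentum", "param_selection_mode",
    "optimizer_state_dict", "scheduler_state_dict", "anneal_against_dev_loss",
    "use_swa", "use_tensorboard", "tensorboard_comment", "tensorboard_log_dir",
    "metrics_for_tensorboard", "amp_opt_level",
}
REPLACED = {"monitor_train", "eval_on_train_fraction"}


def migrate_train_kwargs(old: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return (new keyword arguments, sorted names of dropped parameters)."""
    new = {k: v for k, v in old.items() if k not in REMOVED and k not in REPLACED}
    dropped = sorted(k for k in old if k in REMOVED)
    # monitor_train=True corresponds to monitoring on all training data points.
    if old.get("monitor_train"):
        new["monitor_train_sample"] = 1.0
    # Assumption: a positive float fraction maps directly onto monitor_train_sample.
    fraction = old.get("eval_on_train_fraction")
    if isinstance(fraction, float) and fraction > 0:
        new["monitor_train_sample"] = fraction
    return new, dropped


old_kwargs = {"learning_rate": 5e-5, "monitor_train": True, "num_workers": 4, "use_tensorboard": True}
new_kwargs, dropped = migrate_train_kwargs(old_kwargs)
```

Dropped parameters such as `checkpoint` or `use_tensorboard` are only reported here; restoring their behaviour means attaching the corresponding plugin (CheckpointPlugin, TensorboardLogger) to the trainer.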

New Contributors

@lephong made their first contribution in #3220
@AbdiHaryadi made their first contribution in #3268
@eltociear made their first contribution in #3241

Full Changelog: v0.12.2...v0.13.0
