This repository has been archived by the owner on Aug 9, 2023. It is now read-only.

Releases: wellcometrust/WellcomeML

v2021.2.0

05 Feb 14:29
v2021.2.0 Pre-release

Major changes

  • Upgrade spaCy to v3.0
  • Add native HuggingFace support (#191), rewriting BertClassifier using transformers
  • Disable HDBSCAN as a clustering technique because of a conflict with the new numpy version (#197)

Bug fixes

  • Resolve issues #195 and #198 with the new pip dependency resolver, which became the default in pip 20.3

v2021.1.0

19 Jan 13:13
3fc07c3
v2021.1.0 Pre-release
Merge pull request #183 from wellcometrust/feature/nsorros/upgrade-sp…

v2020.11.1

24 Nov 10:26
7f98e75
v2020.11.1 Pre-release
Merge pull request #180 from wellcometrust/fix-dataclasses-version

Pin dataclasses to 0.6 as required by spacy-transformers

v2020.11.0

18 Nov 15:37
adeecc5
v2020.11.0 Pre-release
Merge pull request #175 from wellcometrust/feature/bert_vectorizer_pr…

v2020.9.0

07 Sep 11:00
72b3350
v2020.9.0 Pre-release

Models

  • Add the parameters l2, dense_size, attention_heads, metrics, callbacks and feature_approach to the CNN and BiLSTM classifiers
  • Faster predictions in BertClassifier through the use of spaCy's pipe
  • Adds TextClustering

v2020.7.1

30 Jul 10:38
1daa29d
v2020.7.1 Pre-release

Bugs

  • Fixes a PyPI dependency conflict by pinning dependencies.

v2020.7.0

09 Jul 14:07
3a8097b
v2020.7.0 Pre-release

Models

  • Adds Doc2VecVectorizer
  • Adds WellcomeVotingClassifier
  • Adds Sent2VecVectorizer
  • Adds SemanticEquivalenceMetaClassifier
  • Adds CategoricalMetrics and MetricMiniBatchHistory

Datasets

  • Adds CoNLL dataset
  • Adds WiNER dataset

Features

  • Automatically loads models such as en_core_web_sm and en_trf_bertbaseuncased_lg, and downloads packages such as sent2vec, only when needed
  • Adds docs based on Sphinx and Read the Docs
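
The "only when needed" loading described in the list above can be pictured with a small sketch. This is not WellcomeML's actual implementation: `load_when_needed` is a hypothetical helper, and the standard-library `json` module stands in for a heavy model or an optional package such as sent2vec.

```python
import importlib
from functools import lru_cache


@lru_cache(maxsize=None)
def load_when_needed(module_name):
    """Import a module on first use and cache it for later calls.

    Hypothetical helper for illustration only; not part of WellcomeML.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # A real implementation might download or install the package here;
        # this sketch only reports the problem.
        raise RuntimeError(
            f"{module_name} is not installed; install it before use"
        ) from err


# Nothing is imported until the first call; the second call hits the cache.
first = load_when_needed("json")
second = load_when_needed("json")
```

Deferring the import this way keeps the cost of optional dependencies out of the package's own import time, which is the usual motivation for the pattern.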

Repo

  • Adds PEP 8 / flake8 checks and addresses violations
  • Adds badges for build, codecov and license
  • Adds pull request template that forces a link to an issue or Trello card

Bugs

  • Fix dependency on non-PyPI packages for tests
  • Pin spacy-transformers to 0.5.1
  • Fix codecov running in a separate Travis venv

v2020.5.1

20 May 17:22
912ab25
v2020.5.1 Pre-release

ML

  • Add l2 and validation_split to BertClassifier
  • Add kwargs and load model to semantic similarity

Extras

  • Add command line interface to download models
  • Removed models from deployment
  • Moved package to PyPI

Pre-release v2020.5.0

13 May 12:03
43e5dcf
Pre-release v2020.5.0 Pre-release

ML

  • Add CNNClassifier
  • Add BiLSTMClassifier
  • Add attention layers
  • Add Semantic equivalence classifier
  • Add embedding based entity linker

Datasets

  • Add Hoc dataset

Pre-release v2020.4.0

07 Apr 10:21
1ca7f41
Pre-release v2020.4.0 Pre-release

ML

  • Add partial_fit to BertClassifier
  • Add mean_last_four embedding to BertVectorizer
  • Use nlp.pipe for prediction as it's quicker
  • Add generator to transform data on demand for spaCy to reduce memory usage
  • Add multilabel and architecture parameters to SpacyClassifier
  • Modify SpacyClassifier to accept sparse Y for multilabel classification
  • Add pretrain_vectors_path parameter to SpacyClassifier
  • Add speed metric to SpacyClassifier and BertClassifier
  • Fix tests in BertClassifier to check for loss reduction after 5 iterations
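
To illustrate the generator-based approach mentioned in the list above, here is a minimal sketch of transforming data in batches on demand instead of building the whole transformed dataset up front. The names `batched_transform` and `to_lower_tokens` are hypothetical and do not come from WellcomeML; the transformation is a stand-in for a real spaCy pipeline.

```python
from itertools import islice


def to_lower_tokens(text):
    """Stand-in for an expensive transformation (e.g. a spaCy pipeline)."""
    return text.lower().split()


def batched_transform(texts, transform, batch_size=2):
    """Yield lists of transformed items, one batch at a time.

    Only one batch of transformed items is built per iteration step.
    """
    iterator = iter(texts)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield [transform(text) for text in batch]


texts = ["Malaria vaccine trial", "Genome sequencing", "Public health policy"]

# Consume the generator batch by batch; each batch can be discarded
# (or fed to a model) before the next one is produced.
batch_sizes = []
for batch in batched_transform(texts, to_lower_tokens, batch_size=2):
    batch_sizes.append(len(batch))
```

Because each batch is produced only when the loop asks for it, peak memory depends on `batch_size` and not on the size of the full dataset.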