Finalfusion in Python

@sebpuetz sebpuetz released this 05 Jun 07:30

This release marks a major change to finalfusion-python: the entire package has been rewritten in Python and is no longer a wrapper around finalfusion-rust.

The API is now almost on par with finalfusion-rust and in some places even goes beyond it.

  • Vocab, Storage, Metadata and Norms are now accessible as properties on Embeddings
  • Any of the chunks above can be loaded by themselves from a finalfusion file
  • All chunks can be constructed from within Python
    • It's possible to add, remove or change embeddings
  • Storage types integrate directly with numpy arrays
  • Reading from and writing to all common embedding formats (word2vec, GloVe, fastText) are supported
  • The API for vocabularies and subword indexers has been made more ergonomic:
    • vocab words and the word -> index mapping are accessible as properties
    • SubwordVocabs expose the subword indexer through vocab.subword_indexer
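
To make the vocabulary points above concrete, here is a minimal, standard-library-only sketch of what a subword vocabulary with a bucket indexer does conceptually. It is not finalfusion-python's implementation: the class names `ToyBucketIndexer` and `ToySubwordVocab`, the property name `word_index`, the n-gram range, and the FNV-1a hash are all assumptions chosen for illustration. Only the idea of exposing the words, the word -> index mapping and the subword indexer as properties comes from the notes above.

```python
from typing import Dict, List


def ngrams(word: str, min_n: int = 3, max_n: int = 6) -> List[str]:
    """Return the character n-grams of the bracketed word, e.g. "<cat>"."""
    bracketed = f"<{word}>"
    return [
        bracketed[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(bracketed) - n + 1)
    ]


def fnv1a(text: str) -> int:
    """32-bit FNV-1a hash. A stand-in, not the hash finalfusion uses."""
    h = 0x811C9DC5
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


class ToyBucketIndexer:
    """Hypothetical indexer: hashes an n-gram into one of 2**buckets_exp buckets."""

    def __init__(self, buckets_exp: int = 10) -> None:
        self.n_buckets = 2 ** buckets_exp

    def __call__(self, ngram: str) -> int:
        return fnv1a(ngram) % self.n_buckets


class ToySubwordVocab:
    """Hypothetical vocab exposing words, word -> index mapping and indexer."""

    def __init__(self, words: List[str], indexer: ToyBucketIndexer) -> None:
        self._words = list(words)
        self._word_index = {w: i for i, w in enumerate(self._words)}
        self._indexer = indexer

    @property
    def words(self) -> List[str]:
        return self._words

    @property
    def word_index(self) -> Dict[str, int]:
        return self._word_index

    @property
    def subword_indexer(self) -> ToyBucketIndexer:
        return self._indexer

    def subword_indices(self, word: str) -> List[int]:
        # Assumption: subword rows follow the word rows in the embedding
        # matrix, so bucket indices are offset by the number of words.
        offset = len(self._words)
        return [offset + self._indexer(ng) for ng in ngrams(word)]


vocab = ToySubwordVocab(["cat", "dog"], ToyBucketIndexer(buckets_exp=10))
print(vocab.words, vocab.word_index["dog"], len(vocab.subword_indices("cat")))
```

Because the indexer only hashes, it can produce indices for words that were never seen, which is what lets subword embeddings represent out-of-vocabulary words.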

In addition to the overhauled API, finalfusion-python now comes with executables:

  • ffp-convert to convert between embedding formats
  • ffp-similar and ffp-analogy for similarity and analogy queries
  • ffp-bucket-to-explicit to convert from bucket subword to explicit subword embeddings
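
As a rough illustration of the kind of work a format converter does, the sketch below converts between two plain-text layouts: a word2vec-style text file that starts with a `rows dims` header line, and a GloVe-style text file that has no header. It uses only the standard library and is not the implementation of ffp-convert; the function names `read_text_dims` and `write_text` are hypothetical, and binary formats such as the finalfusion or fastText formats are out of scope here.

```python
import io
from typing import Dict, List, TextIO


def read_text_dims(stream: TextIO) -> Dict[str, List[float]]:
    """Read word2vec-style text: a "rows dims" header, then one vector per line."""
    rows, dims = (int(x) for x in stream.readline().split())
    embeddings: Dict[str, List[float]] = {}
    for line in stream:
        parts = line.split()
        if not parts:
            continue  # tolerate blank lines
        word, values = parts[0], [float(v) for v in parts[1:]]
        if len(values) != dims:
            raise ValueError(f"expected {dims} values for {word!r}, got {len(values)}")
        embeddings[word] = values
    if len(embeddings) != rows:
        raise ValueError(f"header promised {rows} rows, found {len(embeddings)}")
    return embeddings


def write_text(embeddings: Dict[str, List[float]], stream: TextIO) -> None:
    """Write GloVe-style text: no header, one "word v1 v2 ..." line per word."""
    for word, values in embeddings.items():
        stream.write(word + " " + " ".join(f"{v:g}" for v in values) + "\n")


# Round-trip a tiny in-memory example instead of touching the file system.
source = io.StringIO("2 3\ncat 0.1 0.2 0.3\ndog 1 2 3\n")
target = io.StringIO()
write_text(read_text_dims(source), target)
print(target.getvalue(), end="")
```

The header check is the main practical difference between the two layouts: with a header, a truncated file can be detected up front, while a headerless file has to be read in full to learn its shape.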

Check out the documentation at https://finalfusion-python.readthedocs.io for more information!