Add serialization support to FstAddOn #162

Open
wants to merge 11 commits into
base: main

ebraraktas

Description

This PR adds support for serializing an FstAddOn to binary data and deserializing it back. This makes it possible to load a lookahead FST (a special MatcherFst) from binary data.

Changes

  • Added a SerializeBinary requirement to the SerializableFst trait. This makes it possible to read an FST from, and write it to, a binary data slice instead of a path.
  • Implemented SerializeBinary for IntInterval, VectorIntervalStore, IntervalSet and LabelReachableData. These implementations are ported from the corresponding OpenFst implementations so that they are compatible with files created by OpenFst. Label is assumed to be 32-bit, because OpenFst uses int for Label.
  • Added an Fst<W> requirement for the fst field of FstAddOn. I think it is better to state this requirement in the definition of the struct.
  • Implemented SerializeBinary for FstAddOn with AddOnPair ((Option<Arc<AO1>>, Option<Arc<AO2>>)). For now, this seems to be the only variant of FstAddOn used in the project. A more generic implementation may be added later.
    • To implement this, an fst_type field is added to FstAddOn; OpenFst has a type name field in AddOnImpl as well. I have given the names {i,o}label_lookahead to the FstAddOn variables defined in matcher_fst.rs. The names are taken from OpenFst, and they seem compatible with the binary output of OpenFst.
  • Finally, added a new constructor to MatcherFst that allows creating it from an already computed (or read) FstAddOn.
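
To illustrate the slice-based approach for a small type such as IntInterval, here is a minimal sketch. It is not the code in this PR: the `BinaryCodec` trait, the `Interval` struct and the `read_i32` helper are hypothetical stand-ins, and the layout (two little-endian 32-bit integers, begin then end) is an assumption based on Label being treated as a 32-bit int.

```rust
use std::convert::TryInto;

/// Hypothetical stand-in for a binary (de)serialization trait.
/// The names and signatures are illustrative only.
trait BinaryCodec: Sized {
    /// Parses a value from the front of `data` and returns the unread remainder.
    fn parse(data: &[u8]) -> Option<(&[u8], Self)>;
    /// Appends the binary form of `self` to `out`.
    fn write(&self, out: &mut Vec<u8>);
}

/// Reads one little-endian i32 from the front of `data`.
fn read_i32(data: &[u8]) -> Option<(&[u8], i32)> {
    if data.len() < 4 {
        return None;
    }
    let (head, rest) = data.split_at(4);
    let bytes: [u8; 4] = head.try_into().ok()?;
    Some((rest, i32::from_le_bytes(bytes)))
}

/// Illustrative interval over 32-bit labels (assumed layout: begin, then end).
#[derive(Debug, PartialEq, Clone, Copy)]
struct Interval {
    begin: i32,
    end: i32,
}

impl BinaryCodec for Interval {
    fn parse(data: &[u8]) -> Option<(&[u8], Self)> {
        let (rest, begin) = read_i32(data)?;
        let (rest, end) = read_i32(rest)?;
        Some((rest, Interval { begin, end }))
    }

    fn write(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.begin.to_le_bytes());
        out.extend_from_slice(&self.end.to_le_bytes());
    }
}
```

Returning the unread remainder from `parse` lets larger structures (an interval store, then an add-on pair) be parsed by chaining the parsers of their fields over the same slice.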

Status

  • These changes do not break any API currently available.
  • The project's existing tests are not affected by these changes, and they pass as expected.
  • I have tested this with an olabel_lookahead FST file created with OpenFst. I have not added a proper test yet, but I think one may be necessary.

The `SerializableFst` trait already requires its implementing type to be convertible to binary data. Therefore, adding a `SerializeBinary` requirement to the trait makes it possible to provide default implementations of `SerializableFst::read` and `SerializableFst::write`. In addition, having an API to convert an FST to and from binary data may be useful for other serialization implementations.
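
The idea of deriving path-based I/O from a binary-data supertrait can be sketched as follows. This is a simplified illustration, not the actual rustfst API: `BinaryData`, `FileBacked` and `ToyFst` are hypothetical names, and the real trait signatures differ.

```rust
use std::convert::TryInto;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a binary (de)serialization trait.
trait BinaryData: Sized {
    fn from_bytes(data: &[u8]) -> io::Result<Self>;
    fn to_bytes(&self) -> Vec<u8>;
}

/// Hypothetical stand-in for a path-based trait. Because `BinaryData`
/// is a supertrait, `read` and `write` can have default implementations.
trait FileBacked: BinaryData {
    fn read<P: AsRef<Path>>(path: P) -> io::Result<Self> {
        Self::from_bytes(&fs::read(path)?)
    }

    fn write<P: AsRef<Path>>(&self, path: P) -> io::Result<()> {
        fs::write(path, self.to_bytes())
    }
}

/// Toy type standing in for an FST: it stores a single state count.
#[derive(Debug, PartialEq)]
struct ToyFst {
    num_states: u32,
}

impl BinaryData for ToyFst {
    fn from_bytes(data: &[u8]) -> io::Result<Self> {
        // Fails unless the slice is exactly 4 bytes long.
        let bytes: [u8; 4] = data.try_into().map_err(|_| {
            io::Error::new(io::ErrorKind::InvalidData, "expected exactly 4 bytes")
        })?;
        Ok(ToyFst { num_states: u32::from_le_bytes(bytes) })
    }

    fn to_bytes(&self) -> Vec<u8> {
        self.num_states.to_le_bytes().to_vec()
    }
}

// No method bodies are needed: the defaults cover file I/O.
impl FileBacked for ToyFst {}
```

With this shape, a type only has to implement the binary conversion once, and callers that already hold the bytes in memory (for example, embedded in another container format) can skip the filesystem entirely.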