v0.8.3

@lukehsiao lukehsiao released this 11 Sep 20:27
· 61 commits to master since this release

0.8.3 - 2020-09-11

This is a big release with a lot of changes. These changes are summarized here. Check the Changelog for more details.

Added

Changed

  • @YasushiMiyata: Enable RegexMatchSpan to match words concatenated with a separator via the sep="(separator)" option. (#270) (#492)
  • @HiromuHota: Enabled "Type hints (PEP 484) support for the Sphinx autodoc extension." (#421)
  • @HiromuHota: Switched the Cython wrapper for Mecab from mecab-python3 to fugashi. Since the Japanese tokenizer remains the same, there should be no impact on users. (#384) (#432)
  • @HiromuHota: Log a stack trace on parsing errors for a better debugging experience. (#478) (#479)
  • @HiromuHota: get_cell_ngrams and get_neighbor_cell_ngrams yield nothing when the mention is not tabular. (#471) (#504)
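The idea behind the sep option can be illustrated with a minimal sketch. The helper span_matches below is hypothetical and is not Fonduer's code or API; it assumes that a span's words are joined with the separator and that the regular expression must match the joined string in full.

```python
import re
from typing import List


def span_matches(words: List[str], pattern: str, sep: str = " ") -> bool:
    """Hypothetical stand-in: join the words with `sep`, then match the regex.

    This only illustrates the concept of a configurable separator; the real
    RegexMatchSpan may differ in defaults and matching rules.
    """
    text = sep.join(words)
    return re.fullmatch(pattern, text) is not None


words = ["BC", "546"]

# With a single-space separator the joined text is "BC 546".
with_space = span_matches(words, r"BC 546")

# With an empty separator the joined text is "BC546".
without_sep = span_matches(words, r"BC546", sep="")

# The same pattern fails when the words are joined with a space.
mismatch = span_matches(words, r"BC546")
```

Choosing the separator matters when a tokenizer splits a value, such as a part number, that the pattern expects as one string.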

Deprecated

Fixed

  • @senwu: Fix an issue where pdf_path could not be given without a trailing slash. (#442) (#459)
  • @kaikun213: Fix bug in table range difference calculations. (#420)
  • @HiromuHota: mention_extractor.apply with clear=True now works even if it's not the first run. (#424)
  • @HiromuHota: Fix get_horz_ngrams and get_vert_ngrams so that they work even when the input mention is not tabular. (#425) (#426)
  • @HiromuHota: Fix the order of args to Bbox. (#443) (#444)
  • @HiromuHota: Fix the non-deterministic behavior in VisualLinker. (#412) (#458)
  • @HiromuHota: Fix an issue where the progress bar showed no progress during preprocessing, by executing preprocessing and parsing in parallel. (#439)
  • @HiromuHota: Adapt to mlflow>=1.9.0. (#461) (#463)
  • @HiromuHota: Correct the entity type for NumberMatcher from "NUMBER" to "CARDINAL". (#473) (#477)
  • @HiromuHota: Fix _get_axis_ngrams so that it does not return None when the input is not tabular. (#481)
  • @HiromuHota: Fix Visualizer.display_candidates so that it does not draw rectangles on the wrong pages. (#488)
  • @HiromuHota: Persist doc only when no error happens during parsing. (#489) (#490)
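Several entries above concern helpers that receive a non-tabular mention. As a rough illustration of the "yield nothing instead of returning None" behavior, here is a minimal sketch. The function cell_ngrams_sketch and its argument are hypothetical and do not reproduce Fonduer's implementation.

```python
from typing import Iterator, List, Optional


def cell_ngrams_sketch(cell_words: Optional[List[str]]) -> Iterator[str]:
    """Hypothetical stand-in for a table-aware n-gram helper.

    `cell_words` is None when the mention is not inside a table cell.
    """
    if cell_words is None:
        # Non-tabular input: end the generator without yielding anything,
        # so callers can always iterate over the result safely.
        return
    for word in cell_words:
        yield word.lower()


tabular = list(cell_ngrams_sketch(["Max", "Voltage"]))
non_tabular = list(cell_ngrams_sketch(None))
```

Because the non-tabular case produces an empty iterator, a caller can loop over the result without first checking for None.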