0.16.12

@cragwolfe cragwolfe released this 05 Jan 22:06
· 3 commits to main since this release
1a94d95

Enhancements

  • Prepare auto-partitioning for pluggable partitioners. Move toward a uniform partitioner call signature so a custom or override partitioner can be registered without code changes.
  • Add NDJSON file type support.
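
To illustrate the two enhancements above, here is a minimal sketch of a partitioner registry with a uniform call signature, using an NDJSON partitioner as the registered example. This is not the library's actual API: `register_partitioner`, `auto_partition`, `partition_ndjson`, and the dict-based element shape are all hypothetical names chosen for illustration.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical uniform signature: every partitioner accepts the document text
# plus arbitrary keyword options and returns a list of element dicts.
Partitioner = Callable[..., List[Dict[str, Any]]]

# Hypothetical registry mapping a file-type name to its partitioner.
_REGISTRY: Dict[str, Partitioner] = {}


def register_partitioner(file_type: str, partitioner: Partitioner) -> None:
    """Register (or override) the partitioner used for a file type."""
    _REGISTRY[file_type.lower()] = partitioner


def partition_ndjson(text: str, **kwargs: Any) -> List[Dict[str, Any]]:
    """Parse NDJSON: one JSON value per non-blank line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def auto_partition(text: str, file_type: str, **kwargs: Any) -> List[Dict[str, Any]]:
    """Dispatch to whichever partitioner is registered for the file type."""
    try:
        partitioner = _REGISTRY[file_type.lower()]
    except KeyError:
        raise ValueError(f"no partitioner registered for {file_type!r}") from None
    return partitioner(text, **kwargs)


register_partitioner("ndjson", partition_ndjson)
```

Because every partitioner shares one signature, the dispatcher never needs a per-type branch; swapping in a custom partitioner is a single `register_partitioner` call.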

Features

Fixes

  • Base image has been updated.
  • Upgrade ruff to latest. Previously the ruff version was pinned to <0.5. Remove that pin and fix the handful of lint items that resulted.
  • CSV with asserted XLS content-type is correctly identified as CSV. Resolves a bug where a CSV file with an asserted content-type of application/vnd.ms-excel was incorrectly identified as an XLS file.
  • Improve element-type mapping for Chinese text. Fixes a bug where Chinese text would produce large numbers of false-positive Title elements.
  • Improve element-type mapping for HTML. Fixes a bug where certain non-title elements were classified as Title.
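
The CSV/XLS fix above comes down to not trusting an asserted content-type over the file's actual bytes. The sketch below shows one way such a check could work; it is an assumption for illustration, not the library's detection logic. `resolve_file_type` and its return values are hypothetical. The one fixed fact it relies on is that legacy XLS files are OLE2 compound documents, which begin with the signature `D0 CF 11 E0 A1 B1 1A E1`.

```python
# Signature that opens every OLE2 compound file (the container for legacy .xls).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
XLS_MIME = "application/vnd.ms-excel"


def resolve_file_type(head: bytes, asserted_content_type: str) -> str:
    """Hypothetical helper: reconcile an asserted content-type with the leading bytes.

    `head` is the first chunk of the file. Returns "xls", "csv", or "unknown".
    """
    if asserted_content_type != XLS_MIME:
        return "unknown"  # other content-types are out of scope for this sketch
    if head.startswith(OLE2_MAGIC):
        return "xls"  # the bytes confirm the asserted type
    if b"\x00" not in head:
        # Crude text heuristic (an assumption): no NUL bytes means plain text,
        # which under this content-type is most plausibly a mislabeled CSV.
        return "csv"
    return "unknown"
```

For example, `resolve_file_type(b"a,b,c\n1,2,3\n", "application/vnd.ms-excel")` yields `"csv"` under these assumptions, while the same call on bytes starting with the OLE2 signature yields `"xls"`.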