0.16.12
Enhancements
Prepare auto-partitioning for pluggable partitioners. Move toward a uniform partitioner call signature so a custom or override partitioner can be registered without code changes.
Add NDJSON file type support.
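To illustrate what a uniform call signature makes possible, here is a minimal sketch of a partitioner registry. It is not the library's implementation: `register_partitioner`, `auto_partition`, `partition_ndjson_text`, and the `(text, **kwargs)` signature are all hypothetical names chosen for this example, and the NDJSON handling simply parses one JSON value per non-blank line.

```python
import json
from typing import Any, Callable

# Hypothetical uniform signature: every partitioner accepts the document text
# plus arbitrary keyword options and returns a list of parsed records.
Partitioner = Callable[..., list]

# Hypothetical registry mapping a file-type name to its partitioner.
_PARTITIONERS: dict[str, Partitioner] = {}


def register_partitioner(file_type: str, partitioner: Partitioner) -> None:
    """Register (or override) the partitioner used for `file_type`."""
    _PARTITIONERS[file_type] = partitioner


def partition_ndjson_text(text: str, **kwargs: Any) -> list:
    """Parse NDJSON: one JSON value per non-blank line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def auto_partition(text: str, file_type: str, **kwargs: Any) -> list:
    """Dispatch to whichever partitioner is registered for `file_type`."""
    try:
        partitioner = _PARTITIONERS[file_type]
    except KeyError:
        raise ValueError(f"no partitioner registered for {file_type!r}") from None
    return partitioner(text, **kwargs)


register_partitioner("ndjson", partition_ndjson_text)
records = auto_partition('{"id": 1}\n\n{"id": 2}\n', "ndjson")
```

Because every partitioner shares one signature, the dispatcher needs no per-type branching, and an override is just a second call to `register_partitioner` with the same key.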
Features
Fixes
Base image has been updated.
Upgrade ruff to latest. Previously the ruff version was pinned to <0.5. Remove that pin and fix the handful of lint items that resulted.
CSV with asserted XLS content-type is correctly identified as CSV. Resolves a bug where a CSV file with an asserted content-type of application/vnd.ms-excel was incorrectly identified as an XLS file.
Improve element-type mapping for Chinese text. Fixes a bug where Chinese text would produce large numbers of false-positive Title elements.
Improve element-type mapping for HTML. Fixes a bug where certain non-title elements were classified as Title.
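The CSV fix listed above concerns files whose asserted content-type is application/vnd.ms-excel but whose bytes are plain CSV. One way to picture that kind of check, as a sketch only and not the library's actual detection logic, is to trust the file's leading bytes over the asserted type. Legacy XLS files are OLE2 compound documents and begin with a fixed 8-byte signature; `detect_spreadsheet_type` and its return values are hypothetical names for this example.

```python
# Signature at the start of every OLE2 compound file (the legacy .xls container).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

# Content types that may be asserted for a CSV upload (assumption for this sketch).
CSV_LIKE_TYPES = ("text/csv", "application/vnd.ms-excel")


def detect_spreadsheet_type(content: bytes, asserted_content_type: str) -> str:
    """Return "xls", "csv", or "unknown", preferring file bytes over the asserted type."""
    if content.startswith(OLE2_MAGIC):
        return "xls"
    if asserted_content_type in CSV_LIKE_TYPES:
        try:
            # Assumption: a CSV-like upload that decodes as UTF-8 text is CSV.
            content.decode("utf-8")
        except UnicodeDecodeError:
            return "unknown"
        return "csv"
    return "unknown"


csv_result = detect_spreadsheet_type(b"a,b\n1,2\n", "application/vnd.ms-excel")
xls_result = detect_spreadsheet_type(OLE2_MAGIC + b"\x00" * 8, "application/vnd.ms-excel")
```

A real detector would also need to handle other text encodings and delimiters; the point here is only that the asserted content-type is treated as a hint rather than as the answer.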