A Markdown text splitter for modular docs and maximum flexibility.
SplitmeAI is a Python module that addresses the challenges of managing large Markdown files, particularly when creating and maintaining structured static documentation sites with tools such as MkDocs.
Key Features:
- Section Splitting: Breaks down large Markdown files into smaller, manageable sections based on specified heading levels.
- Filename Sanitization: Generates clean, unique filenames for each section, ensuring compatibility and readability.
- Reference Link Management: Extracts and appends reference-style links used within each section.
- Hierarchy Preservation: Maintains parent heading context within each split file.
- Thematic Break Handling: Recognizes thematic breaks (---, ***, ___) and uses them for intelligent content segmentation.
- MkDocs Integration: Automatically generates an mkdocs.yml configuration file based on the split sections.
- CLI Support: Provides a user-friendly command-line interface for seamless operation.
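To make the section splitting and filename sanitization features more concrete, here is a minimal sketch of the general idea. It is not SplitmeAI's actual implementation: the function names `slugify` and `split_on_heading` are hypothetical, and the sketch assumes ATX-style headings, ignores headings inside fenced code blocks, and drops any content that appears before the first matching heading.

```python
import re


def slugify(title: str) -> str:
    """Turn a heading title into a lowercase, filesystem-friendly slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "section"  # fall back when the title has no usable characters


def split_on_heading(text: str, marker: str = "##") -> dict[str, str]:
    """Split Markdown text into {filename: content} at one heading level.

    Hypothetical helper for illustration only; not part of splitme-ai.
    """
    # Match exactly this heading level: "##" followed by whitespace, so "###" is skipped.
    heading = re.compile(rf"^{re.escape(marker)}[ \t]+(.+?)[ \t]*$")
    sections: dict[str, list[str]] = {}
    current = None
    in_fence = False

    for line in text.splitlines():
        # Track fenced code blocks so "## ..." inside code is not treated as a heading.
        if line.lstrip().startswith(("```", "~~~")):
            in_fence = not in_fence
        match = None if in_fence else heading.match(line)
        if match:
            base = slugify(match.group(1))
            name, n = f"{base}.md", 2
            while name in sections:  # keep filenames unique
                name, n = f"{base}-{n}.md", n + 1
            sections[name] = []
            current = sections[name]
        if current is not None:  # lines before the first heading are dropped
            current.append(line)

    return {name: "\n".join(body).strip() + "\n" for name, body in sections.items()}
```

Calling `split_on_heading(markdown_text, "###")` would split one level deeper. Deeper headings stay inside their parent section, and repeated titles get a numeric suffix so filenames do not collide.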
Install from PyPI using any of the package managers listed below.
Use pip (recommended for most users):
pip install -U splitme-ai
Install in an isolated environment with pipx:
pipx install splitme-ai
For the fastest installation, use uv:
uv tool install splitme-ai
Let's take a look at some examples of how to use the splitme-ai CLI.
Example 1: Split a Markdown file on heading level 2 (default setting):
splitme-ai \
--split.i examples/data/README-AI.md \
--split.settings.o examples/output-h2
Example 2: Split on heading level 2 and generate an mkdocs.yml configuration file:
splitme-ai \
--split.i examples/data/README-AI.md \
--split.settings.o examples/output-h2 \
--split.settings.mkdocs
View the output generated for splitting on heading level 2 here.
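For readers unfamiliar with MkDocs, the sketch below shows roughly what generating a configuration from a list of split files involves. It is an illustration under stated assumptions, not the file SplitmeAI actually writes: `build_mkdocs_config` is a hypothetical helper, page titles are derived naively from filenames, and only the standard MkDocs keys `site_name` and `nav` are emitted.

```python
import json
from pathlib import Path


def build_mkdocs_config(site_name: str, pages: list[str]) -> str:
    """Render a minimal mkdocs.yml body with one nav entry per page.

    Hypothetical helper for illustration only; not part of splitme-ai.
    """
    # json.dumps yields double-quoted strings, which are also valid YAML scalars.
    lines = [f"site_name: {json.dumps(site_name)}", "nav:"]
    for page in pages:
        # Derive a readable title from the filename, e.g. "usage-examples.md" -> "Usage Examples".
        title = Path(page).stem.replace("-", " ").title()
        lines.append(f"  - {json.dumps(title)}: {json.dumps(page)}")
    return "\n".join(lines) + "\n"
```

The returned string could be written to `mkdocs.yml` next to a `docs/` directory containing the split files. The real generated configuration may include additional settings.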
Example 3: Split on heading level 3:
splitme-ai \
--split.i examples/data/README-AI.md \
--split.settings.o examples/output-h3 \
--split.settings.hl "###"
View the output generated for splitting on heading level 3 here.
Example 4: Split on heading level 4:
splitme-ai \
--split.i examples/data/README-AI.md \
--split.settings.o examples/output-h4 \
--split.settings.hl "####"
View the output generated for splitting on heading level 4 here.
Note: The official documentation site, with extensive examples and usage instructions, is under development. Stay tuned for updates!
- Enhance CLI usability and user experience.
- Integrate AI-powered content analysis and segmentation.
- Add robust chunking and splitting algorithms for LLM applications.
- Add support for additional static site generators.
- Add support for additional input and output formats.
Contributions are welcome! For bug reports, feature requests, or questions, please open an issue or submit a pull request on GitHub.
Copyright © 2024 splitme-ai.
Released under the MIT license.