eli64s/splitme-ai

Break down your docs. Build up your knowledge.

A Markdown text splitter for modular docs and maximum flexibility.

What is SplitmeAI?

SplitmeAI is a Python module that addresses the challenges of managing large Markdown files, particularly when creating and maintaining structured static documentation sites with tools such as MkDocs.

Key Features:

  • Section Splitting: Breaks down large Markdown files into smaller, manageable sections based on specified heading levels.
  • Filename Sanitization: Generates clean, unique filenames for each section, ensuring compatibility and readability.
  • Reference Link Management: Extracts and appends reference-style links used within each section.
  • Hierarchy Preservation: Maintains parent heading context within each split file.
  • Thematic Break Handling: Recognizes thematic breaks (---, ***, ___) and uses them for intelligent content segmentation.
  • MkDocs Integration: Automatically generates an mkdocs.yml configuration file based on the split sections.
  • CLI Support: Provides a user-friendly Command-Line Interface for seamless operation.
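
To make the section-splitting and filename-sanitization features concrete, here is a minimal sketch of the general idea. It is not SplitmeAI's actual implementation: the function names `slugify` and `split_markdown`, the `index.md` name for content before the first matching heading, and the `-2`, `-3` suffix scheme for duplicate titles are all assumptions made for illustration.

```python
import re


def slugify(title: str) -> str:
    """Turn a heading title into a lowercase, filesystem-friendly slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "section"


def split_markdown(text: str, marker: str = "##") -> dict[str, str]:
    """Split text at headings of exactly the given level (hypothetical helper).

    Returns a mapping of generated filename -> section content. Anything
    before the first matching heading is stored under "index.md".
    """
    # "##" followed by whitespace matches level-2 headings only; "###" does not match.
    heading = re.compile(rf"^{re.escape(marker)}\s+(.+?)\s*$")
    sections: dict[str, str] = {}
    name, lines = "index.md", []
    in_fence = False

    def flush() -> None:
        # Store the section collected so far, skipping empty ones.
        body = "\n".join(lines).strip()
        if body:
            sections[name] = body + "\n"

    for line in text.splitlines():
        # Ignore heading-like lines inside fenced code blocks.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else heading.match(line)
        if match:
            flush()
            base = slugify(match.group(1))
            name, n = f"{base}.md", 1
            # Append a numeric suffix until the filename is unique.
            while name in sections:
                n += 1
                name = f"{base}-{n}.md"
            lines = [line]
        else:
            lines.append(line)
    flush()
    return sections
```

Deeper headings stay inside their parent section, so splitting on `##` keeps any `###` subsections together with the section that contains them.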

Quick Start

Installation

Install from PyPI using any of the package managers listed below.

 pip

Use pip (recommended for most users):

pip install -U splitme-ai

 pipx

Install in an isolated environment with pipx:

pipx install splitme-ai

 uv

For the fastest installation, use uv:

uv tool install splitme-ai

Usage

Using the CLI

Let's take a look at some examples of how to use the splitme-ai CLI.

Example 1: Split a Markdown file on heading level 2 (default setting):

splitme-ai \
    --split.i examples/data/README-AI.md \
    --split.settings.o examples/output-h2

Example 2: Split on heading level 2 and generate an mkdocs.yml configuration file:

splitme-ai \
    --split.i examples/data/README-AI.md \
    --split.settings.o examples/output-h2 \
    --split.settings.mkdocs

View the output generated for splitting on heading level 2 here.
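
For readers unfamiliar with MkDocs, the sketch below shows roughly what a generated configuration could contain. It is an illustration only, not the file SplitmeAI writes: the helper name `build_mkdocs_config`, the way page titles are derived from filenames, and the example filenames are assumptions. Only the `site_name` and `nav` keys are standard MkDocs settings.

```python
from pathlib import Path


def build_mkdocs_config(site_name: str, pages: list[str]) -> str:
    """Return mkdocs.yml text with one nav entry per split file (hypothetical helper)."""
    out = [f"site_name: {site_name}", "nav:"]
    for page in pages:
        # Derive a readable title from the filename,
        # e.g. "quick-start.md" becomes "Quick Start".
        title = Path(page).stem.replace("-", " ").title()
        out.append(f"  - {title}: {page}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Example filenames are made up for this sketch.
    print(build_mkdocs_config("My Docs", ["index.md", "quick-start.md"]))
```

The sketch writes YAML by hand to stay dependency-free, so titles containing characters such as `:` would need quoting; a YAML library would be the safer choice in real use.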

Example 3: Split on heading level 3:

splitme-ai \
    --split.i examples/data/README-AI.md \
    --split.settings.o examples/output-h3 \
    --split.settings.hl "###"

View the output generated for splitting on heading level 3 here.

Example 4: Split on heading level 4:

splitme-ai \
    --split.i examples/data/README-AI.md \
    --split.settings.o examples/output-h4 \
    --split.settings.hl "####"

View the output generated for splitting on heading level 4 here.

Note

The official documentation site, with extensive examples and usage instructions, is under development. Stay tuned for updates!


Roadmap

  • Enhance CLI usability and user experience.
  • Integrate AI-powered content analysis and segmentation.
  • Add robust chunking and splitting algorithms for LLM applications.
  • Add support for additional static site generators.
  • Add support for additional input and output formats.

Contributing

Contributions are welcome! For bug reports, feature requests, or questions, please open an issue or submit a pull request on GitHub.


License

Copyright © 2024 splitme-ai.
Released under the MIT license.
