Skip to content

Latest commit

 

History

History
80 lines (60 loc) · 2.86 KB

ragx2.md

File metadata and controls

80 lines (60 loc) · 2.86 KB

Kotamon: A Comprehensive Guide to an Innovative RAG Framework

  • Introduction

    • Kotamon: An open-source RAG framework with 11.5K stars on GitHub.
    • Focus on the project's core technologies.
    • Structure overview of the article.
  • PDF Parsing

    • Parsing via PDFThumbnailReader class.
    • Main steps in parsing process:
      • Load PDF file using PyPDF2.
      • Generate page thumbnails with Pillow.
      • Store text and thumbnails in documents list.
  • OCR Integration

    • OCRReader: Matplotlib, AdobeDocumentIntelligence, AdobeReader, UnstructuredReader.
    • Introduction of OCRReader for OCR-based PDF reading.
    • OCRReader uses FullOCR endpoint and unstructured for processing.
  • Document Chunking

    • Default class for chunking: TokenSplitter.
    • Splits document into chunks with parameters.
  • Re-ranking Methods

    • Available Methods:
      • CoherentRanking
      • LLMRanking
      • LLMThematicScoring
      • LLMScoring
  • GraphRAG

    • GraphRAG Indexing Pipeline explained.
    • Process: Standard file indexing and creating graph index.
    • GraphRAG Retrieval Pipeline explained.
    • Builds graph search context, formats context records into documents.
    • Plots the knowledge graph and retrieves documents.
  • Reasoning

    • Simple Reasoning Pipeline: Handles straightforward question-answering tasks.
    • Complex Reasoning Pipeline: Breaks down complex questions into smaller sub-questions.
    • Uses DecomposeQuestionPipeline class.
    • Supports methods such as ReAct and ReWOO.
    • Agent-Based Reasoning: Analysis of ReAct.
  • Query Processing

    • Identified query rewriting during reasoning processes.
    • Custom pipelines can be implemented for query expansion, classification, or refining.
  • Prompt Compression

    • No direct techniques implemented for prompt compression.
    • Potential for custom pipelines to reduce token load during processing.
  • LLMs and Embedding Models

    • Supports multiple platforms for LLMs: OpenAI, Ollama, Claude, and Groq.
    • Embedding models supported: OpenAI, Ollama, FastEmbed, and Cohere.
  • Web Service and Frontend

    • Built using Gradio for frontend UI.
    • Provides a clean interface for document uploads and QA.
  • Storage

    • Supports document stores (Docstores) and vector stores (Vectorstores).
    • Efficient retrieval for RAG systems.
  • Conclusion and Insights

    • Kotamon as a powerful RAG framework.
    • Supports multi-modal document parsing, GraphRAG, and agent-based reasoning.
    • Managing unstructured data, including complex images or high-resolution PDFs, remains challenging.
    • GraphRAG is a game-changer for knowledge-intensive queries by leveraging entities and relationships.
    • Further improvements could focus on more efficient indexing strategies or query refinement to enhance responsiveness.
    • The project's popularity and star count matter less than its technical usefulness.