
haiku.rag

Added: Feb 25, 2026
Category: Agent & Tooling
License: Open Source
Tags: Python, Knowledge Base, Multi-Agent System, Model Context Protocol, RAG, AI Agents, Agent Framework, Agent & Tooling, Docs, Tutorials & Resources, Knowledge Management, Retrieval & RAG, Protocol, API & Integration

An opinionated, local-first Agentic RAG framework powered by LanceDB and Pydantic AI. Features hybrid search, multi-agent collaborative research, sandboxed code execution, and parsing of 40+ document formats, with MCP support.

Overview#

haiku.rag is a Retrieval-Augmented Generation (RAG) solution designed to handle private data in a local-first manner. It combines the efficient vector retrieval of LanceDB with the agent orchestration capabilities of Pydantic AI. It is maintained by ggozad, currently at version 0.32.0, and released under the MIT License.

Core Features#

Multi-Modal Agent Support#

  • QA Agent: Precise Q&A with page number and section citations
  • Research Agent: Graph-based multi-step workflow (Plan → Search → Evaluate → Synthesize) using pydantic-graph
  • RLM Agent: Sandboxed Python code execution for cross-document computation and aggregation
  • Conversational RAG: Multi-turn dialogue with conversation memory
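The Research agent's Plan → Search → Evaluate → Synthesize loop can be illustrated with a toy sketch. This is plain Python mimicking only the shape of the workflow; the real implementation uses pydantic-graph and LLM calls, and the `research` function here is hypothetical, not part of the haiku.rag API.

```python
# Toy sketch of a Plan -> Search -> Evaluate -> Synthesize loop.
# Illustrative only: the actual Research agent is a pydantic-graph
# workflow driven by an LLM, not keyword matching.

def research(question: str, corpus: dict[str, str], max_rounds: int = 3) -> str:
    findings: list[str] = []
    queries = [question]                          # Plan: seed with the question
    for _ in range(max_rounds):
        query = queries.pop(0)
        hits = [text for text in corpus.values()  # Search: naive keyword match
                if any(w in text.lower() for w in query.lower().split())]
        findings.extend(hits)
        if findings:                              # Evaluate: stop once we have evidence
            break
        queries.append(question)                  # otherwise re-plan (trivially, here)
    return " ".join(dict.fromkeys(findings))      # Synthesize: dedupe and join

corpus = {"d1": "attention scales quadratically with sequence length",
          "d2": "unrelated note"}
print(research("attention complexity", corpus))
```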

Deep Document Understanding#

  • Powered by Docling engine, supports 40+ formats including PDF, DOCX, PPTX, images
  • Preserves document logical structure (headings, paragraphs, page numbers) with context expansion
  • Visual Grounding: Highlight retrieved chunks on original page images
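The idea behind context expansion can be sketched in a few lines: when a chunk matches, return its neighbors as well, so the answering model sees the surrounding structure. This is purely illustrative; haiku.rag performs expansion internally using Docling's document structure, not with this hypothetical function.

```python
# Toy sketch of "context expansion": return a retrieved chunk together
# with its neighbouring chunks. Illustrative only, not the haiku.rag API.

def expand_context(chunks: list[str], hit_index: int, window: int = 1) -> str:
    lo = max(0, hit_index - window)
    hi = min(len(chunks), hit_index + window + 1)
    return " ".join(chunks[lo:hi])

chunks = ["# Methods", "We use self-attention.", "Complexity is O(n^2).", "# Results"]
print(expand_context(chunks, hit_index=2))
```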

Hybrid Search Technology#

  • Vector search + Full-text search (BM25) + Reciprocal Rank Fusion
  • Reranking support: MxBAI, Cohere, Zero Entropy, vLLM
  • Time Travel: Query database state at specific historical timestamps
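Reciprocal Rank Fusion, the standard way to merge a vector-search ranking with a BM25 ranking, can be sketched minimally as below. The smoothing constant k=60 is the conventional choice from the RRF literature; haiku.rag's actual fusion parameters may differ.

```python
# Minimal Reciprocal Rank Fusion (RRF): each ranking contributes
# 1 / (k + rank) per document, and fused results are sorted by the sum.

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_a", "doc_d"]
print(rrf([vector_hits, bm25_hits]))  # doc_a and doc_b fuse to the top
```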

Deployment & Integration#

Storage Architecture#

  • Embedded LanceDB, no additional server required
  • Cloud storage support: S3, GCS, Azure, LanceDB Cloud
  • File system monitoring with automatic indexing

Interface Options#

  • Complete CLI and Python API
  • MCP Server: Integrates with Claude Desktop and other AI assistants
  • Inspector TUI: Terminal interface for browsing documents, chunks, and search results
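As one way to wire the MCP server into Claude Desktop, an entry along these lines in the standard `mcpServers` config should work. The entry name is arbitrary, and the command and flags are taken from the project's quick start; verify against the project docs for your version.

```json
{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--mcp", "--stdio"]
    }
  }
}
```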

Requirements#

  • Python 3.12+
  • Ollama (default embedding and LLM backend)

Installation#

# Full installation
pip install haiku.rag

# Slim installation
pip install haiku.rag-slim

Quick Start#

# Index documents
haiku-rag add-src paper.pdf

# Hybrid search
haiku-rag search "attention mechanism"

# Q&A with citations
haiku-rag ask "What datasets were used?" --cite

# Research mode
haiku-rag research "What are the limitations?"

# MCP server
haiku-rag serve --mcp --stdio

Python API Example#

import asyncio

from haiku.rag.client import HaikuRAG

async def main() -> None:
    async with HaikuRAG("research.lancedb", create=True) as rag:
        await rag.create_document_from_source("paper.pdf")  # index a document
        results = await rag.search("self-attention")        # hybrid search
        answer, citations = await rag.ask("What is the complexity?")

asyncio.run(main())

Supported Providers#

  • Embeddings: Ollama (default), OpenAI, VoyageAI, LM Studio, vLLM
  • QA/Research: All Pydantic AI supported models

Use Cases#

  • Enterprise internal knowledge base Q&A
  • Academic literature research and analysis
  • Complex data analysis tasks requiring multi-document aggregation
  • Local memory/knowledge retrieval backend for AI assistants (e.g., Claude Desktop)
