The All-in-One GraphRAG Framework for enterprise-grade AI with document-centric ingestion, dual Tree+Graph retrieval, and verifiable attribution.
VeritasGraph is an enterprise-grade GraphRAG framework built on the principle "Don't Chunk. Graph." — ingesting whole pages or sections as graph nodes instead of traditional 500-token chunks, preserving document structural integrity. The framework employs a dual Tree + Graph retrieval architecture: PageIndex-style hierarchical TOC navigation runs in parallel with knowledge graph semantic reasoning, supporting cross-section linking and multi-hop reasoning for complex cross-document questions. Every generated claim provides 100% verifiable attribution traceable to exact source document locations, making it suitable for high-compliance domains such as legal, medical, and financial sectors.
## Retrieval & Reasoning
- Tree-based Navigation: PageIndex-style hierarchical TOC navigation with cross-section linking
- Graph-based Semantic Search: Knowledge graph-connected semantic retrieval, not mere vector similarity matching
- Multi-hop Reasoning: Complex reasoning across documents and sections
- Document-Centric Ingestion: Whole pages/sections as nodes, avoiding context loss from chunking
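The dual Tree + Graph idea, hierarchical TOC lookup running alongside multi-hop graph traversal, can be sketched with plain stdlib structures. The toy data, node names, and helper functions below are purely illustrative and are not part of the VeritasGraph API:

```python
from collections import deque

# Toy TOC tree: section -> child sections (tree-based navigation).
toc = {
    "Report": ["Methodology", "Findings"],
    "Methodology": [],
    "Findings": [],
}

# Toy knowledge graph: entity -> related entities (graph-based search).
graph = {
    "DrugA": ["TrialX"],
    "TrialX": ["OutcomeY"],
    "OutcomeY": [],
}

def tree_lookup(root, target):
    """Depth-first path from the TOC root down to a target section."""
    if root == target:
        return [root]
    for child in toc.get(root, []):
        path = tree_lookup(child, target)
        if path:
            return [root] + path
    return None

def graph_hops(start, target):
    """Breadth-first multi-hop path between two entities."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(tree_lookup("Report", "Methodology"))  # ['Report', 'Methodology']
print(graph_hops("DrugA", "OutcomeY"))       # ['DrugA', 'TrialX', 'OutcomeY']
```

Running both lookups in parallel and merging their results is the essence of answering a cross-document, multi-hop question.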
## Ingestion Sources
- PDF: via `pipeline.ingest_pdf()` or the CLI `veritasgraph ingest`
- YouTube: automatic subtitle extraction from a URL
- Web Articles: Direct URL ingestion via CLI
- Plain Text: Standard text ingestion
- Charts/Tables: Vision RAG mode converts to knowledge graph nodes
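A single ingestion entry point serving all of these sources implies some routing by source type. The classifier below is a minimal sketch of that idea, not VeritasGraph's actual internals; the function name and routing rules are assumptions:

```python
from urllib.parse import urlparse

def classify_source(source: str) -> str:
    """Guess which ingestion path a source string should take."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = parsed.netloc.lower()
        if "youtube.com" in host or "youtu.be" in host:
            return "youtube"  # subtitle extraction
        return "web"          # direct article ingestion
    if source.lower().endswith(".pdf"):
        return "pdf"          # whole pages/sections become nodes
    return "text"             # plain text ingestion

print(classify_source("https://youtu.be/abc123"))  # youtube
print(classify_source("report.pdf"))               # pdf
```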
## Verifiability & Visualization
- Verifiable Attribution: Every claim includes a precise attribution path traceable to exact source locations
- Interactive Graph Visualization: PyVis-powered 2D graph browser showing entities, relations, and reasoning paths in real time
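One way to model a verifiable attribution path is as a record that ties each claim to an exact source location. The field names below are a sketch for illustration, not VeritasGraph's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attribution:
    """A generated claim's trace back to its exact source location."""
    claim: str
    document: str
    section: str
    page: int

attr = Attribution(
    claim="Revenue grew 12% year over year.",
    document="annual_report.pdf",
    section="Financial Highlights",
    page=7,
)
print(f"{attr.document} > {attr.section} (p. {attr.page})")
```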
## Deployment Modes

| Mode | Description | Dependencies |
|---|---|---|
| `lite` | Cloud API, zero configuration | OpenAI-compatible API key |
| `local` | Fully offline, Ollama local inference | Ollama (8 GB RAM required) |
| `full` | Production-grade, one-click Docker | Docker + Neo4j + Ollama |
## LLM/Embedding Compatibility
Unified through an OpenAI-compatible API abstraction, supporting mixed configurations (e.g., Groq for LLM + Ollama for Embeddings): OpenAI, Azure OpenAI, Groq, Together AI, OpenRouter, LM Studio, vLLM, Ollama.
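Because every provider is addressed through the same OpenAI-compatible interface, a mixed setup reduces to pointing the LLM and embedding clients at different base URLs. The dictionary below is an illustrative sketch of the Groq + Ollama combination mentioned above; the base URLs are those providers' documented OpenAI-compatible endpoints, while the structure and model names are example assumptions, not VeritasGraph's config format:

```python
# Mixed configuration: Groq serves the LLM, a local Ollama
# instance serves embeddings -- both behind OpenAI-compatible APIs.
config = {
    "llm": {
        "api_base": "https://api.groq.com/openai/v1",
        "model": "llama-3.1-70b-versatile",
    },
    "embedding": {
        "api_base": "http://localhost:11434/v1",
        "model": "nomic-embed-text",
    },
}

print(config["llm"]["api_base"])
```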
## Architecture
- Graph Engine Layer: Based on Microsoft GraphRAG for indexing and querying, Neo4j as persistent graph database
- Retrieval Layer: Tree-based navigation and Graph-based semantic search running in parallel
- Document Processing Layer: Document-centric ingestion with whole pages/sections as single retrievable nodes
- LLM Abstraction Layer: OpenAI-compatible API interface unifying multiple local/cloud LLM providers
- Visualization Layer: PyVis interactive 2D graph browser, Gradio Web UI
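The layers above stack top-down, from the user-facing UI to persistent graph storage. A trivial rendering of that ordering (the list is a reading aid, not code from the project):

```python
# Illustrative layer ordering, top (user-facing) to bottom (storage).
layers = [
    "Visualization (PyVis browser, Gradio Web UI)",
    "LLM Abstraction (OpenAI-compatible API)",
    "Retrieval (Tree navigation + Graph semantic search)",
    "Document Processing (whole pages/sections as nodes)",
    "Graph Engine (Microsoft GraphRAG + Neo4j)",
]
for depth, layer in enumerate(layers):
    print("  " * depth + layer)
```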
## Installation & Quick Start

```bash
pip install veritasgraph
veritasgraph demo --mode=lite
```
Optional dependencies: `veritasgraph[web]` (Gradio UI + visualization), `veritasgraph[graphrag]` (Microsoft GraphRAG integration), `veritasgraph[ingest]` (YouTube & web ingestion), `veritasgraph[all]` (all features).
Docker one-click deployment (full mode):
```bash
cd docker/five-minute-magic-onboarding
docker compose up --build
# Ports: Gradio UI :7860, Neo4j Browser :7474, Ollama API :11434
```
## Python API Example

```python
from veritasgraph import VisionRAGPipeline, VisionRAGConfig

# Basic usage: ingest a PDF, then query it.
pipeline = VisionRAGPipeline()
doc = pipeline.ingest_pdf("document.pdf")
result = pipeline.query("What are the key findings?")
print(result.answer)

# Document-centric mode: inspect and navigate the section tree.
config = VisionRAGConfig(ingest_mode="document-centric")
pipeline = VisionRAGPipeline(config)
doc = pipeline.ingest_pdf("annual_report.pdf")
print(pipeline.get_document_tree())
section = pipeline.navigate_to_section("Methodology")
```
## Key Environment Variables

| Variable | Purpose |
|---|---|
| `GRAPHRAG_API_KEY` | LLM API key |
| `GRAPHRAG_LLM_MODEL` | LLM model name |
| `GRAPHRAG_LLM_API_BASE` | LLM API base URL |
| `GRAPHRAG_EMBEDDING_API_KEY` | Embedding API key |
| `GRAPHRAG_EMBEDDING_MODEL` | Embedding model name |
| `GRAPHRAG_EMBEDDING_API_BASE` | Embedding API base URL |
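Since the embedding variables mirror the LLM ones, a configuration loader can fall back to the LLM key when no embedding-specific key is set. The helper below is a sketch of that pattern; the fallback behavior is an assumption for illustration, not a documented VeritasGraph guarantee:

```python
def resolve_embedding_key(env: dict) -> str:
    """Use the embedding-specific key if set, else fall back to the LLM key."""
    return env.get("GRAPHRAG_EMBEDDING_API_KEY") or env["GRAPHRAG_API_KEY"]

# In practice `env` would be os.environ or values loaded from a .env file.
env = {"GRAPHRAG_API_KEY": "sk-example", "GRAPHRAG_LLM_MODEL": "gpt-4o-mini"}
print(resolve_embedding_key(env))  # sk-example
```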
## Unconfirmed Information
- Exact PyPI version and release date (JS rendering limitation on PyPI page)
- Independent website/docs URL (README mentions "Live documentation" but no URL provided)
- Formal publication of the accompanying paper (PDF in repo, no arXiv or journal link found)
- Deployed HuggingFace Space address
- Whether GPU is mandatory for local mode
- Performance benchmarks for large-scale document sets