DISCOVER THE FUTURE OF AI AGENTSarrow_forward

OpenRag

calendar_todayAdded Apr 25, 2026
categoryAgent & Tooling
codeOpen Source
PythonDockerKnowledge BaseFastAPIMultimodalRAGAgent & ToolingDocs, Tutorials & ResourcesKnowledge Management, Retrieval & RAGProtocol, API & IntegrationSecurity & Privacy

A lightweight, modular, sovereign-by-design RAG framework supporting multimodal document ingestion, hybrid retrieval with reranking, and an OpenAI API-compatible interface.

OpenRag is a full-stack RAG framework developed by French open-source company Linagora, built with a "sovereign-by-design" philosophy free from vendor lock-in. It covers the complete pipeline from document ingestion to QA generation: supporting multimodal file parsing (txt/md/pdf/docx/pptx/audio/images) with unified Markdown conversion; format-aware chunking with contextual summary enrichment; vectorization via jina-embeddings-v3 into Milvus; hybrid semantic+BM25 retrieval with Query Reformulation, HyDE, and GTE/Jina v2 multilingual reranking. The LLM layer uses an OpenAI API-compatible design, connecting to Mistral/GPT-4/Claude or local vLLM models, with independent VLM configuration for image understanding. It exposes a FastAPI service layer with an OpenAI-compatible Chat API for seamless integration with OpenWebUI, LangChain, and N8N. Built-in Indexer document management UI and Chainlit chat UI both support i18n. Features include multi-tenant partition isolation, Token/OIDC authentication, Ray distributed parallelism, Kubernetes Helm Chart deployment, and an automatic evaluation pipeline using UMAP+HDBScan. Licensed under AGPL-3.0, primarily Python (94.2%), current version v1.1.9.

Core Capabilities#

  • Multimodal Document Ingestion: Text (txt/md), documents (pdf/docx/doc/pptx with MarkerLoader default supporting OCR and complex layouts, optional Docling), audio (wav/mp3/mp4 etc. with auto-transcription), images (png/jpeg/jpg/svg with VLM-generated description text replacing originals), all formats unified to Markdown.
  • Hybrid Retrieval & Reranking: Semantic vector search + BM25 keyword search, Query Reformulation and HyDE query augmentation, optional multilingual reranking via Infinity Inference Server using GTE or Jina v2.
  • LLM-Agnostic Design: Supports any OpenAI API-compatible LLM and local vLLM-deployed models, independent VLM configuration for image understanding.
  • OpenAI API-Compatible Interface: Provides OpenAI-format compatible Chat API for integration with OpenWebUI, LangChain, N8N and other frontend/workflow tools.
  • Multi-Tenant Partitions: Documents organized by Partition, supporting isolation of different user/team document collections.
  • Automatic Evaluation Pipeline: Built-in UMAP + HDBScan clustering to generate synthetic QA datasets from indexed documents, local LLM scoring of query-chunk pairs to quantify retrieval quality.
  • Web UI: Native Indexer document management interface and Chainlit chat interface, both with i18n support.
  • Authentication: Token mode (default Bearer Token) and OIDC mode (supporting Keycloak, LemonLDAP::NG and other IdPs).
  • Distributed Scaling: Ray-based parallelization of chunking, embedding, and ingestion tasks across multi-node multi-GPU setups; Kubernetes Helm Chart (charts/openrag-stack) and Ansible playbooks for production deployment.

Deployment#

  • Quick Start: Docker Compose with GPU (NVIDIA Container Toolkit) and CPU profiles.
  • Production: Kubernetes Helm Chart or Ansible automation.
  • Prerequisites: Python 3.12+, Docker and Docker Compose.

Key Configuration#

SettingDescriptionDefault
BASE_URL / API_KEY / MODELLLM endpoint, key, model nameManual
VLM_BASE_URL / VLM_MODELVision language modelSame as LLM
EMBEDDER_MODEL_NAMEEmbedding modeljinaai/jina-embeddings-v3
RETRIEVER_TOP_KRetrieval top-K documents20
RERANKER_ENABLEDEnable rerankingtrue
RERANKER_MODELReranking modelAlibaba-NLP/gte-multilingual-reranker-base
AUTH_MODEAuthentication modetoken
WEBSEARCH_API_TOKENWeb search API (optional)Silent disable if unset

Related Projects

View All arrow_forward

STAY UPDATED

Get the latest AI tools and trends delivered straight to your inbox. No spam, just intelligence.

rocket_launch