An enterprise-grade, API-first LLM workspace for unstructured documents, featuring data extraction, redaction, rights management, prompt playground, and more! Self-hosted, AI-powered, and built for teams who need to own their data.
One-Minute Overview
OpenContracts is an open-source document intelligence platform for users who need control over data security and team collaboration. It combines document management, AI analysis, and collaboration features: it supports structured extraction and annotation of PDFs and text files, and lets users query documents through conversational AI assistants.
Core Value: Manages the full document lifecycle, from upload and analysis to data extraction, inside a self-hosted environment you control.
Quick Start
Installation Difficulty: Medium - Requires Docker and basic database knowledge, though the project provides comprehensive deployment guides
# Development quick start
git clone https://github.com/JSv4/OpenContracts.git
cd OpenContracts
docker compose -f local.yml up
Is this suitable for me?
- ✅ Legal Document Review: Efficiently annotate and extract key clauses from contracts
- ✅ Batch Document Analysis: Supports processing and analysis of hundreds of documents
- ✅ Team Collaboration Review: Provides discussion threads, mention functionality, and version control
- ❌ Simple Text Processing: May be overly complex for basic document processing without AI enhancements
- ❌ Mobile Requirements: Primarily designed for desktop browsers, with limited mobile support
Core Capabilities
1. Intelligent Document Processing - From Unstructured to Structured
- Automatically extracts structure and content from PDFs and text files using ML-based parsers for high accuracy
- Generates vector embeddings supporting semantic search to quickly locate relevant content
Real Value: Transforms messy documents into searchable, analyzable structured data, significantly improving document processing efficiency
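To make the semantic search idea concrete, here is a minimal sketch of ranking passages by cosine similarity between embedding vectors. It is an illustration only, not OpenContracts code: the function names, the three-dimensional vectors, and the passage labels are all made up, and in a real deployment the embeddings would come from a model and the similarity query would typically run in the vector store.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passages, top_k=2):
    """Return the top_k (score, text) pairs, best match first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical values chosen for illustration).
passages = [
    ("Termination clause", [0.9, 0.1, 0.0]),
    ("Payment terms", [0.1, 0.9, 0.1]),
    ("Governing law", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]  # stands in for an embedded user question

results = rank_passages(query, passages)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The passage whose vector points in nearly the same direction as the query ranks first, which is why semantically related text can be found even when it shares no keywords with the question.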
2. Advanced Annotation & Analysis - Building Knowledge Systems
- Supports multi-page annotation with custom label schemas and relationship mapping between documents
- Provides structured data extraction interface for reviewing and validating extracted results
Real Value: Converts personal and team document knowledge into reusable analytical assets supporting complex business requirements
3. AI Document Assistant - Intelligent Interaction & Analysis
- AI agents built on PydanticAI for real-time conversations with documents
- Supports search, document loading, and annotation queries with real-time streaming responses
Real Value: No need to manually search through large volumes of documents; the AI assistant can quickly answer questions, summarize content, and improve work efficiency
4. Team Collaboration Platform - Collective Intelligence Hub
- Supports threaded discussions at global, corpus, and document levels
- Includes mention functionality, voting systems, and reputation tracking for high-quality collaboration
Real Value: Breaks down information silos in document review, allowing team members to discuss specific content and reach consensus
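One way to picture scoped threads with mentions and votes is the small data model below. This is a hypothetical sketch, not the project's actual schema: the class names, the `@name` mention syntax, and the idea of summing votes per author as a stand-in for reputation are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Scope(Enum):
    """The three discussion levels named above."""
    GLOBAL = "global"
    CORPUS = "corpus"
    DOCUMENT = "document"


# Assumed mention syntax: "@" followed by letters, digits, or underscores.
MENTION_RE = re.compile(r"@([A-Za-z0-9_]+)")


@dataclass
class Message:
    author: str
    body: str
    votes: int = 0
    replies: List["Message"] = field(default_factory=list)

    def mentions(self) -> List[str]:
        """Usernames mentioned in this message body."""
        return MENTION_RE.findall(self.body)


@dataclass
class Thread:
    scope: Scope
    target_id: Optional[str]  # corpus or document id; None for global threads
    root: Message


def tally_votes(message: Message, totals: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Sum votes per author across a message and all nested replies."""
    totals = {} if totals is None else totals
    totals[message.author] = totals.get(message.author, 0) + message.votes
    for reply in message.replies:
        tally_votes(reply, totals)
    return totals


root = Message("alice", "@bob can you check clause 4?", votes=2)
root.replies.append(Message("bob", "Looks fine, cc @carol_l", votes=3))
root.replies.append(Message("alice", "Thanks", votes=1))
thread = Thread(Scope.DOCUMENT, "doc-42", root)

print(thread.root.mentions())
print(tally_votes(thread.root))
```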
5. Data Extraction & Export - Turning Information into Data
- Define extraction schemas with multiple question types
- Run extractions across document collections, validate results, and export in structured formats
Real Value: Transforms unstructured document content into structured data usable for analysis, reporting, and integration
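The define, run, and export flow can be sketched in a few lines. Everything here is hypothetical: the `Question` class, the `run_extraction` and `to_csv` helpers, and the sample documents are illustrative names, and `stub_answer` is a keyword-matching stand-in for whatever LLM call a real extraction would make.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Question:
    """One column of an extraction schema (hypothetical shape)."""
    name: str
    prompt: str
    output_type: type  # e.g. str, int, bool


def coerce(raw, output_type):
    """Convert a raw text answer into the type the schema asks for."""
    if output_type is bool:
        return raw.strip().lower() in {"yes", "true", "1"}
    return output_type(raw)


def run_extraction(schema, documents, answer_fn):
    """Ask every question of every document; return one row per document."""
    rows = []
    for doc_name, text in documents.items():
        row = {"document": doc_name}
        for question in schema:
            row[question.name] = coerce(answer_fn(text, question.prompt), question.output_type)
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize extraction rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def stub_answer(text, prompt):
    """Placeholder for a model call: answers by simple keyword matching."""
    if "term" in prompt.lower():
        return "12" if "12 months" in text else "0"
    return "yes" if "auto-renew" in text.lower() else "no"


schema = [
    Question("term_months", "What is the term in months?", int),
    Question("auto_renews", "Does the contract auto-renew?", bool),
]
documents = {
    "msa.txt": "Term: 12 months. This agreement will auto-renew.",
    "nda.txt": "No fixed term.",
}

rows = run_extraction(schema, documents, stub_answer)
csv_text = to_csv(rows)
print(csv_text)
```

Typing each question up front lets answers be coerced and checked before export, so a reviewer validates typed values rather than free text.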
Tech Stack & Integration
Development Languages: Python (backend), JavaScript/TypeScript (frontend)
Key Dependencies:
- Django (backend framework)
- React (frontend framework)
- PydanticAI (LLM integration)
- pgvector (vector storage)
- Docling/NLM-Ingest (document parsing)
Integration Method: API-first architecture supporting integration via API or SDK into existing workflows
Maintenance Status
- Development Activity: Actively developed, recently released v3.0.0.b3 with new collaboration features
- Recent Updates: Ongoing work includes new features and bug fixes
- Community Response: An active contributor community develops the project under its open-source license
Commercial & Licensing
License: AGPL-3.0
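- ✅ Commercial Use: Allowed, provided you comply with the AGPL terms
- ✅ Modification: You may modify and distribute the code, with distributed versions remaining under AGPL-3.0
- ⚠️ Restrictions: If you run a modified version as a network service, you must make its source code available to the users of that service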
Documentation & Learning Resources
- Documentation Quality: Comprehensive, including architecture guides, deployment guides, API docs, and tutorials
- Official Documentation: https://jsv4.github.io/OpenContracts/
- Example Code: Includes multiple usage examples and custom component development guides