An open-source AI research assistant that connects any LLM to your internal knowledge sources and enables real-time team collaboration through chat, serving as an alternative to NotebookLM, Perplexity, and Glean.
One-Minute Overview#
SurfSense is an open-source AI research assistant that connects any large language model to your internal knowledge sources and enables real-time team collaboration through chat. Positioned as an alternative to NotebookLM, Perplexity, and Glean, it supports uploads in 50+ file formats, knowledge base search, and citation-based answers, and it offers both local LLM support and self-hosting.
Core Value: Connect any LLM to your internal knowledge sources for real-time team collaboration with citation-based answers.
Quick Start#
Installation Difficulty: Medium. Deployment options range from a one-click Docker quick start to a full Docker Compose production setup.
# Quick Docker deployment
docker run -d -p 3000:3000 -p 8000:8000 \
-v surfsense-data:/data \
--name surfsense \
--restart unless-stopped \
ghcr.io/modsetter/surfsense:latest
Is this suitable for me?
- ✅ Knowledge Management Teams: Need to bring internal documents, Slack messages, and Jira tickets together into a searchable knowledge base
- ✅ Research Analysts: Require fast search and query capabilities across extensive document collections
- ❌ Simple Personal Use: If you only need basic document summarization, this may be overly complex
Core Capabilities#
1. Multi-format File Upload - 50+ File Formats#
Accepts documents, images, videos, and other formats and saves their content to your personal knowledge base. Real Value: Upload any type of work material without format conversion to build a comprehensive knowledge base.
2. Knowledge Base Search - Hybrid Search Technology#
Uses semantic + full-text hybrid search with date filtering and connector-specific queries. Real Value: Quickly find information regardless of content format or structure
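One common way to merge a semantic ranking with a full-text ranking is reciprocal rank fusion (RRF). The sketch below illustrates that general technique only; the text above does not state which fusion method SurfSense uses, and the function name, the constant `k=60`, and the document IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked highly by both searches rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two search back ends.
semantic_hits = ["doc_b", "doc_a", "doc_c"]  # vector-similarity order
fulltext_hits = ["doc_a", "doc_d", "doc_b"]  # keyword-match order

fused = reciprocal_rank_fusion([semantic_hits, fulltext_hits])
print(fused)
```

Documents that appear in both lists (`doc_a`, `doc_b`) outrank those found by only one search, which is the behavior a hybrid search aims for.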
3. Cited Answers - Perplexity-style Responses#
Get natural language answers with source citations for verifiability. Real Value: Enhance answer credibility with traceable information sources
4. Team Collaboration with RBAC - Role-Based Access Control#
Implements role-based access control for search spaces with customizable team roles. Real Value: Securely share knowledge bases while maintaining granular document access controls
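To make the idea of role-based access on a search space concrete, here is a minimal sketch. It is not SurfSense's actual data model: the `Permission`, `Role`, and `SearchSpace` names, the permission set, and the example users are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Permission(Enum):
    READ = "read"
    WRITE = "write"
    MANAGE_MEMBERS = "manage_members"


@dataclass(frozen=True)
class Role:
    """A named, customizable bundle of permissions."""
    name: str
    permissions: frozenset


@dataclass
class SearchSpace:
    """A shared knowledge base whose members each hold one role."""
    name: str
    members: dict = field(default_factory=dict)  # user ID -> Role

    def can(self, user_id, permission):
        # Users who are not members have no access at all.
        role = self.members.get(user_id)
        return role is not None and permission in role.permissions


# Hypothetical roles and team members.
VIEWER = Role("viewer", frozenset({Permission.READ}))
EDITOR = Role("editor", frozenset({Permission.READ, Permission.WRITE}))
space = SearchSpace("research", {"alice": EDITOR, "bob": VIEWER})

print(space.can("alice", Permission.WRITE))
print(space.can("bob", Permission.WRITE))
```

Keeping permissions on the role rather than on the user means a team can define a new role once and assign it to many members.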
5. Podcast Generation - Fast Audio Content Creation#
Generate engaging audio podcasts from chat conversations or knowledge base content. Real Value: Transform research content into shareable audio format
Tech Stack & Integration#
- Development Languages: Python (backend), TypeScript (frontend)
- Main Dependencies: FastAPI, PostgreSQL (with pgvector), Next.js, React, LangChain, Deep Agents
- Integration Method: Browser extension, API integration, supports processing of 50+ file formats
Ecosystem & Extensions#
- Plugins/Extensions: Browser extension for saving web pages, including those behind authentication
- Integration Capabilities: Connects to multiple external sources including Google Drive, Slack, Microsoft Teams, Jira, Confluence, Notion, Gmail, YouTube, GitHub, and more
Maintenance Status#
- Development Activity: Actively developed, though not yet production-ready, with ongoing iterations
- Recent Updates: Frequent releases with new features and improvements
- Community Response: Active Discord community for support and contribution
Commercial & Licensing#
License Type: Not explicitly specified
- ✅ Commercial Use: Typically allowed for open-source projects, subject to the actual license terms
- ✅ Modification: Typically allowed for open-source projects, subject to the actual license terms
- ⚠️ Restrictions: Specific license terms to be confirmed before adoption
Documentation & Learning Resources#
- Documentation Quality: Moderate, with installation guides and basic instructions
- Official Documentation: Detailed API documentation is served by the running backend at http://localhost:8000/docs, so it is reachable only after deployment
- Sample Code: Docker deployment examples and custom configuration instructions provided
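Once the backend is running, the API surface can also be inspected from a script. The sketch below assumes the backend keeps FastAPI's default schema path, `/openapi.json`, alongside the `/docs` page mentioned above; that path and the helper names are assumptions, not something the project documentation here confirms.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def fetch_openapi_schema(base_url="http://localhost:8000"):
    """Download the OpenAPI schema (assumes FastAPI's default /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_endpoints(schema):
    """Return 'METHOD /path' strings for every operation in the schema."""
    endpoints = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Skip non-operation keys such as "parameters" or "summary".
            if method in HTTP_METHODS:
                endpoints.append(f"{method.upper()} {path}")
    return endpoints


def main():
    # Requires the backend container to be up and listening on port 8000.
    for line in list_endpoints(fetch_openapi_schema()):
        print(line)
```

Call `main()` after the container from the Quick Start section has started; if the request fails, the schema path may have been customized.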