A self-evolving virtual disease biologist developed by GENTEL-lab at Shanghai Jiao Tong University, powered by a multi-agent system and the Model Context Protocol (MCP), integrating 600+ bioinformatics tools for automated therapeutic target discovery and molecular mechanism analysis.
Overview#
OriGene is a self-evolving virtual disease biologist developed by GENTEL-lab at Shanghai Jiao Tong University, officially released at the 2025 World Artificial Intelligence Conference (WAIC). The system addresses key challenges in drug discovery: heavy reliance on manual intuition, fragmented data sources, and lengthy analysis cycles.
Core Capabilities#
Intelligent Target Discovery: Automatically integrates 10+ authoritative databases (ChEMBL, PubChem, OpenTargets, NCBI, TCGA, DepMap, etc.) for target screening, ranking, and validation.
Self-Evolving Multi-Agent Architecture: Features self-learning and iterative optimization capabilities, supporting multi-step reasoning for complex biological problems while simulating human biologist research workflows.
Mechanism-Guided Analysis: Performs deep reasoning based on biological mechanism pathways rather than simple keyword matching, generating analysis reports with evidence chains.
Native MCP Protocol Support: Enables standardized invocation of 600+ bioinformatics tools (BLAST, ClustalW, etc.) through the OrigeneMCP server.
System Architecture#
OriGene adopts a master-slave MCP architecture with two components:
- Main Application: Handles user interaction, task planning, agent scheduling, and report generation
- OrigeneMCP Server: Independent microservice encapsulating bioinformatics tools and database access logic
The system supports multiple LLM backends (OpenAI, DeepSeek, CloseAI, etc.), so deployment is model-agnostic.
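To make the split between the main application and the OrigeneMCP server more concrete, here is a minimal sketch of the kind of message an MCP client sends when it invokes a tool. MCP is built on JSON-RPC 2.0 and tools are invoked with the `tools/call` method; everything else here is an assumption for illustration. The tool name `search_targets` and its `disease` argument are hypothetical, and the source does not say how OriGene's main application actually constructs or transports these requests (in practice an MCP client library would normally handle this).

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and arguments, for illustration only.
message = build_tool_call("search_targets", {"disease": "glioblastoma"})
print(message)
```

Because the server only sees requests of this shape, the tool implementations can evolve independently of the planning and reporting logic in the main application.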
Use Cases#
- Target screening and validation in early-stage drug discovery
- Molecular mechanism analysis for complex diseases
- Comprehensive biomedical literature Q&A and knowledge graph construction
- Clinical trial support and target-related trial information queries
Deployment#
Requirements: Docker Engine 20.10+ or Python 3.13+
Quick Start:
git clone https://github.com/GENTEL-lab/OriGene.git
cd OriGene
./setup.sh
Running Modes:
- Interactive: make start
- Quick Research: make quick QUERY="your question"
- Detailed Report: make detailed QUERY="your query"
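For running several questions in a row, the `make quick` and `make detailed` targets above could be wrapped in a small script. The sketch below is not part of OriGene: the helper names `build_make_command` and `run_queries` are hypothetical, and it assumes it is run from the cloned OriGene directory after setup. It defaults to a dry run that only builds the commands.

```python
import subprocess

# Non-interactive make targets listed under Running Modes.
MODES = {"quick", "detailed"}


def build_make_command(mode: str, query: str) -> list[str]:
    """Build the argv for `make <mode> QUERY="<query>"`."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Passing QUERY=... as one argv element avoids shell quoting issues.
    return ["make", mode, f"QUERY={query}"]


def run_queries(mode: str, queries: list[str], dry_run: bool = True) -> list[list[str]]:
    """Build one make command per query and run them unless dry_run is set."""
    commands = [build_make_command(mode, q) for q in queries]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands


# Dry run: print the commands without executing them.
for command in run_queries("quick", ["your question", "another question"]):
    print(command)
```

Passing the arguments as a list sidesteps shell quoting, but make itself still expands `$` inside variable values, so queries containing that character may need escaping.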
TRQA Benchmark#
OriGene includes the TRQA (Therapeutic Research Question Answering) benchmark: 1,921 expert-level questions for evaluating biomedical AI agents, spanning literature selection, database queries, and short-answer formats.
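As a rough illustration of how answers on a question-answering benchmark like this might be scored, the sketch below computes exact-match accuracy after light normalization. The source does not describe TRQA's file format or its official scoring rules, so the record fields `answer` and `prediction` and the toy records are assumptions, not real benchmark data.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(text.lower().split())


def accuracy(records: list[dict]) -> float:
    """Fraction of records whose prediction matches the reference answer."""
    if not records:
        return 0.0
    correct = sum(
        normalize(r["prediction"]) == normalize(r["answer"]) for r in records
    )
    return correct / len(records)


# Toy records with hypothetical fields; not taken from TRQA.
sample = [
    {"answer": "B", "prediction": "b"},         # choice-style, matches
    {"answer": "EGFR", "prediction": " EGFR "},  # short answer, matches
    {"answer": "C", "prediction": "A"},         # choice-style, wrong
    {"answer": "TP53", "prediction": "KRAS"},    # short answer, wrong
]
print(f"accuracy: {accuracy(sample):.2f}")
```

Exact match is a reasonable fit for choice-style questions; free-form short answers would likely need a more forgiving comparison than this.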
Development Team#
GENTEL-lab (Shanghai Jiao Tong University). The codebase and benchmarks are fully open source.