A function-level code dependency graph CLI for AI agents, supporting 34 languages, MCP server, hybrid semantic search, CI quality gates, and architecture boundary governance — fully local and zero-config.
Codegraph is a code intelligence command-line tool designed for AI agents and developers. Its core capability is building function-level dependency graphs for codebases and providing rich structural querying and analysis based on those graphs.
## Core Capabilities
- Function-level dependency graph: Parses entire codebases to build call relationship graphs for functions/classes/methods, stored in a local SQLite database (`.codegraph/graph.db`)
- Broad language support: Covers 34 programming languages including JS/TS, Python, Go, Rust, Java, C#, C/C++, Kotlin, Swift, and more
- MCP server: Provides 30+ tool interfaces for AI agents to directly query the graph for context, replacing inefficient grep/find/cat call chains, with multi-repository mode support
- Hybrid semantic search: Fuses BM25 keyword retrieval with embedding-based semantic retrieval via RRF, with embedding models running fully locally (default nomic-embed-text-v1.5)
- Git diff impact analysis: Calculates function-level blast radius for staged/unstaged changes, combined with co-change analysis to discover historical coupling
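The fusion step behind hybrid search can be sketched with a generic Reciprocal Rank Fusion (RRF) implementation. This is a minimal illustration of the technique, not Codegraph's actual code; the symbol names in the ranked lists and the conventional constant `k = 60` are assumptions:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# hypothetical result lists: BM25 keyword hits vs. embedding neighbors
bm25 = ["parseFile", "resolveImport", "buildGraph"]
semantic = ["buildGraph", "parseFile", "walkAst"]
fused = rrf_fuse([bm25, semantic])  # "parseFile" ranks first: high in both lists
```

Because RRF only consumes ranks, the BM25 scores and embedding cosine similarities never need to be calibrated against each other, which is the main reason the technique is popular for hybrid retrieval.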
## Analysis & Metrics
- Complexity metrics: Cognitive complexity, cyclomatic complexity, nesting depth, Halstead metrics, maintainability index
- Data flow & CFG analysis: Intra-procedural parameter tracking, return value consumer identification, mutation detection; CFG output in text/DOT/Mermaid formats
- Node role classification: Automatically labels symbols as entry/core/utility/adapter/dead/leaf
- Community detection: Leiden clustering to discover natural module boundaries and architectural drift
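To make the metrics concrete, here is a minimal McCabe-style cyclomatic complexity counter over Python's own `ast` module. It is an approximation for illustration only; Codegraph computes its metrics from tree-sitter parse trees across all supported languages, not like this:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler)):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # each extra operand of an and/or chain adds one branch
            complexity += len(node.values) - 1
    return complexity

src = """
def clamp(x, lo, hi):
    if x < lo or x > hi:
        return lo if x < lo else hi
    return x
"""
# one if, one or, one conditional expression -> 1 + 3 = 4
```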
## Governance & Engineering Integration
- CI quality gates: `check --staged` with configurable complexity thresholds and blast radius rules, exit code 0/1 for pipeline integration
- Architecture boundary rules: User-defined inter-module dependency constraints with built-in onion architecture presets
- Dead code detection: Quickly locate unreferenced non-exported symbols via role classification
- Graph snapshots: save/restore for lightweight backup and rollback verification during refactoring
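An architecture boundary check of the onion-preset kind reduces to comparing each dependency edge against a layer ordering. The sketch below is hypothetical (the layer names and rule representation are made up, not Codegraph's configuration format):

```python
# Hypothetical onion layering: smaller index = more inner.
# Rule: an inner layer must never depend on a more outer layer.
LAYERS = {"domain": 0, "application": 1, "infrastructure": 2}

def boundary_violations(edges):
    """Return dependency edges that point from an inner layer to an outer one."""
    return [(src, dst) for src, dst in edges if LAYERS[src] < LAYERS[dst]]

edges = [
    ("application", "domain"),     # allowed: dependencies point inward
    ("domain", "infrastructure"),  # violation: inner depends on outer
]
violations = boundary_violations(edges)
```

In a CI gate, a non-empty violation list would map to exit code 1, failing the pipeline.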
## Architecture Design
The processing pipeline follows Source Files → tree-sitter Parse → Extract Symbols → Resolve Imports → SQLite DB → Query. A dual-engine parsing strategy is employed: the native Rust path uses napi-rs bindings with rayon multi-core parallelism (~3.2 ms/file build speed), while the WASM path serves as a fallback (~16.3 ms/file); both produce identical output. Incremental rebuilding uses a three-layer progressive strategy (journal → mtime+size → hash) for sub-second updates. There are only three core runtime dependencies (better-sqlite3, commander, web-tree-sitter); semantic search and MCP are optional, lazy-loaded modules.
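The three-layer incremental check can be sketched as a per-file decision: an unknown file must be parsed; a journal entry with matching mtime+size means it can be skipped cheaply; anything else falls back to a content hash. The journal schema below is a hypothetical illustration, not Codegraph's actual on-disk format:

```python
import hashlib
import os
import tempfile

def needs_reparse(path, journal):
    """Three-layer progressive check: journal -> mtime+size -> hash."""
    entry = journal.get(path)
    if entry is None:                      # layer 1: file not in the journal
        return True
    st = os.stat(path)
    if (st.st_mtime_ns, st.st_size) == (entry["mtime"], entry["size"]):
        return False                       # layer 2: cheap metadata match
    with open(path, "rb") as f:            # layer 3: content hash settles it
        digest = hashlib.sha256(f.read()).hexdigest()
    return digest != entry["sha256"]

# demo: journal a fresh file, then mutate it
tmp = tempfile.mkdtemp()
path = os.path.join(tmp, "example.py")
with open(path, "w") as f:
    f.write("x = 1\n")
st = os.stat(path)
with open(path, "rb") as f:
    journal = {path: {"mtime": st.st_mtime_ns, "size": st.st_size,
                      "sha256": hashlib.sha256(f.read()).hexdigest()}}
unchanged = needs_reparse(path, journal)   # False: metadata still matches
with open(path, "w") as f:
    f.write("x = 1000\n")                  # content (and size) changed
changed = needs_reparse(path, journal)     # True: hash no longer matches
```

The ordering matters: the expensive hash is computed only for the small set of files whose metadata changed, which is what makes sub-second rebuilds plausible on large repositories.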
## Use Cases
- AI coding agents querying impact scope before modifications to avoid cascading breakage
- Automated exposure of structural issues (dead code, circular dependencies, boundary violations) during code review
- Setting complexity and blast radius thresholds as quality gates in CI/CD pipelines
- Rapid understanding of unfamiliar codebase module structure and call relationships
- Impact verification and safe rollback before and after refactoring
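The impact-scope query in the first use case amounts to a reachability search over the reverse call graph: every symbol that can transitively reach a changed function is inside the blast radius. A minimal BFS sketch, with made-up symbol names:

```python
from collections import deque

def blast_radius(changed, reverse_calls):
    """BFS over the reverse call graph: all callers transitively affected."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        fn = queue.popleft()
        for caller in reverse_calls.get(fn, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen - set(changed)

# caller lists keyed by callee (hypothetical symbols)
reverse_calls = {
    "parseExpr": ["parseStmt"],
    "parseStmt": ["parseFile"],
    "parseFile": ["main"],
}
impacted = blast_radius(["parseExpr"], reverse_calls)
```

An agent can run such a query before editing `parseExpr` and know that `parseStmt`, `parseFile`, and `main` all sit downstream of the change.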