Persistent semantic memory layer for AI coding agents built on Qdrant, featuring cross-session context recovery, semantic decay, 3-layer security scanning, and progressive context injection.
AI-Memory is persistent semantic memory middleware for AI coding agents, including Claude Code, Gemini CLI, Cursor IDE, and Codex CLI. It uses the Qdrant vector database for cross-session context persistence, automatically restoring memory from previous sessions to eliminate "AI amnesia."
The system organizes memory into 5 specialized Collections—code-patterns (HOW), conventions (WHAT), discussions (WHY), github (WHEN), and jira-data—spanning 31 distinct memory types. An exponential decay scoring mechanism applies a different half-life per type (code 14d / discussions 21d / conventions 60d), so memories age out naturally over time.
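The decay scoring described above can be sketched as a standard exponential half-life formula. The half-life values come from this section; the function and dictionary names below are illustrative, not the project's actual API, and the fallback half-life is an assumption.

```python
HALF_LIFE_DAYS = {
    "code-patterns": 14,   # code memories halve in relevance every 14 days
    "discussions": 21,
    "conventions": 60,     # conventions decay slowest
}

def decayed_score(base_score: float, age_days: float, memory_type: str) -> float:
    """Exponential decay: the score halves once per half-life for the type."""
    half_life = HALF_LIFE_DAYS.get(memory_type, 30)  # assumed default
    return base_score * 0.5 ** (age_days / half_life)

# A 14-day-old code-pattern memory scores exactly half its original value.
print(decayed_score(1.0, 14, "code-patterns"))  # → 0.5
```

Because each type has its own half-life, a 60-day-old convention still scores as highly as a 14-day-old code pattern, which matches the intuition that conventions stay relevant longer than implementation details.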
Context delivery uses 3-tier progressive injection: a session bootstrap summary, then per-turn supplementation, then confidence-filtered retrieval, all constrained by token budgets. Code and text are embedded with separate Jina AI models (dual embedding routing), improving retrieval precision by 10-30%. For security, a 3-layer pipeline (regex + detect-secrets + SpaCy NER) scans for PII and secrets before anything is stored.
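The first layer of the security pipeline can be sketched as a regex pass over candidate memories. The specific patterns below are illustrative examples of common secret formats, not the project's actual rule set; the detect-secrets and SpaCy NER layers (2 and 3) are omitted from this sketch.

```python
import re

# Layer 1: cheap regex rules for obvious secrets (illustrative patterns only).
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                   # GitHub personal token
    re.compile(r"-----BEGIN (?:RSA )?PRIVATE KEY-----"),  # PEM private key
]

def regex_scan(text: str) -> bool:
    """Return True if any known secret pattern appears in the text."""
    return any(p.search(text) for p in SECRET_PATTERNS)
```

Running the cheapest layer first lets most clean content skip the heavier detect-secrets and NER passes entirely.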
Auto-triggers cover error detection, new file creation, first edits, decision keywords (20 patterns), best practice keywords (27 patterns), and more—enabling zero-intervention memory capture. GitHub sync supports AST-aware semantic indexing of 9 content types; Jira Cloud integration provides JQL query sync with tenant isolation.
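A keyword-based auto-trigger like the ones above can be sketched as a list of compiled patterns checked against each message. The project ships 20 decision and 27 best-practice patterns; the few shown here are assumptions for illustration, as is the function name.

```python
import re

# Illustrative subset of decision-keyword triggers (not the real pattern set).
DECISION_PATTERNS = [
    re.compile(r"\bwe (?:decided|chose|agreed)\b", re.I),
    re.compile(r"\blet's go with\b", re.I),
    re.compile(r"\binstead of\b", re.I),
]

def should_capture(message: str) -> bool:
    """Fire a zero-intervention memory capture when a decision keyword appears."""
    return any(p.search(message) for p in DECISION_PATTERNS)
```

Each agent turn is checked against the patterns; a match stores the surrounding context as a memory without any explicit user command.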
Optional capabilities include: Langfuse 9-step pipeline tracing, Prometheus + Grafana monitoring dashboards, LLM-as-Judge evaluation pipeline (6 evaluators + 5 golden datasets + CI gate), Parzival AI project management agent (parallel team orchestration + adversarial review + behavior drift prevention), and knowledge discovery with automatic Claude Code skill generation. AsyncSDKWrapper provides a Python async SDK with built-in token bucket rate limiting and exponential backoff retry.
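The rate limiting and retry behavior attributed to AsyncSDKWrapper above can be sketched with a token bucket and exponential backoff. This is a minimal standalone illustration, not the project's actual wrapper; class and parameter names are assumptions.

```python
import asyncio
import random
import time

class TokenBucket:
    """Simple token bucket: refills at `rate` tokens/sec, bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = float(capacity), time.monotonic()

    async def acquire(self) -> None:
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Sleep just long enough for one token to accrue.
            await asyncio.sleep((1 - self.tokens) / self.rate)

async def with_backoff(coro_factory, retries: int = 5, base: float = 0.5):
    """Retry an awaitable factory with exponential backoff plus small jitter."""
    for attempt in range(retries):
        try:
            return await coro_factory()
        except Exception:
            if attempt == retries - 1:
                raise
            await asyncio.sleep(base * 2 ** attempt + random.random() * 0.05)
```

A caller would `await bucket.acquire()` before each request and wrap the request itself in `with_backoff`, so transient failures retry with growing delays while sustained throughput stays under the configured rate.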
Prerequisites: Python 3.10+, Docker 20.10+, 16 GiB RAM (32 GiB with Langfuse), supporting macOS / Linux / Windows (WSL2).
Quick Start:
./scripts/stack.sh start
SEED_BEST_PRACTICES=true ./scripts/install.sh /path/to/your-project
Key Slash Commands: /aim-status (system status), /aim-search <query> (semantic search), /aim-github-sync (GitHub sync), /aim-jira-sync (Jira sync), /aim-freshness-report (stale memory scan), /pov:parzival (activate AI PM).