A lightweight, lightning-fast, in-process vector database powered by Alibaba's Proxima engine, featuring zero-config deployment, billion-scale millisecond search, and hybrid sparse/dense retrieval.
## Introduction
zvec is an open-source in-process vector database from Alibaba, designed for extreme performance and minimalist deployment. It requires no standalone server and embeds directly into Python applications, supporting millisecond queries over billion-scale data. Built on Alibaba's Proxima engine, it ensures industrial-grade stability.
## Core Features
- High-Performance Search: Searches billions of vectors in milliseconds
- Zero-Config Deployment: Install and start searching in seconds. No servers, no config, no fuss
- Hybrid Retrieval: Combines semantic similarity with structured filtering
- Multi-Vector Support: Native support for both Dense and Sparse vectors
- In-Process Runtime: Embeds directly into host applications with zero network latency
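The dense/sparse distinction is independent of the library: a dense vector is a fixed-length list of floats (here, VECTOR_FP32), while a sparse vector maps dimension indices to weights. A minimal sketch of how each kind is typically scored; these functions are illustrative, not zvec's internal implementation:

```python
import math

def dense_score(a, b):
    """Cosine similarity between two dense FP32 vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def sparse_score(a, b):
    """Dot product of two sparse vectors stored as {dimension_index: weight}."""
    return sum(w * b[i] for i, w in a.items() if i in b)

# Dense: every dimension has a value.
print(dense_score([0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.3, 0.1]))

# Sparse: only non-zero dimensions are stored; overlap drives the score.
print(sparse_score({1: 0.5, 7: 1.2}, {7: 0.8, 9: 0.3}))
```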
## Installation & Quick Start
Requirements: Python 3.10 - 3.12
Supported Platforms:
- Linux (x86_64, ARM64)
- macOS (ARM64)
Install:

```shell
pip install zvec
```
## Usage Example
```python
import zvec

# Define schema
schema = zvec.CollectionSchema(
    name="example",
    vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 4),
)

# Create and open collection
collection = zvec.create_and_open(path="./zvec_example", schema=schema)

# Insert documents
collection.insert([
    zvec.Doc(id="doc_1", vectors={"embedding": [0.1, 0.2, 0.3, 0.4]}),
    zvec.Doc(id="doc_2", vectors={"embedding": [0.2, 0.3, 0.4, 0.1]}),
])

# Query by vector similarity
results = collection.query(
    zvec.VectorQuery("embedding", vector=[0.4, 0.3, 0.3, 0.1]),
    topk=10,
)
print(results)  # [{'id': str, 'score': float, ...}, ...]
```
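Conceptually, the query ranks stored documents by similarity between their vectors and the query vector, then returns the topk best matches. A brute-force sketch of that behavior over the two example documents (zvec answers queries through Proxima's indexes rather than a linear scan; cosine similarity is used here only as an illustrative metric):

```python
import math

# The two documents from the usage example, keyed by ID.
docs = {
    "doc_1": [0.1, 0.2, 0.3, 0.4],
    "doc_2": [0.2, 0.3, 0.4, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def query(vector, topk=10):
    """Score every stored vector, then keep the topk highest-scoring IDs."""
    scored = [{"id": doc_id, "score": cosine(vector, v)} for doc_id, v in docs.items()]
    return sorted(scored, key=lambda r: -r["score"])[:topk]

# doc_2 scores higher than doc_1 for this query vector.
print(query([0.4, 0.3, 0.3, 0.1]))
```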
## Core Concepts
- Collection: A named container of documents and their vector indexes, opened from a local path
- CollectionSchema: Metadata definition, including vector field types (e.g., VECTOR_FP32) and dimensions
- Doc: Document object carrying an ID and its vector data
- VectorQuery: Query object encapsulating the query vector and top-k parameters
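How these concepts fit together can be mirrored with plain dataclasses. This is a hypothetical sketch of the relationships, not the library's actual class definitions:

```python
from dataclasses import dataclass

@dataclass
class VectorSchema:
    name: str        # vector field name, e.g. "embedding"
    data_type: str   # e.g. "VECTOR_FP32"
    dimension: int   # fixed length of each dense vector

@dataclass
class CollectionSchema:
    name: str
    vectors: VectorSchema  # the vector field this collection indexes

@dataclass
class Doc:
    id: str
    vectors: dict  # field name -> vector values

@dataclass
class VectorQuery:
    field: str     # which vector field to search
    vector: list   # the query vector

# A schema constrains the docs a collection accepts; a query targets one field.
schema = CollectionSchema("example", VectorSchema("embedding", "VECTOR_FP32", 4))
doc = Doc("doc_1", {"embedding": [0.1, 0.2, 0.3, 0.4]})
q = VectorQuery("embedding", [0.4, 0.3, 0.3, 0.1])
```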
## Typical Use Cases
- RAG / Retrieval-Augmented Generation: Local knowledge base retrieval layer for LLMs
- Edge Computing / On-Device AI: Vector search in resource-constrained or offline environments
- Semantic Search & Recommendation: High-performance vector retrieval with metadata filtering
- Rapid Prototyping: Test vector models in Notebooks without setting up servers
## Architecture
- In-Process Architecture: Compiled to native machine code, loaded directly into application process space via language bindings
- Local File Storage: Data is persisted to the directory given by the `path` parameter
- Underlying Engine: Built on Proxima, Alibaba's battle-tested vector search engine
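The local-storage contract amounts to: the directory named by `path` is created on first open and reloaded on subsequent opens. A toy sketch of that contract using a JSON file (zvec's actual on-disk format is Proxima's, not JSON, and `create_and_open`/`persist` here are illustrative stand-ins):

```python
import json
import os
import tempfile

def create_and_open(path):
    """Toy stand-in: create the collection directory if missing, else reload it."""
    os.makedirs(path, exist_ok=True)
    data_file = os.path.join(path, "docs.json")
    if os.path.exists(data_file):
        with open(data_file) as f:
            return json.load(f)
    return {}

def persist(path, docs):
    """Write the in-memory docs back under the collection's path."""
    with open(os.path.join(path, "docs.json"), "w") as f:
        json.dump(docs, f)

root = tempfile.mkdtemp()
docs = create_and_open(root)          # first open: empty collection
docs["doc_1"] = [0.1, 0.2, 0.3, 0.4]
persist(root, docs)
reloaded = create_and_open(root)      # reopen: data survives the round trip
```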