Lightspeed Core Stack (LCS) is an AI-powered assistant service built on FastAPI that answers product questions using backend LLM services, agents, and RAG databases. It integrates multiple LLM providers (OpenAI, Azure, VertexAI, WatsonX, vLLM) via Llama Stack, with support for MCP tool calling, RAG configuration, streaming queries, and enterprise Kubernetes deployment.
## Key Features
- Multi-LLM Provider Support: OpenAI (gpt-5, gpt-4o, o1, o3, o4), Azure OpenAI, Google VertexAI (gemini-2.0-flash, gemini-2.5-pro), IBM WatsonX, vLLM
- MCP Server Integration: Static file token, K8s Service Account, client token, OAuth, automatic header propagation
- RAG Configuration: Vector store and retrieval-augmented generation
- User Data Collection: Feedback data and conversation transcript storage, export to Red Hat Dataverse
- Safety Shields: Input/output stream monitoring
- System Prompts: configurable via file path reference, inline literal, or custom profile
## Deployment Modes
| Mode | Description |
|---|---|
| Server Mode | Llama Stack as standalone service, LCS connects via REST API |
| Library Mode | Llama Stack embedded in LCS process |
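The mode is selected with the `use_as_library_client` flag under `llama_stack`. A minimal sketch of the two variants (the URL matches the default port used elsewhere in this document; any additional library-mode settings are omitted here and should be taken from the project's configuration reference):

```yaml
# Server mode: LCS connects to a separately running Llama Stack over REST
llama_stack:
  use_as_library_client: false
  url: http://localhost:8321
---
# Library mode: Llama Stack runs embedded in the LCS process
llama_stack:
  use_as_library_client: true
```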
## REST API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/v1/query` | POST | Non-streaming query |
| `/v1/streaming-query` | POST | Streaming query |
| `/v1/models` | GET | List available models |
| `/v1/readiness` | GET | Readiness check |
| `/v1/liveness` | GET | Liveness check |
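The query endpoints can be called with any HTTP client. Below is a minimal sketch using Python's standard library; the `query` payload field and the base URL are assumptions drawn from the configuration example in this document, not a confirmed request schema, so consult the service's OpenAPI spec for the real field names.

```python
import json
from urllib.request import Request

# Host and port taken from the configuration example in this document.
BASE_URL = "http://localhost:8080"

def build_query_request(question: str, streaming: bool = False) -> Request:
    """Build a POST request for /v1/query or /v1/streaming-query.

    The payload shape ({"query": ...}) is an illustrative assumption;
    check the running service's OpenAPI schema for actual fields.
    """
    endpoint = "/v1/streaming-query" if streaming else "/v1/query"
    payload = json.dumps({"query": question}).encode("utf-8")
    return Request(
        BASE_URL + endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("How do I configure MCP servers?")
# Sending is intentionally left out; urlopen(req) would POST to a
# running LCS instance and return the (possibly streamed) answer.
```

For `/v1/streaming-query`, the response arrives incrementally, so a real client would read the response body chunk by chunk rather than waiting for the full payload.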
## Quick Start
```shell
# PyPI installation
pip install lightspeed-stack

# Container deployment
podman pull quay.io/lightspeed-core/lightspeed-stack:latest
podman run -it -p 8080:8080 \
  -v my-config.yaml:/app-root/lightspeed-stack.yaml:Z \
  quay.io/lightspeed-core/lightspeed-stack:latest
```
## Configuration Example
```yaml
name: lightspeed-service
service:
  host: localhost
  port: 8080
  auth_enabled: false
llama_stack:
  use_as_library_client: false
  url: http://localhost:8321
user_data_collection:
  feedback_enabled: true
  transcripts_enabled: true
mcp_servers:
  - name: "filesystem-tools"
    url: "http://localhost:9000"
```
## Project Info
- Primary Language: Python (93.5%; requires 3.12 or 3.13)
- Framework: FastAPI + Uvicorn
- License: Apache-2.0
- Container Image: quay.io/lightspeed-core/lightspeed-stack