Easily deploy Haystack pipelines and agents as REST APIs and MCP Tools. Officially maintained by deepset, featuring OpenAI-compatible endpoints, Chainlit UI integration, streaming responses, and file upload support.
Overview#
Hayhooks is the official tool from deepset for serving Haystack pipelines as production-ready APIs. It supports one-click deployment of pipelines and agents as REST APIs, with native MCP Protocol support for seamless integration with MCP clients like Cursor and Claude Desktop.
Key Features#
REST API Deployment#
- Deploy Haystack pipelines and agents as REST APIs
- Support both YAML-based and wrapper-based deployment
- Auto-generate OpenAI-compatible `/chat/completions` endpoints
MCP Protocol Support#
- Run as MCP Server, exposing each pipeline/agent as an MCP Tool
- Integrate with MCP clients like Cursor and Claude Desktop
- Control core operations conversationally (deploy, undeploy, list, and run pipelines)
Frontend Integration#
- Chainlit UI: Embedded chat frontend, zero-config, streaming output, custom components
- Open WebUI: OpenAI-compatible backend with event notifications and status feedback
Other Capabilities#
- Built-in file upload handling for RAG systems and document processing
- Multi-component streaming with `async_streaming_generator` and `streaming_generator`
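To illustrate the idea behind such streaming helpers, here is a simplified, stdlib-only sketch of the callback-to-generator bridge they implement: a pipeline emits chunks through a callback, and a generator re-exposes them as an iterable suitable for a streaming HTTP response. All names here (`streaming_generator`, `fake_pipeline`) are illustrative, not Hayhooks' actual implementation.

```python
import queue
import threading

def streaming_generator(run_pipeline):
    """Bridge a callback-style streaming pipeline into a sync generator
    (simplified sketch of the pattern, not the Hayhooks internals)."""
    q = queue.Queue()
    DONE = object()  # sentinel marking end of stream

    def on_chunk(chunk):
        # Passed to the pipeline as its streaming callback.
        q.put(chunk)

    def worker():
        try:
            run_pipeline(on_chunk)  # run the pipeline, emitting chunks via on_chunk
        finally:
            q.put(DONE)

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not DONE:
        yield item

# Example: a fake "pipeline" that streams three tokens through its callback
def fake_pipeline(callback):
    for token in ["Hel", "lo", "!"]:
        callback(token)

print("".join(streaming_generator(fake_pipeline)))  # -> Hello!
```

Running the pipeline on a worker thread lets the generator start yielding chunks to the client before the pipeline has finished.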
Typical Use Cases#
- RAG System Deployment - Quickly deploy retrieval-augmented generation pipelines as APIs
- AI Agents as a Service - Wrap Haystack agents as callable REST/MCP tools
- Multi-LLM Streaming Pipelines - Chain multiple LLM components with automatic streaming
- Document Indexing & Query - Use Elasticsearch for document indexing and semantic search
Installation & Quick Start#
# Basic installation
pip install hayhooks
# MCP support
pip install "hayhooks[mcp]"
# Chainlit UI
pip install "hayhooks[chainlit]"
# Start server
hayhooks run
# With Chainlit UI
hayhooks run --with-chainlit
Core Classes & API#
- `BasePipelineWrapper` - Base class for pipeline wrappers
- `setup()` - Initialize the pipeline/agent
- `run_api()` / `run_api_async()` - Custom HTTP POST endpoints
- `run_chat_completion_async()` - OpenAI-compatible endpoint
- `async_streaming_generator()` - Async streaming generator
- `streaming_generator()` - Sync streaming generator
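A minimal wrapper sketch tying these pieces together: subclass `BasePipelineWrapper`, build the pipeline in `setup()`, and let `run_api()`'s signature define the request schema of the generated POST endpoint. The `EchoAgentWrapper` name and its echo body are illustrative, and the `ImportError` stub exists only so the snippet runs without hayhooks installed; in a real project you would keep just the import and wire up an actual Haystack pipeline.

```python
try:
    from hayhooks import BasePipelineWrapper
except ImportError:
    # Stand-in with the same role as the real base class, so this
    # sketch runs standalone without hayhooks installed.
    class BasePipelineWrapper:
        pass

class EchoAgentWrapper(BasePipelineWrapper):
    def setup(self) -> None:
        # Build or load your Haystack pipeline/agent here (runs once at deploy time),
        # e.g. Pipeline.loads(Path("pipeline.yml").read_text()).
        self.pipeline = None  # no real pipeline wired up in this sketch

    def run_api(self, question: str) -> str:
        # Body of the generated POST endpoint; the parameter names define
        # the expected JSON request fields (here: {"question": ...}).
        return f"You asked: {question}"

wrapper = EchoAgentWrapper()
wrapper.setup()
print(wrapper.run_api("What can you do?"))
```

Deployed under Hayhooks, a wrapper like this is what serves requests such as the `curl` examples below.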
HTTP API Examples#
# Custom endpoint
curl -X POST http://localhost:1416/my_agent/run \
-H 'Content-Type: application/json' \
-d '{"question": "What can you do?"}'
# OpenAI-compatible endpoint
curl -X POST http://localhost:1416/chat/completions \
-H 'Content-Type: application/json' \
-d '{"model": "my_agent", "messages": [{"role": "user", "content": "Hello"}]}'
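The same OpenAI-compatible call can be made from Python with only the standard library. This sketch builds the request equivalent to the `curl` example above; the `build_chat_request` helper is illustrative, not part of Hayhooks, and actually sending it requires a running Hayhooks server (default `http://localhost:1416`).

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Build an OpenAI-style /chat/completions request for a Hayhooks server."""
    payload = {
        "model": model,  # the deployed pipeline/agent name
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With a server running, send it like this:
# req = build_chat_request("http://localhost:1416", "my_agent", "Hello")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Because the endpoint is OpenAI-compatible, any OpenAI SDK pointed at the server's base URL should work as well.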
Project Info#
- Version: v1.13.0 (Beta)
- License: Apache-2.0
- Python Version: >=3.10, <3.15
- Deployment: Docker Compose, direct Python execution