A free, open-source course that teaches how to build AI agents capable of understanding images, text, audio and videos, connecting components through the MCP (Model Context Protocol).
One-Minute Overview#
Kubrick Course is a free, open-source project developed by The Neural Maze and Neural Bits in collaboration with Pixeltable and Opik. This course is designed for developers who want to go beyond basics and build production-ready AI systems. Through this course, you'll learn how to construct MCP multimodal agents capable of handling video tasks and understanding various modal data including images, text, audio and videos.
Core Value: Learn to build complete multimodal AI systems through hands-on implementation, integrating LLMOps best practices with comprehensive guidance from concept to production.
Quick Start#
Installation Difficulty: Medium - Requires setting up multiple components including Pixeltable, FastMCP and API models
git clone https://github.com/the-ai-merge/multimodal-agents-course.git
Core Capabilities#
1. Multimodal Data Processing#
- Use Pixeltable to build multimodal data processing pipelines and stateful agents
- Support comprehensive analysis of video, images, audio, and text
2. MCP Server Construction#
- Build complex MCP servers using FastMCP to expose resources, prompts, and tools
- Implement custom MCP clients to connect with agents
3. Prompt Version Management#
- Implement MCP prompt versioning with Opik
- Implement custom tracing and monitoring
Technical Stack#
- Python
- FastMCP
- Pixeltable
- FastAPI
- Opik
- Groq