Vision-Agents
An open-source framework by Stream for building vision AI agents that work with any model or video provider, leveraging Stream's edge network for ultra-low-latency video experiences.
Blades is a multimodal AI agent framework for Go, supporting custom models, tools, memory, middleware, and more. It is designed for multi-turn conversations, chain-of-thought reasoning, and structured-output applications.
A Python library for orchestrating zero-shot computer vision models, enabling custom end-to-end pipeline creation without needing to collect and annotate large training datasets.
A tool that gracefully solves hCaptcha challenges using multimodal large language models, without relying on browser extensions or third-party captcha services.
A Python library for building multimodal language agents with ease, wrapping complex engineering behind a simple interface while supporting multiple modalities including text, images, videos, and audio.