Vision-Agents
An open-source framework by Stream for building vision AI agents that work with any model or video provider, leveraging Stream's edge network for ultra-low latency video experiences.
Blades is a multimodal AI Agent framework for the Go language, supporting custom models, tools, memory, middleware, and more. It's designed for multi-turn conversations, chain-of-thought reasoning, and structured output applications.
A Python library for orchestrating zero-shot computer vision models, enabling custom end-to-end pipeline creation without needing to collect and annotate large training datasets.
An open-source intelligent assistant framework for mobile devices that understands screen content through multimodal methods and performs automated operations to help users complete tasks.
A tool that gracefully solves hCaptcha challenges using multimodal large language models, without relying on browser extensions or third-party captcha services.
Nekro Agent is a multi-person interactive agent framework that combines code execution with high extensibility, featuring a sandbox-driven architecture, a visual interface, and multimodal interaction support across multiple platforms.
A groundbreaking visual AI development environment for building no-code data pipelines and multimodal agents with real-time capabilities, social connectors, and AI-powered tools.
A curated collection showcasing what's possible with ChatGPT Code Interpreter, featuring experiments that push boundaries and unlock creative potential through the combination of AI and code execution.
Director is a framework for AI video agents that can reason through complex video tasks such as search, editing, compilation, and generation, and instantly stream the results. It's built on VideoDB's 'video-as-data' infrastructure.
AIlice is a fully autonomous, general-purpose AI agent based on open-source LLMs. Using its unique Interactive Agents Call Tree (IACT) architecture, it decomposes complex tasks into dynamically constructed agents with high fault tolerance, enabling seamless task execution and result integration.