SmartCall-Agent
A modular voice AI platform based on LiveKit and OpenAI Realtime API, integrating RAG knowledge retrieval, JWT authentication, and MongoDB persistence for real-time outbound calling and domain-specific conversations.
On-device full-stack AI SDK for Flutter with LLM, Vision, Speech, Image Gen, and RAG; features compute budget contracts and adaptive QoS with zero cloud dependency.
An open-source framework for building, evaluating, and training general multi-agent systems. Features natural language agent creation, a distributed reinforcement learning training pipeline, and complex environment interactions. Ranks at the top of benchmarks including GAIA, OSWorld, and VisualWebArena.
A local-first AI workspace built on Rust and Tauri, acting as an AI coworker for secure, supervised automation on any folder. Supports multiple LLM backends, MCP protocol extension, and multimodal file processing.
A lightweight yet complete LLM/Agent application development framework. Uses decorators to transform function signatures and docstrings into prompts, enabling type-safe LLM capabilities without function body implementation. Features multi-provider support, multimodal I/O, tool calling, streaming, API key load balancing, and Langfuse observability integration.
An open-source full-stack framework for autonomous computer agents, enabling control of browsers, terminals, and desktop apps via natural language in Docker VMs. Maintained by coasty-ai under the Apache 2.0 license, achieving 82% on the OSWorld benchmark.
The Context Optimization Layer for LLM Applications, delivering 40-90% token reduction through deterministic compression and intelligent caching, with multimodal support and a reversible CCR mechanism.
A cross-platform CLI tool and browser extension that extracts and summarizes content from any URL (websites, YouTube, podcasts) or local files (PDF, audio/video, images), supporting multiple LLM providers and local models.
No-code LLM platform to launch APIs and ETL pipelines that structure unstructured documents—the Data Layer for your Agentic Workflows. Features Prompt Studio for visual prompt engineering, one-click deployment, and close to 100% accuracy with LLMChallenge verification.
An on-device omnimodal LLM by Tsinghua THUNLP supporting vision, speech, and full-duplex multimodal live streaming, optimized for mobile deployment with performance rivaling Gemini 2.5 Flash.