DISCOVER THE FUTURE OF AI AGENTS

All Projects

54 projects

Clawd Cursor

AI desktop agent that sees your screen, controls your cursor, and completes tasks autonomously. Features a five-layer intelligent fallback pipeline, support for multiple AI providers (Anthropic/OpenAI/Ollama/Kimi), a web dashboard, and a REST API.

Multimodal · AI Agents · Agent Framework

Edge-Veda

On-device, full-stack AI SDK for Flutter covering LLM, vision, speech, image generation, and RAG; features compute-budget contracts and adaptive QoS with zero cloud dependency.

LLM · Multimodal · SDK

NagaAgent

A four-service collaborative AI desktop assistant framework with streaming tool calling, GRAG knowledge-graph memory, a Live2D avatar, and voice interaction.

RAG · Multimodal · AI Agents

Seline

A local-first AI desktop application integrating conversational AI, visual generation, vector search, and multi-channel connectivity, featuring deep research modes and local knowledge bases.

Multimodal · Model Context Protocol · RAG

Roboflow Trackers

A plug-and-play multi-object tracking (MOT) Python library offering modular implementations of classic algorithms such as SORT and ByteTrack. Its detector-agnostic design works with any object detection model (YOLO, DETR, etc.) and supports video files, cameras, RTSP streams, and more. Provides a unified CLI and Python API with built-in evaluation metrics (CLEAR, HOTA, Identity); see the usage sketch below.

Multimodal · Deep Learning · SDK
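
The detector-agnostic pattern works roughly as sketched below: any detector whose output can be converted into the common detections format can feed the tracker, which then assigns persistent IDs across frames. This is a minimal sketch assuming the `SORTTracker` class and `update()` method shown in the project's README, together with the `supervision` and `ultralytics` packages; verify the exact API against the installed version.

```python
# Minimal sketch of the detector-agnostic tracking pattern.
# Assumes SORTTracker and its update() method as shown in the project README;
# verify names against the installed `trackers` version.
import supervision as sv
from ultralytics import YOLO       # any detector works; YOLO is just one choice
from trackers import SORTTracker   # pip install trackers

model = YOLO("yolov8n.pt")
tracker = SORTTracker()
annotator = sv.BoxAnnotator()

def callback(frame, _index):
    # Detect on the current frame, convert to the detector-agnostic format,
    # then update the tracker so detections carry persistent track IDs.
    result = model(frame, verbose=False)[0]
    detections = sv.Detections.from_ultralytics(result)
    detections = tracker.update(detections)
    return annotator.annotate(frame.copy(), detections)

# Run the callback over every frame and write the annotated result.
sv.process_video(source_path="input.mp4", target_path="tracked.mp4", callback=callback)
```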

MiniCPM-o

An on-device (end-side) omnimodal LLM from Tsinghua's THUNLP lab supporting vision, speech, and full-duplex multimodal live streaming, optimized for mobile deployment with performance rivaling Gemini 2.5 Flash.

LLM · Multimodal · Transformers

CogAgent

An open-source, end-to-end VLM-based GUI agent developed by Tsinghua University and Zhipu AI. Built on the bilingual GLM-4V-9B VLM, it enables cross-platform GUI automation and reasoning from screenshots and natural-language instructions; see the interaction sketch below.

Model & Inference Framework · LLM · Multimodal
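
The screenshot-in, action-out loop that CogAgent-style GUI agents follow can be summarized as below. This is a hypothetical sketch of the interaction pattern only: `take_screenshot`, `query_gui_agent`, and `execute_action` are illustrative placeholders, not CogAgent's actual inference API; see the project repository for real model calls.

```python
# Hypothetical sketch of the screenshot -> instruction -> action loop used by
# VLM-based GUI agents such as CogAgent. The callables passed in are
# illustrative placeholders, not CogAgent's real API.
from dataclasses import dataclass
from typing import Callable

@dataclass
class GuiAction:
    kind: str                                 # e.g. "click", "type", "scroll", "finish"
    target: tuple[int, int] | None = None     # screen coordinates for clicks
    text: str | None = None                   # text to type, if any

def run_task(
    instruction: str,
    take_screenshot: Callable[[], bytes],
    query_gui_agent: Callable[[bytes, str, list[GuiAction]], GuiAction],
    execute_action: Callable[[GuiAction], None],
    max_steps: int = 20,
) -> list[GuiAction]:
    """Drive a GUI task from a natural-language instruction.

    Each step: capture the current screen, ask the VLM for the next grounded
    action (with the action history as context), execute it, and stop when
    the model signals completion.
    """
    history: list[GuiAction] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = query_gui_agent(screenshot, instruction, history)
        if action.kind == "finish":
            break
        execute_action(action)
        history.append(action)
    return history
```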

MobileAgent

An autonomous mobile-agent framework powered by multimodal large language models (MLLMs), automating mobile-app operations and task execution through visual perception and tool invocation.

Model & Inference Framework · LLM · Multimodal

AlphaAvatar

A learnable, configurable, and pluggable Omni-Avatar Assistant framework built on LiveKit, featuring real-time interaction, multimodal memory, user persona, and external tool integration.

Docs, Tutorials & Resources · RAG · Multimodal

WiFi DensePose

A production-ready implementation of InvisPose that enables real-time, camera-free full-body tracking through walls using commodity WiFi mesh routers and CSI (channel state information) signals, with advanced analytics such as fall detection and multi-person tracking.

Multimodal · Deep Learning · Docker
