Agent Park - Agent Project Navigator

All Projects

17 projects

Clawd Cursor

An AI desktop agent that sees your screen, controls your cursor, and completes tasks autonomously. It features a 5-layer intelligent fallback pipeline, support for multiple AI providers (Anthropic/OpenAI/Ollama/Kimi), a web dashboard, and a REST API.

Multimodal, AI Agents, Agent Framework

Seline

A local-first AI desktop application integrating conversational AI, visual generation, vector search, and multi-channel connectivity, featuring deep research modes and local knowledge bases.

Multimodal, Model Context Protocol, RAG

CogAgent

An open-source, end-to-end, VLM-based GUI agent developed by Tsinghua University and Zhipu AI. Built on the bilingual GLM-4V-9B VLM, it enables cross-platform GUI automation and reasoning from screenshots and natural-language instructions.

Model & Inference Framework, Large Language Models, Multimodal

MobileAgent

MobileAgent is an autonomous mobile agent framework powered by multimodal large language models (MLLMs). It automates mobile app operations and task execution through visual perception and tool invocation.

Model & Inference Framework, Large Language Models, Multimodal

FilmAgent

FilmAgent is a multi-agent collaborative system for end-to-end film automation in 3D virtual spaces. It simulates key crew roles (directors, screenwriters, actors, and cinematographers) and integrates efficient human workflows within a sandbox environment.

Agent & Tooling, Python, C#

Open-AutoGLM

An open-source intelligent assistant framework for mobile devices that understands screen content through multimodal methods and performs automated operations to help users complete tasks.

Agent & Tooling, Python, Agent Framework

JarvisArt

JarvisArt is a multimodal large language model (MLLM)-driven agent for intelligent photo retouching. It understands user intent, mimics the reasoning of professional artists, and coordinates over 200 tools in Adobe Lightroom.

Agent & Tooling, Python, AI Agents

ScreenAgent

A computer control agent driven by large vision-language models. It lets AI interact with GUIs by observing screenshots and outputting mouse and keyboard operations to complete multi-step tasks.

Agent & Tooling, Python, PyTorch

SeeAct

SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, built around large multimodal models (LMMs) such as GPT-4V. It consists of a codebase for running web agents on live websites and a framework that uses LMMs as generalist web agents.

Agent & Tooling, Python, Playwright

Magick

A groundbreaking visual AI development environment for building no-code data pipelines and multimodal agents with real-time capabilities, social connectors, and AI-powered tools.

Agent & Tooling, Docker, PostgreSQL

