GenericAgent

Added Apr 23, 2026
Agent & Tooling
Open Source
Workflow Automation | AI Agents | Agent Framework | Browser Automation | Agent & Tooling | Automation, Workflow & RPA | Protocol, API & Integration

A minimal, self-evolving autonomous Agent framework that automatically consolidates task execution paths into reusable skills via layered memory

Core Positioning

GenericAgent is a minimalist autonomous agent framework with ~3,000 lines of core code and a ~100-line agent loop. It achieves multimodal, system-level control over browsers, terminals, file systems, keyboard/mouse, screen vision, and Android devices (via ADB) using only 9 atomic tools, 2 of which are memory management tools.

Self-Evolution Mechanism

After each new task completes, its execution path is automatically consolidated into a Skill stored in layered memory. Similar tasks then reuse existing skills, so the skill tree grows richer over time. According to the project, the entire repository was created autonomously by GenericAgent (including git init and all commits) without the author opening a terminal, which serves as a bootstrap proof.
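The consolidate-then-reuse cycle described above can be illustrated with a small sketch. This is not GenericAgent's implementation: the Skill and SkillStore types, the word-overlap matching, and the min_overlap threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """A reusable procedure distilled from one task run (hypothetical type)."""
    name: str
    keywords: frozenset
    steps: list


@dataclass
class SkillStore:
    """Toy skill memory: consolidate after a task, look up before the next one."""
    skills: list = field(default_factory=list)

    def consolidate(self, task: str, executed_steps: list) -> Skill:
        # Index the skill by the lowercase words of the task description.
        skill = Skill(
            name=task,
            keywords=frozenset(task.lower().split()),
            steps=list(executed_steps),
        )
        self.skills.append(skill)
        return skill

    def find(self, task: str, min_overlap: int = 2):
        # Return the stored skill sharing the most words with the new task,
        # or None when no skill overlaps enough (threshold is an assumption).
        words = set(task.lower().split())
        best, best_score = None, 0
        for skill in self.skills:
            score = len(words & skill.keywords)
            if score > best_score:
                best, best_score = skill, score
        return best if best_score >= min_overlap else None


store = SkillStore()
store.consolidate("download invoice pdf", ["web_scan", "web_execute_js", "file_write"])
reused = store.find("download receipt pdf")
print(reused.steps if reused else "no matching skill")
```

A real system would presumably match tasks with something richer than word overlap; the point here is only the shape of the loop: record the path once, then look it up before planning from scratch.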

Layered Memory System

| Layer | Name | Function |
| --- | --- | --- |
| L0 | Meta Rules | Core behavior rules and system constraints |
| L1 | Insight Index | Minimalist memory index for fast routing and recall |
| L2 | Global Facts | Stable knowledge accumulated over long-term operation |
| L3 | Task Skills / SOPs | Reusable processes for specific tasks |
| L4 | Session Archive | Archived records of completed tasks for long-range recall |
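One way to picture the five layers is as a set of stores where L1 acts as a small routing index into the deeper layers. The sketch below assumes that design; the LayeredMemory class, its methods, and the example keys are hypothetical and are not taken from the GenericAgent codebase.

```python
from enum import IntEnum


class Layer(IntEnum):
    META_RULES = 0       # L0
    INSIGHT_INDEX = 1    # L1
    GLOBAL_FACTS = 2     # L2
    TASK_SKILLS = 3      # L3
    SESSION_ARCHIVE = 4  # L4


class LayeredMemory:
    """Toy layered store in which L1 maps each key to the layer that holds it."""

    def __init__(self):
        self.layers = {layer: {} for layer in Layer}

    def write(self, layer: Layer, key: str, value) -> None:
        self.layers[layer][key] = value
        if layer > Layer.INSIGHT_INDEX:
            # Keep L1 tiny: it stores only where to look, not the content.
            self.layers[Layer.INSIGHT_INDEX][key] = layer

    def recall(self, key: str):
        # Route through the index first, then read from the target layer.
        layer = self.layers[Layer.INSIGHT_INDEX].get(key)
        if layer is None:
            return None
        return self.layers[layer].get(key)


memory = LayeredMemory()
memory.write(Layer.GLOBAL_FACTS, "default_browser", "chromium")
memory.write(Layer.TASK_SKILLS, "export_report", ["web_scan", "file_write"])
print(memory.recall("export_report"))
```

In this sketch L0 entries are not indexed, on the assumption that meta rules are always loaded into the prompt rather than recalled on demand.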

Minimal Token Consumption

The context window is kept under 30K tokens (vs. 200K–1M for comparable frameworks); layered memory keeps critical information present while reducing noise. The README claims "6x less token consumption", but specific benchmark data is unconfirmed.
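As a rough illustration of how a fixed budget can be enforced, the sketch below packs memory snippets into a context in priority order and skips anything that would exceed the limit. The 4-characters-per-token estimate, the priority scheme, and the function names are assumptions; the source does not describe how GenericAgent actually selects content.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return max(1, len(text) // 4)


def build_context(candidates, budget=30_000):
    """Pick snippets by priority until the token budget is used up.

    candidates: list of (priority, text); a lower number means more important.
    Returns the chosen texts and the estimated tokens they consume.
    """
    chosen, used = [], 0
    for _priority, text in sorted(candidates, key=lambda item: item[0]):
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # too large to fit; a smaller, later item may still fit
        chosen.append(text)
        used += cost
    return chosen, used


snippets = [
    (0, "meta rule: ask before deleting files"),
    (3, "archived session transcript " * 2000),
    (1, "index: export_report -> task skills"),
]
context, used_tokens = build_context(snippets, budget=30_000)
print(len(context), used_tokens)
```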

9 Atomic Tools

| Tool | Function |
| --- | --- |
| code_run | Execute arbitrary code |
| file_read | Read files |
| file_write | Write files |
| file_patch | Modify/patch files |
| web_scan | Perceive web page content |
| web_execute_js | Control browser behavior |
| ask_user | Human-in-the-loop confirmation |
| update_working_checkpoint | Persist current context |
| start_long_term_update | Long-term memory update |

Via code_run, the agent can dynamically install Python packages, write scripts, call external APIs, or control hardware, consolidating temporary capabilities into permanent tools.
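To make the idea of a small tool set driven by a short loop concrete, here is a minimal sketch of a dispatch loop over three of the tool names listed above. Only the names come from the source: the function bodies, the agent_loop signature, and the scripted policy that stands in for the LLM are all assumptions.

```python
import os
import subprocess
import sys
import tempfile


def code_run(code: str) -> str:
    """Run a Python snippet in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, timeout=30
    )
    return result.stdout.strip()


def file_write(path: str, content: str) -> str:
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return f"wrote {len(content)} characters"


def file_read(path: str) -> str:
    with open(path, encoding="utf-8") as handle:
        return handle.read()


TOOLS = {"code_run": code_run, "file_write": file_write, "file_read": file_read}


def agent_loop(next_action, max_turns=10):
    """Ask next_action(history) for a tool call each turn until it returns None."""
    history = []
    for _ in range(max_turns):
        action = next_action(history)
        if action is None:
            break
        name, kwargs = action
        tool = TOOLS.get(name)
        observation = tool(**kwargs) if tool else f"unknown tool: {name}"
        history.append((name, observation))
    return history


def scripted_policy(path):
    # Stand-in for the LLM: replays a fixed plan, one step per turn.
    plan = [
        ("file_write", {"path": path, "content": "hello"}),
        ("file_read", {"path": path}),
        ("code_run", {"code": "print(2 + 3)"}),
    ]
    return lambda history: plan[len(history)] if len(history) < len(plan) else None


with tempfile.TemporaryDirectory() as workdir:
    history = agent_loop(scripted_policy(os.path.join(workdir, "note.txt")))
print(history)
```

In a real agent the policy would be a model call that sees the history and chooses the next tool. Note that code_run here executes arbitrary code without sandboxing, so the sketch should only be run with trusted input.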

LLM Compatibility

GenericAgent supports Claude, Gemini, Kimi, MiniMax, and other mainstream LLMs. The interface format is selected by the variable name used in mykey.py: oai_config (OpenAI-compatible), claude_config (Claude-compatible), and native_oai_config / native_claude_config (standard tool calling for weaker models).
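Name-based selection of this kind could work roughly as follows. The four variable names are taken from the text above; the prefix-matching rule, the format labels, and the placeholder fields inside each config dict are assumptions, so consult mykey_template.py for the real structure.

```python
# Longer names are checked first; every lookup uses startswith, so
# "native_oai_config" can never be mistaken for plain "oai_config".
FORMATS = [
    ("native_claude_config", "claude-native-tool-calling"),
    ("native_oai_config", "openai-native-tool-calling"),
    ("claude_config", "claude-compatible"),
    ("oai_config", "openai-compatible"),
]


def detect_formats(namespace: dict) -> dict:
    """Map each recognised config variable name to an interface format label."""
    found = {}
    for name, value in namespace.items():
        if name.startswith("_") or not isinstance(value, dict):
            continue  # ignore private names and non-config values
        for prefix, label in FORMATS:
            if name.startswith(prefix):
                found[name] = label
                break
    return found


# Stand-in for the module namespace of mykey.py; all fields are placeholders.
example_mykey = {
    "oai_config": {"api_key": "YOUR_KEY"},
    "native_claude_config": {"api_key": "YOUR_KEY"},
    "unrelated_setting": 42,
}
print(detect_formats(example_mykey))
```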

Frontends & Integration

GenericAgent natively provides a Streamlit GUI, a Qt desktop app, and bot frontends for WeChat, QQ, Feishu, WeCom, DingTalk, and Telegram. Common chat commands are /new (start a new conversation) and /continue (restore a session snapshot). Advanced modes (Reflect, Plan, SubAgent, autonomous exploration, scheduled tasks) are self-documenting.

Typical Scenarios

  • Browser automation with preserved login state
  • Automated food delivery ordering
  • Quantitative stock screening
  • Mobile device control via ADB
  • Autonomous web exploration and periodic summarization
  • Chat platform Bot integration

Installation

git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent
pip install requests streamlit pywebview
cp mykey_template.py mykey.py
# Edit mykey.py and fill in your LLM API key
python launch.pyw

Minimal CLI startup: python3 agentmain.py

Unconfirmed Information

  • arXiv paper (2604.17091) full experimental data and benchmarks not reviewed in detail
  • "Million-level Skill library" specifics not detailed
  • "Dintal Claw" government bot has no independent link
  • Token consumption comparison lacks specific benchmark data
  • V1.0 public release date is marked as 2026-01-16, which shows a discrepancy with the current timeline
