A generative agent framework inspired by human dual-process theory, combining fast and slow thinking mechanisms with in-context reinforcement learning to efficiently solve complex interactive reasoning tasks.
## Overview
SwiftSage is an LLM-based agent system designed for tasks that require complex reasoning and multi-step interaction. Its core innovation is a dual-system architecture modeled on human dual-process cognitive theory.
## Core Architecture
Swift Agent (Fast Thinking): Based on smaller models (e.g., Llama 3.1 8B), responsible for rapid intuitive reasoning, generating initial plans and code snippets. Prioritizes high efficiency and low cost, corresponding to human System 1 cognition.
Sage Agent (Slow Thinking): Based on larger models (e.g., Llama 3.1 405B), intervenes when Swift mode fails. Responsible for deep analysis, error correction, and complex logical reasoning, corresponding to human System 2 cognition.
Feedback Agent (Evaluation): Based on medium-to-large models (e.g., Llama 3.1 70B), acts as a judge. It evaluates Swift/Sage outputs and provides specific textual feedback along with a numerical reward (1-10 scale).
## Key Innovations
In-context Reinforcement Learning (ICRL): Reinforcement learning without gradient updates. Historical feedback and rewards are injected into the context via prompts, guiding the agent to improve its next actions without any model fine-tuning.
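The idea can be sketched as follows. This is a hypothetical illustration, not SwiftSage's actual implementation; the function name and the history tuple format are assumptions:

```python
# Hypothetical sketch of ICRL: prior attempts, feedback, and rewards are
# serialized into the prompt instead of being used for weight updates.
def build_icrl_prompt(problem, history):
    """history: list of (attempt_code, feedback_text, reward) tuples."""
    lines = [f"Problem: {problem}", ""]
    for i, (code, feedback, reward) in enumerate(history, 1):
        lines.append(f"Attempt {i} (reward {reward}/10):")
        lines.append(code)
        lines.append(f"Feedback: {feedback}")
        lines.append("")
    lines.append("Write an improved Python solution that addresses the feedback.")
    return "\n".join(lines)
```

Each retry therefore sees a growing transcript of what was tried, how it was judged, and why, which is what lets a frozen model "learn" within a single episode.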
Python Code Executor: Built-in sandbox environment. Agent-generated actions take the form of Python code, which the executor runs to obtain results or state changes, following a unified Plan-Ground-Execute task paradigm.
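A minimal stand-in for such an executor is sketched below. This is illustrative only; a real sandbox would add process isolation, resource limits, and timeouts, none of which are shown here:

```python
import contextlib
import io

def run_snippet(code):
    """Run a code snippet, capturing stdout and any exception.
    Returns (stdout_text, error_repr_or_None)."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})  # fresh, empty globals for each snippet
        return buf.getvalue(), None
    except Exception as exc:
        return buf.getvalue(), repr(exc)
```

The captured output (or error) is what the Feedback Agent would then judge, grounding the agent's plan in actual execution results.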
Dynamic Switching Mechanism: Automatically switches between Swift and Sage modes based on task difficulty and Feedback scores, optimizing computational resource allocation.
## Workflow
- The user submits a problem; SwiftSage initializes
- The Swift Agent generates a solution approach and Python code; the Executor runs the code and returns the results
- The Feedback Agent evaluates the results:
  - Score meets the threshold (default ≥ 8) → return the final answer
  - Score insufficient → pass the feedback to the Swift Agent for the next iteration (up to `max_iterations`)
- If Swift still fails after the maximum number of iterations, switch to the Sage Agent for deep analysis
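The workflow above can be sketched as a control loop. This is an illustrative sketch only; the agent callables are placeholders, not SwiftSage's actual API:

```python
def solve(problem, swift, sage, feedback, max_iterations=5, reward_threshold=8):
    """Hypothetical SwiftSage-style loop: try the fast agent up to
    max_iterations times, escalating to the slow agent on failure."""
    notes = None
    for _ in range(max_iterations):
        answer = swift(problem, notes)           # System-1: fast attempt
        reward, notes = feedback(problem, answer)  # judge scores 1-10
        if reward >= reward_threshold:           # good enough: stop early
            return answer
    return sage(problem, notes)                  # System-2: deep analysis
```

The switching rule is what keeps costs low: the large Sage model is only invoked when the small Swift model's attempts repeatedly fall below the reward threshold.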
## Use Cases
- Complex scientific reasoning tasks (ScienceWorld benchmark)
- Multi-step mathematical problem solving (e.g., quadratic equations)
- Logic judgment and trap questions (e.g., 9.9 vs 9.11 comparison, character counting)
- Interactive applications balancing response speed and reasoning quality
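For the trap questions above, grounding the answer in executable code sidesteps mistakes LLMs often make on raw text. The snippet below shows the kind of code a Swift-style agent might generate (illustrative, not actual SwiftSage output):

```python
# Numeric comparison trap: as floats, 9.9 > 9.11 (LLMs sometimes compare
# "9" vs "11" as version-like strings and get this wrong).
print(9.9 > 9.11)

# Character-counting trap: delegate the count to string methods
# instead of reasoning over tokens.
print("My strawberry is red.".count("r"))
```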
## Academic Achievement
SwiftSage significantly outperforms SayCan, ReAct, and Reflexion on 30 ScienceWorld benchmark tasks. The paper was published as a NeurIPS 2023 Spotlight.
## Installation & Usage

```shell
# Install
pip install git+https://github.com/SwiftSage/SwiftSage.git

# Quick run
swiftsage --problem "How many letter r are there in 'My strawberry is red.'?" \
    --api_provider Together \
    --swift_model_id meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo \
    --feedback_model_id meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo \
    --sage_model_id meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
```
## Key Configuration Parameters

| Parameter | Default | Description |
|---|---|---|
| `--api_provider` | `Together` | API provider (`Together`, `SambaNova`, `Groq`) |
| `--max_iterations` | 5 | Maximum number of Swift Agent retries |
| `--reward_threshold` | 8 | Success reward threshold (1-10) |
| `--swift_temperature` | 0.5 | Swift model sampling temperature |
| `--start_with_sage` | `False` | Skip Swift and use Sage directly |
## Python API

```python
from swiftsage.agents import SwiftSage

s2 = SwiftSage(
    dataset, embeddings, prompt_template_dir,
    swift_config, sage_config, feedback_config,
    use_retrieval=False, start_with_sage=False,
)
reasoning, solution, messages = s2.solve(
    problem="Solve 3x^2 + 7.15x + 4 = 0",
    max_iterations=10, reward_threshold=8,
)
```
## Project Versions
- V2 (beta): current development version, still being refined
- V1 code: located on the `science_world` branch, corresponding to the NeurIPS 2023 paper implementation
## Module Structure

```
swiftsage/
├── agents/
│   ├── swiftsage.py       # Main coordinator
│   ├── swift_agent.py     # Swift Agent
│   ├── sage_agent.py      # Sage Agent
│   └── feedback_agent.py  # Feedback Agent
├── utils/
│   ├── LLMClient          # LLM API wrapper
│   └── PythonExecutor     # Code execution sandbox
└── prompt_templates/      # Jinja prompt templates
```
## Core Contributors
Bill Yuchen Lin, Yifan Song, et al.