
JarvisArt

Added: Jan 27, 2026
Category: Agent & Tooling
Open Source
Tags: Python; Workflow Automation; Large Language Models; Multimodal; AI Agents; Agent & Tooling; Automation, Workflow & RPA; Computer Vision & Multimodal

JarvisArt is a multi-modal large language model (MLLM)-driven agent for intelligent photo retouching. It liberates human creativity by understanding user intent, mimicking professional artist reasoning, and coordinating over 200 tools in Adobe Lightroom.

One-Minute Overview

JarvisArt, accepted to NeurIPS 2025, is an intelligent photo retouching agent that controls 200+ professional tools through natural language. It allows users to perform professional-level photo editing by simply conversing with an AI agent, eliminating the need for expertise in complex editing software.

Core Value: Transforms complex professional photo editing into natural language interactions, dramatically lowering the barrier to professional retouching.

Getting Started

Installation Difficulty: Medium. Requires basic Python and machine learning knowledge, but the project provides a complete Gradio demo and an online demo.

# Gradio Demo setup
# For specific steps, please refer to the Gradio Demo section in the README

Is this suitable for my scenario?

  • ✅ Professional photographers/retouchers: Automate complex retouching workflows to improve efficiency
  • ✅ Photography enthusiasts: Achieve high-quality image adjustments without professional editing skills
  • ✅ Image researchers: Useful for research in image processing and editing algorithms
  • ❌ Commercial applications: Explicitly prohibited by the project license

Core Capabilities

1. Multi-granularity Retouching Control

  • Supports editing goals at multiple levels, from scene-level adjustments to region-specific refinements.
    Actual Value: Users control the editing scope and can balance global optimization against local adjustments.

2. Natural Language Interaction

  • Accepts intuitive, free-form edit requests through text prompts and bounding boxes.
    Actual Value: Lets users describe edits in natural language rather than apply professional retouching knowledge directly, lowering the barrier to use.
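
A request that combines a text prompt with a bounding box could be represented as follows. This is a minimal sketch: the `EditRequest` and `BoundingBox` names, the pixel-coordinate convention, and the clipping rule are assumptions made for illustration, not JarvisArt's actual input format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical region in pixels: (x0, y0) is top-left, (x1, y1) is bottom-right."""
    x0: int
    y0: int
    x1: int
    y1: int

    def clipped_to(self, width: int, height: int) -> "BoundingBox":
        # Keep the box inside the image so a sloppy selection stays valid.
        return BoundingBox(
            max(0, min(self.x0, width)),
            max(0, min(self.y0, height)),
            max(0, min(self.x1, width)),
            max(0, min(self.y1, height)),
        )

    def area(self) -> int:
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)


@dataclass(frozen=True)
class EditRequest:
    """Hypothetical request: a free-form instruction plus an optional region."""
    prompt: str
    region: Optional[BoundingBox] = None  # None means a global (scene-level) edit

    def scope(self) -> str:
        return "local" if self.region is not None else "global"


request = EditRequest(
    "Brighten the face and soften the shadows",
    BoundingBox(120, 80, 900, 400),
)
print(request.scope())
print(request.region.clipped_to(640, 480))
```

Leaving `region` unset models a scene-level edit, while supplying a box models a region-specific refinement, which mirrors the global and local distinction described in this listing.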

3. Professional Tool Coordination

  • Coordinates over 200 professional tools in Adobe Lightroom to execute retouching tasks.
    Actual Value: Gives access to professional-grade editing without the need to master complex Lightroom operations.

4. Innovative Training Framework

  • Uses a two-stage training framework: Chain-of-Thought supervised fine-tuning, followed by Group Relative Policy Optimization for Retouching (GRPO-R).
    Actual Value: Trains the model toward professional-level reasoning and decision-making.
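
The "group relative" idea in GRPO can be illustrated in a few lines. In generic GRPO, several candidate outputs are sampled for the same input, each is scored by a reward function, and each reward is normalized against the mean and standard deviation of its own group. The sketch below shows only that normalization step. The reward values are made up, and the retouching-specific reward design of GRPO-R is not described in this listing, so it is not modeled here.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against its own sampling group (generic GRPO step)."""
    mu = mean(rewards)
    # Population standard deviation is used here; implementations differ on
    # whether they use the population or the sample estimate.
    sigma = pstdev(rewards)
    # eps avoids division by zero when every candidate receives the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four candidate retouching plans sampled for one photo.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f} advantage={a:+.3f}")
```

Candidates that score above the group average receive a positive advantage and are reinforced, while those below it are discouraged, so no separate value model is needed to provide a baseline.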

5. Multi-scenario Adaptation

  • Supports multiple application scenarios, including global and local retouching.
    Actual Value: Covers different editing needs, from overall style adjustments to local detail refinements.

Technical Stack & Integration

Development Languages: Python (specific dependencies are not listed here; inspect the repository code to confirm them)

Key Dependencies: Multi-modal LLM frameworks, Adobe Lightroom integration protocols

Integration Method: API/SDK/Protocol. Provides the Agent-to-Lightroom Protocol for integration with Adobe Lightroom.
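
This listing does not describe the wire format of the Agent-to-Lightroom Protocol. As a rough illustration of the general pattern (an agent proposes parameter adjustments, which are validated and serialized before being handed to the editor), the sketch below uses a hypothetical JSON message. The function name `build_edit_message`, the parameter names, and the slider ranges are all assumptions, not part of JarvisArt or of any Adobe API.

```python
import json
from typing import Dict, Tuple

# Hypothetical slider ranges. The names and limits are illustrative assumptions,
# not taken from JarvisArt or from Adobe documentation.
PARAM_RANGES: Dict[str, Tuple[float, float]] = {
    "exposure": (-5.0, 5.0),
    "contrast": (-100.0, 100.0),
    "highlights": (-100.0, 100.0),
    "shadows": (-100.0, 100.0),
}


def build_edit_message(image_id: str, adjustments: Dict[str, float]) -> str:
    """Validate agent-proposed adjustments and serialize them as a JSON message."""
    clamped = {}
    for name, value in adjustments.items():
        if name not in PARAM_RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = PARAM_RANGES[name]
        # Clamp rather than reject, so an over-eager plan still yields a valid edit.
        clamped[name] = max(low, min(high, float(value)))
    return json.dumps({"image_id": image_id, "adjustments": clamped}, sort_keys=True)


message = build_edit_message("IMG_0001", {"exposure": 0.35, "shadows": 140})
print(message)
```

Validating at the boundary keeps the language model's output from driving the editor with out-of-range or unknown settings, which is one reason to place a protocol layer between an agent and its tools.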

Maintenance Status

  • Development Activity: Very active, with continuous releases from June to December 2025
  • Recent Updates: Released MMArt-Bench dataset and training scripts in December 2025
  • Community Response: Maintains WeChat discussion groups to collect user feedback

Commercial & Licensing

License: Apache License 2.0 (modified version)

  • ❌ Commercial Use: Prohibited (explicitly forbidden)
  • ✅ Modification: Allowed (under Apache 2.0 terms)
  • ⚠️ Restrictions: Any commercial application requires explicit written permission from the authors

Documentation & Learning Resources

  • Documentation Quality: Comprehensive
  • Official Documentation: https://github.com/LYL1015/JarvisArt
  • Example Code: Complete (inference code, training scripts, data scripts, evaluation code)
  • Tutorial Resources: Gradio Demo, online demo, Agent-to-Lightroom Protocol documentation, training guide
