TheAgentCompany
✨ A benchmark platform for evaluating large language model (LLM) agents on real-world professional tasks in a simulated software company environment, featuring diverse task roles and a comprehensive scoring system.
Python, Docker, Multi-Agent System