AGDebugger: A Powerful Tool for Developing, Debugging, and Guiding Multi-Agent Systems

As AI technology develops rapidly, multi-agent AI systems are becoming a popular choice for solving complex tasks. Debugging these systems, however, is challenging. AGDebugger was built in this context: it gives developers an interactive way to debug that makes the behavior of multi-agent AI systems more transparent and controllable.

Hello everyone, I am Si Ling Qi. I found an interesting project called AGDebugger, an open-source project from Microsoft for debugging multi-agent AI systems. Picture a group of agents collaborating like a small team to complete a task: if something goes wrong, how do you find the problem and fix it? This is where AGDebugger comes in handy. It lets developers see step by step how the agents work and adjust their behavior at any time.

The Rise of Multi-Agent AI Systems and Debugging Challenges

Imagine you have a group of smart friends, each with their own specialty: some are good at searching the internet for information, some can write code, and others can handle files. Organized well, these friends can accomplish quite complex tasks, such as finding the latest data, analyzing file contents, or executing a series of actions. This is how multi-agent AI systems work: they divide the labor and collaborate, using large language models (LLMs) to plan and make decisions, and complete tasks step by step.

Debugging multi-agent AI systems requires reasoning about lengthy multi-turn dialogues, in which specialized agents use tools such as web browsing and code writing with the help of large language models. AGDebugger lets users interactively debug and guide multi-agent teams: they can reset agents to an earlier point in the workflow and edit messages to test hypotheses about the agents' behavior.

The trouble starts once these agents are working and exchanging many rounds of messages. If something goes wrong in the middle, such as an agent finding the wrong information or executing an incorrect plan, how can developers find the root of the problem? Traditional debugging methods, such as checking model training or correcting datasets, are not enough, because the key to a multi-agent system lies in the interactions between agents and in how they complete tasks through multiple calls to language models. This calls for a new kind of debugging tool, one that helps developers understand the dialogues between agents, locate problems, and adjust agent behavior in real time.

AGDebugger: A New Tool for Debugging Multi-Agent AI

AGDebugger is designed to address these issues. It has three particularly powerful features that make debugging multi-agent systems much simpler.

  • AGDebugger allows users to send new messages to agents at any time and view all messages exchanged between them. It’s like being able to interject at any time or review previous conversations to see how the agents arrived at their current state. For example, you can pause the agents’ work, send them new instructions, or see what they said before, making it easy to identify where the problem lies.

AGDebugger helps users debug and guide their agent teams interactively. Users can send new messages, control the flow of messages, and view the history of agent messages.

  • The most impressive feature of AGDebugger is that it allows users to “rewind” the agents’ dialogues back to a previous point in time and modify the messages there. It’s like being able to go back in time, change an agent’s decision, and see what happens next. For example, if you find that an agent’s plan is not detailed enough, you can go back to when it made the plan, add more specific instructions, and then rerun to see the results.

Users debug agent workflows by directly editing previous agent messages and restarting the workflow from that point. For example, they can add more specific instructions to a message to guide the agents towards the correct outcome.

  • AGDebugger also features an intuitive visual interface that displays the entire dialogue history and editing history. It’s like having a map that clearly shows how the agents have progressed and where you made modifications. For instance, you can see when each message was sent, who sent it, and where you made changes, making it easy to track the entire flow of the dialogue.

The interactive overview visually summarizes the content of agent dialogues. Each reset branches the current dialogue and creates a new dialogue session presented in a new column. Users can switch message colors to indicate message types, senders, or receivers. Hovering shows message details, and clicking navigates to the full message in the message history view.
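
To make the "rewind and edit" idea more concrete, here is a minimal sketch of a message history in which each reset forks the dialogue into a new session, mirroring the columns of the overview. The class and method names (`BranchingHistory`, `reset_and_edit`) and the agent names are hypothetical and chosen only for illustration; this is not AGDebugger's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    sender: str
    recipient: str
    content: str


@dataclass
class Session:
    """One column in the overview: a linear run of messages."""
    messages: List[Message] = field(default_factory=list)


class BranchingHistory:
    """Hypothetical model of reset-and-edit; not AGDebugger's actual code."""

    def __init__(self) -> None:
        self.sessions: List[Session] = [Session()]

    @property
    def current(self) -> Session:
        return self.sessions[-1]

    def send(self, sender: str, recipient: str, content: str) -> None:
        self.current.messages.append(Message(sender, recipient, content))

    def reset_and_edit(self, index: int, new_content: str) -> Session:
        """Fork the current session at `index`, replacing that message's content."""
        old = self.current.messages
        if not 0 <= index < len(old):
            raise IndexError(f"no message at index {index}")
        # Copy the messages before the edit point so the earlier session
        # stays untouched and can still be inspected.
        kept = [Message(m.sender, m.recipient, m.content) for m in old[:index]]
        target = old[index]
        kept.append(Message(target.sender, target.recipient, new_content))
        branch = Session(kept)
        self.sessions.append(branch)
        return branch


history = BranchingHistory()
history.send("user", "Orchestrator", "Find the latest release notes.")
history.send("Orchestrator", "WebSurfer", "Search the web.")
history.send("WebSurfer", "Orchestrator", "Found an outdated page.")

# Rewind to the orchestrator's instruction and make it more specific.
history.reset_and_edit(1, "Search the project's GitHub releases page only.")

for number, session in enumerate(history.sessions):
    print(f"session {number}: {[m.content for m in session.messages]}")
```

In this sketch the original session is kept intact and the edited run lives in a new one, which is what makes it possible to compare the two side by side. In the real tool the agents would then continue from the edited message; here the new session simply ends at the edit.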

How to Use AGDebugger

AGDebugger is not only powerful but also easy to use. It is an open-source project, and you can install and use it locally by following these steps.

First, you need to clone the AGDebugger code repository from GitHub and install the relevant Python packages. The specific steps are as follows:

# Clone the repository
git clone https://github.com/microsoft/agdebugger.git
cd agdebugger

# Install frontend dependencies
cd frontend
npm install
npm run build

# Install Python packages
cd ..
pip install .

Once the installation is complete, you can use AGDebugger to debug your multi-agent system. AGDebugger is built on AutoGen, and you need to provide a Python file that exposes a function to create an AutoGen AgentChat team for debugging. For example, here is a simple script that creates a team with a single WebSurfer agent:

# scenario.py
from autogen_agentchat.teams import MagenticOneGroupChat
from autogen_ext.agents.web_surfer import MultimodalWebSurfer
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def get_agent_team():
    # Model client shared by the team and the agent
    model_client = OpenAIChatCompletionClient(model="gpt-4o")

    # A single web-browsing agent
    surfer = MultimodalWebSurfer(
        "WebSurfer",
        model_client=model_client,
    )
    # The team that AGDebugger will load and debug
    team = MagenticOneGroupChat([surfer], model_client=model_client)

    return team

Then, you can start the AGDebugger interface with the following command:

agdebugger scenario:get_agent_team

Once in the interface, you can send a GroupChatStart message to start the agent dialogue and begin debugging!
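
The `scenario:get_agent_team` argument follows the common Python `module:function` convention: the part before the colon names a module, and the part after names a function in it. The sketch below shows one way such a string can be resolved with the standard library. It is an assumption about how a loader of this kind typically works, not a description of AGDebugger's own code, and `load_factory` and `demo_scenario` are hypothetical names. A stand-in module replaces the real scenario so the sketch runs without AutoGen installed.

```python
import asyncio
import importlib
import sys
import tempfile
from pathlib import Path


def load_factory(spec: str):
    """Resolve a 'module:function' string to the callable it names."""
    module_name, sep, func_name = spec.partition(":")
    if not sep or not module_name or not func_name:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# Stand-in scenario module (hypothetical), written to a temporary directory.
workdir = Path(tempfile.mkdtemp())
(workdir / "demo_scenario.py").write_text(
    "async def get_agent_team():\n"
    "    return 'stand-in team object'\n"
)
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

factory = load_factory("demo_scenario:get_agent_team")
team = asyncio.run(factory())  # the factory is async, like get_agent_team
print(team)
```

One practical consequence of this convention is that the scenario file must be importable from the directory where the command is run, and the named function must exist at module level.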

User Research: The Actual Performance of AGDebugger

To see how useful AGDebugger is in practice, the researchers ran a two-part user study. In the first part, 6 participants used AGDebugger to find two errors in failed agent runs. The results showed that AGDebugger was indeed useful: participants were able to find the problems faster, and most felt that it was much easier to use than traditional debugging methods.

In the second part of the study, 8 participants used AGDebugger to try to guide the agents to output correct answers. Although it was somewhat difficult to get the agents to output completely correct answers directly, AGDebugger helped them better understand the agents’ behavior patterns. For example, some participants found that the agents’ plans were not detailed enough, so they went back to the planning stage and added more specific instructions. Others found that the agents’ tasks were too complex, so they simplified the tasks, allowing the agents to first complete a simple sub-task. Through these methods, participants successfully guided the agents to the correct results.

In the second part of the user study, every participant used the message editing feature while debugging, and some participants edited messages five times each. The most common editing operation was adding more specific instructions to messages, followed by simplifying instructions and modifying the frequency of plan goals.

Improvements for AGDebugger

Although AGDebugger is already quite useful, there are still some areas for improvement. For example, some agent actions are irreversible, such as sending an email that cannot be recalled, which limits AGDebugger’s “rewind” feature. Additionally, to effectively guide the agents, developers need to have a deep understanding of the implementation details of the agents; otherwise, they may issue instructions that the agents cannot execute. Furthermore, sometimes after modifying a message, the changes in agent behavior may not be obvious, requiring developers to try multiple times to determine if the modification was effective.
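
The first limitation can be illustrated with a small sketch: before rewinding, check whether any of the steps that would be discarded had side effects that cannot be undone. The `Step` record, its `reversible` flag, and the agent names are hypothetical and used only for illustration; the source does not say that AGDebugger tracks reversibility this way.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    agent: str
    action: str
    reversible: bool  # False for side effects such as sending an email


def irreversible_steps_after(steps: List[Step], rewind_to: int) -> List[Step]:
    """Return the steps a rewind to `rewind_to` would discard but cannot undo."""
    if not 0 <= rewind_to <= len(steps):
        raise IndexError(f"rewind point {rewind_to} is out of range")
    return [s for s in steps[rewind_to:] if not s.reversible]


steps = [
    Step("Orchestrator", "write plan", reversible=True),
    Step("FileSurfer", "read report.txt", reversible=True),
    Step("Mailer", "send summary email", reversible=False),
    Step("Orchestrator", "report completion", reversible=True),
]

# Rewinding to step 1 would discard the email step, which already happened.
blocked = irreversible_steps_after(steps, rewind_to=1)
for step in blocked:
    print(f"warning: '{step.action}' by {step.agent} cannot be undone")
```

A check like this does not make the action reversible; it only tells the developer that replaying from the chosen point may repeat or conflict with something that has already happened in the outside world.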

Conclusion

AGDebugger provides an innovative solution for debugging and guiding multi-agent AI systems. By allowing users to edit messages exchanged between agents and explore counterfactual scenarios, AGDebugger helps developers gain deep insights into the collective behavior of agents and effectively adjust their interactions. As AI technology continues to evolve, interactive debugging tools like AGDebugger will play an increasingly important role in building more powerful and reliable multi-agent AI systems.

After reading this article, what are your thoughts? Feel free to leave a comment, and let’s discuss together. If you are already using the AutoGen framework to build agents, you can use AGDebugger directly. If you build your own agent applications with other frameworks, and iteratively developing and running agents is part of your work, the open-source code of AGDebugger (see the references at the end) is still a valuable resource that can make debugging agent applications much easier.

References

  • Interactive Debugging and Steering of Multi-Agent AI Systems: https://arxiv.org/pdf/2503.02068
  • GitHub, microsoft/agdebugger: https://github.com/microsoft/agdebugger

Note: This article was translated with AI assistance, and the content was organized/reviewed by humans.

I am Si Ling Qi🐝, an internet enthusiast who loves AI. Here, I share my observations and thoughts, hoping my explorations can inspire those who also love technology and life, bringing you inspiration and reflection.
