AI Agents Pioneer CAMEL: The First Large Model Multi-Agent Framework

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, whose members include NLP master's and doctoral students, university faculty, and industry researchers.
The community's vision is to promote exchange and progress between academia and industry in natural language processing and machine learning worldwide, especially the progress of beginners.
Reprinted from | Jiangmen Venture Capital
AI Agents are one of the most discussed topics in the large-model field: a user can bring multiple LLM agents playing different roles into a real task, where the agents interact dynamically through competition and cooperation and produce remarkable collective-intelligence effects. This article introduces CAMEL, a communicative multi-agent framework from a KAUST research team. It is one of the earliest well-known autonomous-agent projects built on ChatGPT and has been accepted at the top AI conference NeurIPS 2023.
Paper Title:
CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society
Paper Link:
https://ghli.org/camel.pdf
Code Link:
https://github.com/camel-ai/camel
Project Homepage:
https://www.camel-ai.org/
“What magical trick makes us intelligent? The trick is that there is no trick. The power of wisdom comes from our tremendous diversity, not from any single, perfect principle.” — AI pioneer Marvin Minsky [1]
Large language models (LLMs), represented by ChatGPT, are arguably a milestone on machines' path toward advanced intelligence, achieving remarkable results on complex tasks across many fields through chat-style human-computer interaction. As LLMs develop, frameworks for interaction between AI agents are emerging. In complex professional domains in particular, agents preconfigured for role-playing are fully capable of standing in for human users in a task. Meanwhile, the dynamic cooperation and competition between agents often produce unexpected results, which researchers such as OpenAI's Andrej Karpathy regard as "the most important frontier research direction towards AGI."
The timeline of the development of this field is as follows [2]:
  • “CAMEL” (communicative multi-agent role-playing framework) – released on 2023.3.21
  • “AutoGPT” – released on 2023.3.30
  • “BabyAGI” – released on 2023.4.3
  • “Westworld” simulation – released on 2023.4.7
As one of the earliest well-known autonomous-agent projects built on ChatGPT, CAMEL focuses on a novel cooperative agent framework called role-playing, which can effectively mitigate the errors that arise during agent dialogues and thus guide agents to complete a variety of complex tasks. A human user only needs to input an initial idea to set the whole process in motion. CAMEL has been accepted at the top international AI conference NeurIPS 2023.
The authors designed the CAMEL framework around flexible, modular components, including implementations of different agents, prompt examples for various professional domains, and frameworks for exploring AI-generated data. This makes CAMEL a foundational agent backend that helps AI researchers and developers more easily build applications in multi-agent systems, cooperative AI, game-theory simulation, social analysis, and AI ethics. Specifically, through two role-playing cooperation scenarios, the authors generated two large instruction datasets, AI Society and AI Code, as well as two single-turn question-answering datasets, AI Math and AI Science, to explore the emergent capabilities of LLMs.

1. CAMEL Framework

The following image shows the role-playing framework in CAMEL, where human users first need to formulate an idea or goal they want to achieve, for example: developing a trading robot for the stock market. The roles involved in this task are the AI assistant agent (playing the role of a Python programmer) and the AI user agent (playing the role of a stock trader).
[Figure: overview of the CAMEL role-playing framework]
The authors first equip CAMEL with a task specifier, which expands the input idea into a more detailed sequence of implementation steps. The AI assistant agent (AI Assistant) and the AI user agent (AI User) then collaborate through chat to complete the specified task step by step. Their collaboration is driven by a system-level message-passing mechanism: P_A is the system message passed to the AI assistant agent, and P_U is the system message passed to the AI user agent. The two roles are then instantiated with two ChatGPT models F_1 and F_2, yielding the AI assistant agent A ← F_1(P_A) and the AI user agent U ← F_2(P_U).
Once the roles are assigned, the AI assistant agent and the AI user agent collaborate to complete the task in an instruction-following manner. Let I_t denote the instruction message issued by the user agent at time t and S_t the solution returned by the assistant agent; the set of conversation messages accumulated up to time t is then:

M_t = {(I_0, S_0), ..., (I_t, S_t)} = {(I_i, S_i)}, i = 0, ..., t
At the next time step t+1, the AI user agent generates a new instruction I_{t+1} based on the conversation history M_t. The new instruction message is then passed, together with M_t, to the AI assistant agent, which produces the solution for the next step:

I_{t+1} = U(M_t),  S_{t+1} = A(M_t, I_{t+1})
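The message-passing loop above can be sketched in plain Python. The `StubAgent` class below stands in for a real ChatGPT-backed agent; its canned replies and all names here are illustrative, not the framework's actual API.

```python
# Minimal sketch of the CAMEL role-playing loop. StubAgent stands in for a
# real ChatGPT-backed agent; its canned replies are illustrative.

class StubAgent:
    def __init__(self, system_message, replies):
        self.system_message = system_message  # P_A or P_U
        self._replies = iter(replies)

    def step(self, history):
        # A real agent would send system_message + history to an LLM.
        return next(self._replies)

def role_play(user_agent, assistant_agent, rounds):
    history = []  # M_t: the accumulated (instruction, solution) pairs
    for _ in range(rounds):
        instruction = user_agent.step(history)                    # I_{t+1} = U(M_t)
        solution = assistant_agent.step(history + [instruction])  # S_{t+1} = A(M_t, I_{t+1})
        history.append((instruction, solution))
    return history

user = StubAgent("You are a stock trader.",
                 ["Install the required libraries.", "Fetch the latest price."])
assistant = StubAgent("You are a Python programmer.",
                      ["pip install yfinance", "def latest_price(): ..."])
transcript = role_play(user, assistant, rounds=2)
```

Each round mirrors the two equations above: the user agent reads the history and emits an instruction, and the assistant agent reads the history plus that instruction and emits a solution.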
For more technical details, please refer to our previous report on CAMEL.

2. CAMEL Usage Examples

2.1 Cooperative Role-Playing

The built-in cooperative role-playing framework in CAMEL can complete complex tasks through collaboration between agents without requiring human users to possess specialized knowledge. The following image shows an example of CAMEL developing a trading robot for the stock market, where the role played by the AI assistant agent is a Python programmer, and the AI user agent plays the role of a stock trader.
[Figure: cooperative role-playing dialogue for developing a stock-market trading bot]
In the role-playing framework, the AI agents possess the relevant domain expertise, so we only need to supply a prompt with the initial idea and the two agents will work toward it. In the image above, the user agent proposes that the trading bot needs a sentiment-analysis function for stock comments, and the assistant agent directly provides a script that installs the Python libraries required for sentiment analysis and stock trading.
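As an illustration of the sentiment-analysis step the agents are discussing, a minimal lexicon-based scorer could look like the sketch below. This is not the code the agents produced; the word lists are illustrative placeholders, and a real pipeline would use an NLP library such as NLTK's VADER.

```python
# Toy lexicon-based sentiment scorer for stock comments. The word lists are
# illustrative placeholders; a real pipeline would use an NLP library
# (e.g. NLTK's VADER) instead of hand-picked keywords.

POSITIVE = {"buy", "bullish", "up", "gain", "strong"}
NEGATIVE = {"sell", "bearish", "down", "loss", "weak"}

def sentiment_score(comment: str) -> int:
    """Positive minus negative word counts; > 0 suggests bullish sentiment."""
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
```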
[Figure: the assistant agent generates code for fetching the latest stock price]
As the task progresses, the instructions given by the user agent become more and more specific. The instruction in the image above is to define a function that uses the Yahoo Finance API to obtain the latest stock price, and the assistant agent generates code that satisfies it.
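A function along these lines might look like the following sketch. The network call via the third-party yfinance package is left commented out, and `latest_close` is a hypothetical helper introduced here so the logic can be shown without network access; this is not the assistant agent's actual output.

```python
def latest_close(closes: list) -> float:
    """Return the most recent closing price from a chronological price list."""
    if not closes:
        raise ValueError("no price data available")
    return float(closes[-1])

# Fetching real data (requires `pip install yfinance` and network access):
# import yfinance as yf
# closes = yf.Ticker("AAPL").history(period="5d")["Close"].tolist()
# print(latest_close(closes))
```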
2.2 Embodied Agents
In earlier studies, AI agents mostly simulated operations rather than interacting with the real world or invoking external tools. Current LLMs can interact with the internet and with tool APIs, and CAMEL accordingly provides embodied agents that can act in the physical world: they can browse the internet, read documents, create image, audio, and video content, and even execute code directly.
[Figure: an embodied agent generates an image of a camel family with Stable Diffusion]
The above image shows an example of CAMEL generating an image of a camel family using the Stable Diffusion toolchain provided by HuggingFace through an embodied agent. In this process, the embodied agent first infers all the animals contained in the camel family, then calls the diffusion model to generate the image and save it.
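The embodied agent's infer-then-generate pattern can be sketched as a small dispatch loop. Everything below is a stub: `infer_subjects` stands in for LLM reasoning, and `fake_diffusion` stands in for the HuggingFace Stable Diffusion pipeline the real agent calls.

```python
# Sketch of an embodied agent's act loop. The "tools" here are stubs: a real
# agent would use an LLM to enumerate subjects and a HuggingFace diffusion
# pipeline to render each one, then save the results to disk.

def infer_subjects(task: str) -> list:
    # Stand-in for LLM reasoning: "camel family" -> individual animals.
    if "camel family" in task:
        return ["adult camel", "baby camel"]
    return [task]

def fake_diffusion(prompt: str) -> bytes:
    return f"<image of {prompt}>".encode()  # placeholder for real pixels

def embodied_agent(task: str) -> dict:
    # Generate one image per inferred subject and collect them by name.
    return {subject: fake_diffusion(subject) for subject in infer_subjects(task)}

images = embodied_agent("draw a picture of a camel family")
```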
2.3 Critic-in-the-Loop
To improve the controllability of the role-playing framework, the author team also designed a critic-in-the-loop mechanism for CAMEL. Inspired by Monte Carlo Tree Search (MCTS), it can incorporate human preferences into a tree-search decision procedure for solving tasks. CAMEL can set up an intermediate critic agent that weighs the viewpoints put forward by the user agent and the assistant agent and makes the decisions needed to complete the final task, as the workflow below shows.
[Figure: the critic-in-the-loop workflow]
Consider a scenario where we let CAMEL host a very specific research project discussion meeting, with the research project theme being “large language models.” CAMEL can set the user agent’s role as a postdoctoral researcher, the assistant agent’s role as a doctoral student, and the intermediate evaluation agent’s role as a professor. The task instructs the doctoral student to help the postdoctoral researcher formulate a research plan, focusing on the ethics of large models.
[Figure: role assignment for the research-meeting scenario]
After receiving the task, the postdoctoral agent first puts forward three viewpoints for the project, proposing that it should begin by surveying related work on the ethics of large models. The professor agent then comments on the three viewpoints, judging viewpoint 2, studying discriminatory algorithms in large models, to be the most reasonable, and pointing out the weaknesses of the other two: viewpoint 1 lacks a clear structure, and viewpoint 3's research scope is too narrow.
[Figure: the doctoral-student agent's detailed project plan]
After the professor speaks, the doctoral student agent will carry out more specific project planning, such as directly listing relevant literature on the ethics and safety of large models, and discussing how to conduct specific research.
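The professor's role in the exchange above, scoring the candidate viewpoints and picking one to pursue, can be sketched as a selection step; an MCTS-style search would then expand the chosen branch further. The heuristic scorer and proposal strings below are deterministic stubs of my own; in CAMEL the critic is itself an LLM guided by prompts.

```python
# Sketch of the critic's selection step. The heuristic scorer is a stub;
# CAMEL's real critic is an LLM that judges proposals against the task.

def critic_score(proposal: str) -> int:
    # Stub heuristic: prefer proposals that mention concrete research assets.
    keywords = ("dataset", "benchmark", "algorithm")
    return sum(k in proposal for k in keywords)

def critic_select(proposals):
    """Return the proposal the critic scores highest."""
    return max(proposals, key=critic_score)

proposals = [
    "Broadly survey prior work on LLM ethics",
    "Audit a discriminatory algorithm on a public benchmark dataset",
    "Write a position piece on AI safety",
]
best = critic_select(proposals)
```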

3. Experimental Results

The performance evaluation in this paper covers three aspects, using two gpt-3.5-turbo instances as the experimental agents. The experiments use the four datasets generated by the CAMEL framework: AI Society and AI Code focus on the agents' dialogue quality, while AI Math and AI Science focus on their problem-solving ability.
3.1 Agent Evaluation
In this part, the authors randomly select 100 tasks from the AI Society and AI Code datasets for evaluation and run comparative experiments between the CAMEL framework and a single gpt-3.5-turbo model. The evaluation has two parts: on one hand, human subjects cast 453 votes on the solutions produced by the two methods to decide which is more feasible; on the other hand, the authors prompt GPT-4 to score the two solutions directly. The comparative data are shown in the table below.
[Table: human and GPT-4 evaluation of CAMEL versus a single gpt-3.5-turbo]
From the table above, it can be seen that the solutions provided by the CAMEL framework significantly outperform those given by gpt-3.5-turbo in both human evaluation and GPT-4 evaluation, with the overall trends of human evaluation and GPT-4 evaluation being highly consistent.
3.2 Using GPT-4 to Evaluate ChatBots
In this part, the authors fine-tuned the LLaMA-7B model stage by stage on the four datasets generated by CAMEL, continuously injecting knowledge from different fields (society, code, mathematics, and science) into the LLM and observing how well the model absorbed it. They started with the AI Society dataset, letting the model learn common sense about human interaction and social dynamics, then injected AI Code and the remaining datasets, giving the model knowledge of programming logic and syntax while broadening its understanding of scientific theories, empirical observations, and experimental methods.
[Table: performance of the progressively fine-tuned LLaMA-7B across task domains]
The table above reports the model's test performance on 20 society tasks, 20 code-writing tasks, 20 mathematics tasks, and 60 science tasks. Each newly added dataset improves performance on the task domains trained so far.
3.3 HumanEval
To further evaluate the code writing task-solving ability of the CAMEL framework, the author conducted experiments on the HumanEval and HumanEval+ evaluation benchmarks. The experimental results are shown in the table below.
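For context, HumanEval results are reported as pass@k, the probability that at least one of k sampled completions passes the unit tests. The standard unbiased estimator (introduced with HumanEval) for n samples with c correct ones is 1 - C(n-c, k)/C(n, k):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples drawn, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)
```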
[Table: results on the HumanEval and HumanEval+ benchmarks]
The table clearly demonstrates the excellent performance of the CAMEL framework, which not only far exceeds the LLaMA-7B model but also significantly surpasses the Vicuna-7B model. This indicates that the datasets generated by CAMEL have a unique effect in enhancing LLMs’ handling of coding-related tasks.

4. CAMEL AI Open Source Community

It is worth mentioning that the CAMEL author team is building a comprehensive CAMEL AI open-source community. The community's GitHub repository has received over 3,600 stars and covers the various agent implementations in CAMEL, data-generation pipelines, data-analysis tools, and the generated datasets, supporting research on AI agents and related topics. The community has already attracted many open-source enthusiasts to contribute code.
In the nine months since the first line of CAMEL code was written, the CAMEL-AI.org open-source research community has attracted more than 20 independent code contributors from KAUST, Cambridge, Sorbonne, NUS, CMU, the University of Chicago, Stanford, Duke, Peking University, ShanghaiTech, Harbin Institute of Technology, Xidian University, Northeastern University, Chengxin University, and industry. The community is looking for full-time, part-time, and intern contributors, engineers, and researchers to join in exploring how to push the boundaries of building intelligent agent societies. Outstanding contributors will have the opportunity to co-author and submit papers on the framework and other research projects.
If you are interested in joining the CAMEL-AI.org community, you can send your resume to [email protected] or add WeChat CamelAIOrg for consultation!
References
[1] Minsky M. Society of mind[M]. Simon and Schuster, 1988.
[2] https://towardsdatascience.com/4-autonomous-ai-agents-you-need-to-know-d612a643fa92
Illustration From IconScout By TanahAir Studio
About Us

The MLNLP community is a grassroots academic community jointly built by machine learning and natural language processing scholars in China and abroad. It has grown into a well-known community in the field, aiming to promote progress among academia, industry, and the many enthusiasts of machine learning and natural language processing.
The community offers an open exchange platform for practitioners' further education, employment, and research. Everyone is welcome to follow and join us.

