- Are the GPTs launched by OpenAI considered Agents? Why have they made half a year of work on many AI Agent projects go to waste?
- Many people say that GPTs are not true AI Agents, so why are they called Agent killers?
- More than 20,000 GPTs were created during the 100+ hours of internal conflict at OpenAI. Will they really become the killers of AI Agents?
- What exactly are GPTs? How do they differ from AI Agents? Why is it said that they will kill AI Agents?
- Why do the “quasi-Agent” GPTs have such a significant impact on true AI Agents? Will they really kill AI Agents?
Claiming that GPTs will kill AI Agents is somewhat alarmist; the future ecosystem of Agents is bound to flourish.
The full text is approximately 6,200 words; reading time is about 10 minutes. Written by Wang Jiwei.
The internal conflict at OpenAI has reached a conclusion, but the echoes of the dramatic events still linger.
The ups and downs of the plot, superb performances, and cameo appearances by Silicon Valley moguls have created a historical drama of AGI development that is worthy of record. The characters in the drama have been given more legendary colors, such as Sam Altman, who was once seen as the Steve Jobs ousted by the board, and Ilya Sutskever, who has been labeled as the “guardian of AGI”.
The true root of this internal conflict is still a matter of speculation. One of the explanations considered most likely is that GPT has developed into an AI that could threaten humanity, prompting “guardian” Ilya to intervene and use various means to keep OpenAI from accelerating its growth at significant risk.
Sam wants to push OpenAI forward through commercial means, while Ilya wants to ensure that AI develops in a controllable manner under regulation.
Both are believers in AGI, but their development philosophies have sharply conflicted after OpenAI’s first developer conference. Sam is quite radical, while Ilya is overly cautious, leading to a conflict that may have been ignited by GPTs.
Since Sam’s dismissal, everyone has been paying attention to the internal conflict at OpenAI. During these days, GPTs have continued to develop at an astonishing speed, with their number exceeding 20,000. The extremely low barrier to creation and a business model similar to the App Store will undoubtedly allow OpenAI to quickly build an ecosystem of GPTs.
On the other hand, these GPTs still have many problems. In terms of security, for instance, it is said that 99% of GPTs are running without protection, and a few sentences of prompting can extract their knowledge base files. If these GPTs run on large language models that pose a potential threat to humanity, the consequences are unimaginable.
Of course, these are still speculations and not the focus of this article.
In fact, since the launch of GPTs, the venture capital community has been more upset about why OpenAI wants to build both the underlying technology and the applications on top of it. This has directly killed a considerable number of GPT-based Agent projects, many of which are what Sam referred to as “simple imitations and shell companies of OpenAI”.
Regardless of whether these projects are imitating OpenAI, the launch of GPTs and the Assistants API has indeed had a significant impact on third-party frameworks and tools for building Agents, with some people even regarding LangChain and LlamaIndex as worthless.
Interestingly, some people do not consider GPTs to be true Agents because most current GPTs are merely chatbots that perform specific functions. How can such things replace or kill well-structured, powerful independent Agents?
So, do GPTs count as Agents? Does the launch of GPTs really mean that the Agent products and open-source projects that developers have been building for months will die? Do GPTs really have the capability to kill all AI Agents?
In this article, Wang Jiwei’s channel will discuss these topics.
Starting with GPTs
OpenAI officially defines GPTs as custom versions of ChatGPT that users create for specific purposes.
Anyone can create customized GPTs for daily life, for specific tasks, or to gain convenience and efficiency at work or at home, such as helping children with math, designing stickers, learning board games, searching for resources, or analyzing data. Companies can also create GPTs for internal use.
Additionally, users can share the GPTs they create so that more people can use them and improve efficiency in various scenarios. For more details about GPTs, see the article “Introducing GPTs” on the official OpenAI blog.
Building a GPT is also very simple; no coding is required. You just need to converse with the GPT Builder (the GPT creation tool launched by OpenAI) and provide it with instructions and other knowledge, then select the operations that the GPT can perform, such as searching the web, creating images, analyzing data, and a GPT is created.
GPTs can do many things, such as learning the rules of board games, helping children learn, or designing stickers. Users can also connect GPTs to external services to access more information and functionality. For example, by connecting to a database, a GPT can obtain real-time data for analysis; by connecting to a translation API, it can communicate in multiple languages.
To let users experience the charm of GPTs, OpenAI has launched 16 different GPTs that users can directly use. When building GPTs, users can also choose whether to use DALL-E for image generation or a code interpreter.
The 16 GPTs are as follows:
- DALL·E GPT: Turn your imagination into images.
- Data Analysis: Input any file to help analyze and visualize your data.
- ChatGPT Classic: The latest version of GPT-4, without additional features.
- Game Time: Quickly explain board games or card games to players of any age.
- The Negotiator: Help you advocate for yourself and achieve better outcomes, becoming an excellent negotiator.
- Creative Writing Coach: Eager to read your work and provide feedback to improve your skills.
- Cosmic Dream: A visionary digital wonder painter.
- Tech Support Advisor: Step-by-step assistance from setting up printers to troubleshooting devices.
- Coloring Book Hero: Turn any idea into whimsical coloring book pages.
- Laundry Buddy: Answer any questions about stains, settings, sorting, and everything laundry-related.
- Sous Chef: Provide recipes based on your favorite foods and available ingredients.
- Sticker Whiz: Turn your wildest dreams into die-cut stickers delivered right to your door.
- Math Mentor: Help parents assist their children in learning math.
- Hot Mods: Modify your images into something truly wild.
- Mocktail Mixologist: Create non-alcoholic cocktail recipes with any ingredients you have on hand, making any party shine.
- genz 4 meme: Help you understand slang and the latest memes.
The launch of these different GPTs by OpenAI not only showcases the technical strength of the GPT model but also signifies that personalized AI assistants will become an indispensable part of our daily lives, meeting our unique needs and interests in the future.
From the various GPTs that have already been launched, some, like those using Zapier plugins, can handle slightly more complex business processes, but most GPTs are still just chatbots and cannot execute complex tasks.
So, do GPTs count as Agents?
Examining GPTs from the Definition and Architecture of Agents
After the OpenAI developer conference, Bill Gates published a blog post titled “AI is about to completely change how you use computers,” which quickly went viral in China and abroad.
In this article, he described three main ways in which Agents differ from bots (like Clippy):
- They proactively propose solutions based on user needs;
- They can complete tasks across applications;
- They improve over time.
By these criteria, aside from some GPTs that can take part in business processes (such as those using Zapier plugins to call the APIs of CRM, HR, and other business applications), most current GPTs are just chatbots like ChatGPT.
This is understandable, as the purpose of GPTs is to customize a unique ChatGPT for users, and many people’s needs may simply be to generate content through conversation.
However, the Actions option available when creating GPTs has given some GPTs the ability to execute tasks, making them much more powerful than ordinary bots and able to connect to parts of the real world.
We can also place GPTs into the currently recognized ideal Agent framework proposed by OpenAI, which consists of “LLM + Planning + Memory + Tools”.
Measured against this framework, most GPTs have not yet reached the standard of AI Agents in terms of tool usage. Their builders have only uploaded a knowledge document in the “knowledge” section, so they merely function as chatbots that retrieve knowledge related to that document, without using any tools.
GPTs of this type can only reason over the input instructions and return text, images, and other content to the user; they cannot act to achieve a goal, such as operating software to complete a task.
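To make the “LLM + Planning + Memory + Tools” framework more concrete, here is a minimal Python sketch of how the four parts might fit together. It is only an illustration under stated assumptions: `fake_llm_plan` is a hypothetical stand-in for a real LLM planner, `search` and `summarize` are stub tools, and none of the names come from OpenAI or any real library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A "tool" is any callable that takes a string and returns a string.
Tool = Callable[[str], str]


def fake_llm_plan(goal: str) -> List[Tuple[str, str]]:
    """Hypothetical stand-in for the LLM planner.

    Maps a goal to a list of (tool_name, argument) steps. A real Agent
    would ask a language model to produce this plan.
    """
    if "weather" in goal.lower():
        return [("search", "weather in Beijing"), ("summarize", "")]
    return [("summarize", goal)]


@dataclass
class Agent:
    tools: Dict[str, Tool]                            # Tools
    memory: List[str] = field(default_factory=list)   # Memory

    def run(self, goal: str) -> str:
        plan = fake_llm_plan(goal)                    # LLM + Planning
        for tool_name, arg in plan:
            tool = self.tools[tool_name]
            # An empty argument means "use the previous step's result".
            tool_input = arg or (self.memory[-1] if self.memory else goal)
            result = tool(tool_input)
            self.memory.append(result)                # remember each result
        return self.memory[-1]


def search(query: str) -> str:
    """Stub tool: a real one would call a search API."""
    return f"[stub result for '{query}']"


def summarize(text: str) -> str:
    """Stub tool: a real one would ask the LLM to summarize."""
    return f"Summary: {text}"


agent = Agent(tools={"search": search, "summarize": summarize})
answer = agent.run("What is the weather today?")
print(answer)
```

In this sketch, a GPT that only has an uploaded knowledge document corresponds roughly to an Agent whose `tools` dictionary contains nothing but a retrieval step: it can answer, but it cannot act.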
In fact, the GPT Builder used to create GPTs is a standard Agent. After users submit their requirements, the GPT Builder sets goals and breaks down tasks, guiding users step by step to complete the construction of GPTs, even generating logos automatically based on instructions.
GPTs showcase relevant Agent functionalities and confirm the feasibility of Agents connecting to the real world. These GPTs can connect to other products and services, from emails to shopping websites, allowing AI to perform a wider range of tasks.
OpenAI has made more people aware of what an AI Agent is through GPTs, leading some to refer to GPTs as the pioneers of the next wave of artificial intelligence.
So far, most GPTs lack the level of autonomy that users expect and do not reach the level of Autonomous Agents. In fact, even Sam Altman has not claimed that GPTs are true Agents; he used the term “Precursors” at the developer conference to indicate that GPTs belong to the “early forms” of Agents.
Thus, in discussions comparing GPTs and AI Agents, GPTs are often described as “almost becoming Agents” or “quasi-AI Agents”.
There is still a gap between “almost” and “is”.
What are the differences between GPTs and Agents, especially Autonomous Agents?
Differences Between GPTs and AI Agents
Among the Agent projects that people claim GPTs will kill, some, such as BabyAGI, MetaGPT, and Aiagent, clearly exhibit in operation the characteristics that a qualified Agent should have. In other words, they are considerably more capable than GPTs.
After the OpenAI developer conference, LangChain emphasized its differences from GPTs and its own advantages in a tweet, and on November 10 it launched an open-source project called OpenGPTs.
By integrating LangServe and LangSmith, this project aims to give users an experience similar to OpenAI’s GPTs. Whereas OpenAI’s GPTs can only be built on GPT models, OpenGPTs lets users choose among different language models, customize tools, and control prompts, giving them more flexible control over their chatbots.
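The idea of choosing among different language models while keeping the same prompt can be sketched in a few lines of Python. This is not the OpenGPTs API; `ChatModel`, `EchoModel`, `UpperModel`, `MODEL_REGISTRY`, and `build_bot` are hypothetical names invented for illustration, and the two “models” are trivial stubs standing in for different vendors’ LLMs.

```python
from typing import Callable, Dict, Protocol


class ChatModel(Protocol):
    """Minimal interface that any backing model must satisfy (assumed)."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stub standing in for one vendor's LLM."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperModel:
    """Stub standing in for another vendor's LLM."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


# The user picks a model by name; the bot code does not care which one.
MODEL_REGISTRY: Dict[str, ChatModel] = {
    "echo": EchoModel(),
    "upper": UpperModel(),
}


def build_bot(model_name: str, system_prompt: str) -> Callable[[str], str]:
    """Build a chatbot from a chosen model and a user-controlled prompt."""
    model = MODEL_REGISTRY[model_name]

    def bot(user_message: str) -> str:
        return model.complete(f"{system_prompt}\n{user_message}")

    return bot


bot_a = build_bot("echo", "You are a laundry assistant.")
bot_b = build_bot("upper", "You are a laundry assistant.")
print(bot_a("hi"))
print(bot_b("hi"))
```

The design point is that the prompt and the bot logic stay the same while the model behind them is swapped, which is the flexibility a single-vendor platform does not offer.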
From the current performance of GPTs and the functionalities of “independent” AI Agents, the two have the following differences:
1. GPTs are still in the trial phase.
Although the number of GPTs collected by GPTs Hunter (a third-party GPT Store project) has exceeded 21,000, most GPT products are still relatively primitive.
In their current form, GPTs are well suited to sharing, but their functionality is still limited. They remain at the trial stage of personal entertainment and simple applications and are not yet suitable for widespread enterprise use.
2. Technical stack limitations.
GPTs are built on the large language model GPT-4 and are closely tied to OpenAI’s ecosystem, which means that the overall technical stack available to developers is somewhat limited.
GPT is not open-source, so GPTs can only be built on GPT models, with no option to choose other LLMs. Additionally, the current version has some usage limits, such as a maximum of 10 uploaded data files.
3. Varying skill levels of GPT builders.
The builders of “independent” AI Agents are mainly developers, while most GPT builders are business personnel who do not code. Currently, most GPTs are built from simple prompt instructions. This has driven a rapid increase in the number of GPTs, but it also makes them much less professional and better suited to personal entertainment or simple business processes.
Programmers can also use the Assistants API to build more functional and professional GPTs, which are more likely to become enterprise-level Agent applications.
4. Tasks that can be handled and the capabilities of GPTs.
Currently, AI Agents are becoming various types of intelligent assistants that can be used for ordering food, booking flights, and programming, among other relatively complex business processes. GPTs also have many different uses, such as personal trainers, teachers, consultants, etc., but most are still chatbots.
They resemble various role-playing AIs, where people can customize various roles for entertainment or to handle simple tasks, such as generating various texts and using DALL-E to create images.
Of course, GPTs can participate in some business processes, such as accessing calendars or Slack through Zapier GPT, but they still cannot delve into complex processes in enterprise operations, such as those in SAP, Yonyou, or Kingdee.
This is mainly because much enterprise management software lacks APIs, some API authorization fees are too high, and APIs are not always stable.
5. Technical and security challenges
Current AI Agents are often criticized for being unreliable, which has prevented them from achieving large-scale enterprise applications. GPTs face the same issues, such as hallucinations, providing different results under the same prompt, and not truly understanding the underlying processes, leading to random outcomes.
Beyond the issues with the large models themselves, the bigger problem with GPTs is data security. It is said that 99% of GPTs currently run without protection, and a few sentences of prompting can extract a GPT’s knowledge base files. These issues will make enterprises more cautious when selecting GPTs.
Theoretically, more advanced models or products built around Agents can compensate for the lack of reliability. For example, the RPA Agent launched by Real Intelligence has made significant efforts in data security, implementing multiple security mechanisms in both the large language model and the RPA toolkit to ensure safer use of AI Agents for users.
6. Lack of product attributes in early-stage GPTs
So far, GPTs lack clear product characteristics, or rather, a clear way to do business with them. Without product-grade qualities, they will struggle to meet enterprise requirements for security, application, data, scalability, and solutions, which makes them difficult to roll out within enterprises.
Moreover, GPTs are only available for paid ChatGPT users and enterprise users, limiting access for more people, and there is no pricing strategy or product tier differentiation options. Perhaps these will have to wait until the official launch of the GPT Store. Currently, the internal conflict at OpenAI has just concluded, and when the GPT Store will launch remains a mystery.
Will GPTs kill AI Agents?
Although the GPTs launched by OpenAI are not yet mature AI Agents, or are in the early stages of Agents, they undoubtedly respond to a trend that Agents will be ubiquitous. The form of large language model products like GPTs will allow everyone to use Agents, which is its greatness.
Once the GPT Store is launched, GPTs will live on everyone’s phones, tablets, and other devices (such as the recently popular Ai Pin) as communication, entertainment, and office products, just like today’s apps.
Currently, GPTs are still quite primitive, with most being customized chatbots aimed at specific functions, such as psychological counseling, product descriptions, text and image generation, etc.
However, from a business process perspective, many business departments in enterprises, such as marketing, customer support, new media, HR, and legal, have most of their processes involving text, voice interaction, and generation. The application of GPTs is sufficient to complete most tasks, and under safe and compliant conditions, these departments will find GPTs very suitable.
If simple GPTs can handle various business scenarios in enterprise operations, is there still a need to expend energy and resources to create so-called professional standalone Autonomous Agents? At the same time, is the SaaS-based development of GPTs more convenient and efficient than programmers building professional Agents with code?
Currently, GPTs cannot intervene in the complex processes of enterprise operations, but we have also seen integrations with email, travel websites, and payment software through plugins like Zapier, which can already operate some business processes.
The internal application of GPTs in enterprises is another topic that needs exploration. Wang Jiwei’s channel will briefly discuss this.
Some enterprises are already building and sharing GPTs internally, customizing ChatGPTs for different business scenarios. For example, companies like Amgen, Bain, and Square have already begun applying their proprietary GPTs. However, it remains unclear whether these enterprise GPTs are used for content generation and understanding or for deep business operations.
Various plugins and applications called through APIs fall under tool applications in OpenAI’s Agent architecture. These tools can range from simple email list reading to complex CRM, OA, workflow orchestration, and management.
OpenAI has not yet introduced more heavyweight tools, but one of its investments, an RPA company called Induced AI, has a product in the form of “RPA 3.0” based on GPT.
To speculate boldly, this product may become one of the many tools in OpenAI’s Agent architecture in the future, either as a plugin or in another form, potentially making up for GPTs’ inability to operate tools that lack APIs when executing business processes.
If Induced AI can achieve this, other RPA vendors can do the same. As more RPA vendors launch corresponding plugins, using GPTs to operate more complex processes in organizational operations will no longer be a dream. This is especially true now that the Assistants API has made it simpler than ever to turn existing products into GPT-based Agents.
Wang Jiwei’s channel believes that theoretically, with the combination of APIs and RPA, GPTs can reach all corners of organizational operations. It depends on how enterprises measure its operational effectiveness and whether it can withstand security tests.
Given the above points, GPTs could indeed become Agent killers. At the very least, they have made the road harder for many third-party Agents based on GPT-4.
Fortunately, OpenAI is not the only LLM vendor.
The AI Agent ecosystem is not solely dominated by OpenAI
Today, when we talk about Agents, we mean Agents built on and supported by LLMs.
Regarding the future ecosystem of AI Agents, Bill Gates believes that no single company will dominate the AI Agent business, but rather many different AI engines will be available.
More competition will make intelligent agents, including GPTs, very cheap, benefiting more people who use AI Agents.
There are so many large language models globally, with over 200 in China alone. Since OpenAI can create GPTs, other LLM vendors can naturally launch similar products or collaborate with third-party platforms to release similar products.
Therefore, GPTs will not only be born at OpenAI; tech giants like Google and Meta will certainly hope that their customers develop GPT-like products and more complete Agent products based on their own large models.
During the internal conflict at OpenAI, companies like Amazon and Meta received more AI-related inquiries; inquiry volume at OpenAI’s competitor Cohere also increased significantly; enterprise clients’ interest in Writer’s services doubled; and Writer’s CEO, May Habib, has been promoting its AI system as better than the GPT-3.5 model in certain scenarios.
This internal conflict has indeed had a significant impact on AI technology procurement. As Yoav Shoham, co-founder of AI21, put it, what happened at OpenAI has convinced more enterprises that they do not want to put all their eggs in one basket.
As for the Chinese market, not only can it not use overseas large models like GPT, but considerations of trustworthiness and innovation will also create more diversified demands, leading to the emergence of more distinctive GPT-like products.
Moreover, relying solely on GPT as a large language model cannot meet users’ extensive demands for GPTs. In the future, many GPTs may need to develop more features and functionalities outside of OpenAI, and developers will build more complex products around GPTs.
From this perspective, LLM vendors and Agent vendors may strive to adapt to more large language models, and it is not impossible that OpenAI will also incorporate third-party LLMs into its product system to support users in building various types and functionalities of GPTs.
In fact, for AI Agents to be deployed at scale in business scenarios and commercialized in the enterprise (B-end) market, several things must be considered together: their security, the maturity of the technology development cycle, how closely they fit B-end scenarios, and factors such as interface costs, privacy, management, and authorization.
This is both a technical and product threshold for many suppliers and an important basis for enterprises in their selection process.
When enterprises choose AI Agents for business process automation, they will prioritize AI Agent products launched by technology suppliers rather than opting for immature single-agent solutions that connect various plugins through APIs offered by LLM vendors.
These are capabilities that the current single-agent GPTs cannot possess. As for when GPTs will develop into mature intelligent agent products, it depends on how OpenAI works on the enterprise user side.
In Wang Jiwei’s view, GPTs have indeed stifled some Agent-related startups, but most are projects that Sam Altman referred to as “shells and imitations of GPT”. For AI Agents, GPTs currently do not exhibit killer-level strength, and they cannot kill those complex types of Agents built for proprietary functions.
The emergence of GPTs has instead inspired more innovation among enterprises, leading to a massive explosion of Agent products in the short term, rapidly constructing and improving the AI Agent ecosystem.
Perhaps leading the prosperous ecosystem of Agents with the GPTs paradigm to achieve AGI sooner is what OpenAI aims to accomplish.
Note: Wang Jiwei’s channel focuses on AIGC and IoT, specializing in digital transformation, business process automation, and RPA. Public account ID: jiwei1122.