Insights on Patents: AI Agent Technology Innovations and Patent Strategy


2025 is widely regarded as the inaugural year of “AI Agents”, and the commercialization of these products is accelerating. Against this backdrop, we need to understand what “AI Agents” are, the emergent “creativity” they display, their application scenarios, and their likely development trends. We also need a clear patent strategy for technological innovations built on “AI Agents”, so that the company is well supported in future competition.


After the startup Monica officially released its general-purpose AI Agent product Manus, media coverage drove a surge of interest in the product. We will not comment here on how technically innovative Manus is; that remains to be seen. At a minimum, though, its release has improved public understanding of “AI Agents” (artificial intelligence agents).

To understand “AI Agents”, we need to consider four levels:

The first level: What is an “AI Agent”?

In simple terms, an “AI Agent” is an intelligent system capable of perceiving its environment, making autonomous decisions, and executing tasks to achieve specific goals. It integrates artificial intelligence technologies (such as machine learning, natural language processing, etc.) and can dynamically adjust its behavior based on input data, all without human intervention.

OpenAI once published a diagram of an “AI Agent”, shown in Figure 1:


Figure 1

In summary, “AI Agent” = large model + memory + tool usage (including software, hardware, etc.).

In Figure 1, “short-term memory” can be supplied through the prompt (the model's working context); “long-term memory” can be stored separately in an external database; “planning” (reflection, self-reflection, chain of thought, and sub-goal decomposition) can be carried out by the large model; and “tools” can be invoked by the large model through the corresponding API calls.

For example, a specific business workflow could be:

Sending prompts and/or images to the large model;

The large model, alone or combined with long-term memory, decomposes the task;

The large model calls the appropriate tools and takes action, reflecting on and correcting errors along the way, until the task is complete.
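The three-step workflow above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular product is built: the `Agent` class, `toy_planner`, and the `search` and `summarize` tools are hypothetical names, and a hard-coded planner stands in for the large model that a real system would call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (tool name, argument)


@dataclass
class Agent:
    """Hypothetical agent = planner (stand-in for the large model) + memory + tools."""
    planner: Callable[[str, List[str]], List[Step]]
    tools: Dict[str, Callable[[str], str]]
    long_term_memory: List[str] = field(default_factory=list)   # stand-in for an external database
    short_term_memory: List[str] = field(default_factory=list)  # stand-in for the prompt context

    def run(self, task: str) -> List[str]:
        # Step 1: the task (prompt) enters the short-term context.
        self.short_term_memory.append(f"task: {task}")
        # Step 2: decompose the task, consulting long-term memory.
        steps = self.planner(task, self.long_term_memory)
        results = []
        for tool_name, argument in steps:
            # Step 3: call a tool and record the observation, including errors,
            # so that later planning could reflect on it.
            tool = self.tools.get(tool_name)
            if tool is None:
                observation = f"error: unknown tool {tool_name!r}"
            else:
                try:
                    observation = tool(argument)
                except Exception as exc:
                    observation = f"error: {exc}"
            self.short_term_memory.append(f"{tool_name}({argument}) -> {observation}")
            results.append(observation)
        self.long_term_memory.append(f"completed: {task}")
        return results


def toy_planner(task: str, memory: List[str]) -> List[Step]:
    # Assumption: a real agent would ask the large model for this plan.
    return [("search", task), ("summarize", task)]


def search(query: str) -> str:
    return f"results for {query}"


def summarize(text: str) -> str:
    return text.upper()


agent = Agent(planner=toy_planner, tools={"search": search, "summarize": summarize})
outputs = agent.run("penguin species")
```

Reflection is reduced here to recording each observation; a fuller design would feed those observations back to the model and let it revise the remaining steps.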

The second level: Does the “AI Agent” possess emergent “creativity”?

Is the “AI Agent” merely utilizing the existing capabilities of the large model (a simple “shell”) and calling existing tools, or does it exhibit some “creativity” during task execution?

To address this question, we can look at a specific example.

This is a real scenario that the Monica R&D team encountered while working on a problem from the GAIA test set. The problem: in a National Geographic-style YouTube video, various penguins move in and out of the frame, and Manus was asked to count how many different types of penguins appear simultaneously in the most populated frame.

At this point, something surprising happened:

After Manus opened the video link, “its first action was to press ‘K’”. Many people might not know what ‘K’ does. On YouTube, ‘K’ is the play/pause shortcut, which let Manus pause the video and take screenshots to record which types of penguins appeared in each frame. It ultimately determined that the most populated frame contained 3 types of penguins.

Next, Manus began checking its result, and “its next action was to press ‘3’…”. Many might also not know what ‘3’ does. It is a shortcut that jumps to the 30% mark of the progress bar, which let Manus return to that point in the video and then report how many types of penguins were in that frame.

Ultimately, the answer given after checking was 3.
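The arithmetic behind the two steps in this anecdote is simple enough to write down. The sketch below is purely illustrative: the function names, the 120-second duration, the timestamps, and the species labels are all made-up assumptions, not data from the actual GAIA problem.

```python
from typing import Dict, Set


def digit_key_to_seconds(digit: int, duration_seconds: float) -> float:
    """Digit keys 0-9 jump to 0%-90% of the video, so '3' lands at the 30% mark."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be between 0 and 9")
    return duration_seconds * digit / 10


def max_species_in_one_frame(frames: Dict[float, Set[str]]) -> int:
    """Return the largest number of distinct species visible together in any frame."""
    return max((len(species) for species in frames.values()), default=0)


# Hypothetical observations: timestamp in seconds -> species visible in that frame.
observed = {
    12.0: {"emperor"},
    36.0: {"emperor", "adelie", "rockhopper"},
    80.0: {"adelie", "rockhopper"},
}

seek_target = digit_key_to_seconds(3, duration_seconds=120.0)  # 30% of 120 s
answer = max_species_in_one_frame(observed)
```

With these invented inputs, pressing ‘3’ would jump to the frame that holds the maximum, which mirrors how the shortcut was used to re-check the count.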

The execution of this task was full of surprises, such as “After Manus opened the video link, its first action was to press ‘K’” and “its next action was to press ‘3’…”. This differs from a traditional chatbot: Manus looks at the YouTube video frames rather than the subtitles, and it uses a series of keyboard shortcuts along the way, which is quite striking.

Thus, it can be concluded that “AI Agents” do not merely wrap the existing capabilities of the large model (a simple “shell”) and call existing tools; they also appear to understand the methods available when using a tool and to choose an effective one. This indicates that “AI Agents” (artificial intelligence agents) exhibit some emergent “creativity” during task execution. This creativity emerges when four conditions are met: “sufficiently high-quality data”, “sufficiently intelligent models”, “sufficiently flexible architectures”, and “sufficiently solid engineering”. Under those conditions, capabilities such as Computer Use, Deep Research, and Coding Agents shift from being designed product features to naturally emergent behavior.

The third level: What are the application scenarios for “AI Agents”?

Because an “AI Agent” = large model + memory + tool usage (including software, hardware, etc.), it can build on existing large models to combine functions such as chatting, searching, reading, writing, and translating, and it can connect to many task-execution scenarios through APIs.

For example, application scenarios for “AI Agents” include but are not limited to: generating visual analysis reports in finance & investment; providing decision support in healthcare; resume screening analysis in enterprise services; optimizing supply chain management; assisting in education; analyzing operational data in e-commerce & retail; planning travel in travel and lifestyle services; and coding, testing, and deploying in technology development and innovation.

The fourth level: Future development trends of “AI Agents”

As large models iterate rapidly and become more intelligent, and as the computing power they require keeps falling (for example, the emergence of DeepSeek has further reduced computational cost), “AI Agents” will gradually see widespread application in areas such as “multi-agent collaboration”, “personalized interaction”, and “edge computing integration”.

“Multi-agent collaboration” refers to multiple “AI Agents” working together to complete complex tasks, such as logistics fleet scheduling; “personalized interaction” refers to endowing “AI Agents” with emotional recognition and expression capabilities, such as psychological counseling assistants; and “edge computing integration” refers to running “AI Agents” directly on local devices (like smartphones, IoT devices) to reduce latency.

Having introduced the concept of “AI Agents”, their emergent “creativity”, application scenarios, and future development trends, the pressing question now is how to effectively manage patent strategies for technological innovations based on “AI Agents” to support future commercial competition.

Regarding the patent strategy for “AI Agents”, there are several key points worth our attention:

The first point: Patent strategy around “AI Agents” + application scenarios

Since “AI Agents” = large model + memory + tool usage (including software, hardware, etc.), they are inseparable from application scenarios. When an “AI Agent” is combined with a specific application scenario, executing a task often involves integrating knowledge bases (long-term memory) and calling external tools. Various innovations may arise in the process, such as how knowledge is sliced and organized in the knowledge base, and how business processes are optimized or redesigned (including the emergent creativity that “AI Agents” show at certain stages of a business process during task execution).

For example, consider an AI assistant (“AI Agent”) for smart car owners. Faced with a question about vehicle maintenance, it would first query the “vehicle maintenance knowledge base” (which may use a unique format for organizing knowledge fragments) to produce an answer. Based on that result, it would then query the large model, or the large model plus external API calls (e.g., maintenance records from a 4S dealership), and finally output the result to the car owner. The output could also be presented as a “simulation” (emergent creativity) showing potential future issues with vehicle components.
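The flow in this example (knowledge base first, then the large model together with external records) might be sketched as follows. Everything here is a hypothetical stand-in: the knowledge-base entries, `lookup_knowledge_base`, `answer_owner`, and the fake record and model functions are invented for illustration, and a real system would call an actual model and a dealership API in their place.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical knowledge base: each entry is one "knowledge fragment".
MAINTENANCE_KB: Dict[str, str] = {
    "oil change": "Change engine oil at the interval given in the owner's manual.",
    "tire pressure": "Check tire pressure monthly and before long trips.",
}


def lookup_knowledge_base(question: str) -> Optional[str]:
    """Return the first fragment whose topic appears in the question, if any."""
    lowered = question.lower()
    for topic, fragment in MAINTENANCE_KB.items():
        if topic in lowered:
            return fragment
    return None


def answer_owner(
    question: str,
    fetch_service_records: Callable[[], List[str]],
    ask_model: Callable[[str, List[str]], str],
) -> str:
    """Query the knowledge base first, then pass its result plus external records to the model."""
    context: List[str] = []
    fragment = lookup_knowledge_base(question)
    if fragment is not None:
        context.append(fragment)
    context.extend(fetch_service_records())  # stand-in for an external API call
    return ask_model(question, context)


def fake_records() -> List[str]:
    # Assumption: made-up maintenance record for demonstration only.
    return ["2024-11-02: oil and filter replaced"]


def fake_model(question: str, context: List[str]) -> str:
    # Assumption: a real system would send the question and context to a large model.
    return f"{question} | " + " ; ".join(context)


reply = answer_owner("When is my next oil change due?", fake_records, fake_model)
```

Passing the record fetcher and the model in as parameters keeps the routing logic separate from any particular vendor API, which is also where scenario-specific innovations (such as the fragment format) would sit.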

The second point: Patent strategy for technological innovations in multi-agent business collaboration

Some tasks require collaboration between multiple “AI Agents” to complete, such as logistics fleet scheduling, where each vehicle can be an “AI Agent”. During task execution, each vehicle must continuously interact with the external environment (e.g., road conditions, weather) and also interact and collaborate with each other.

The technological innovations made during these interactions and collaborations also need to be protected through patents.
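To make the fleet-scheduling scenario concrete, here is a minimal sketch in which each vehicle agent bids on a delivery and a coordinator assigns it to the lowest bidder. The bidding scheme is an assumption chosen for illustration, not a method described in this article, and `VehicleAgent`, `delay_factor`, and `assign_deliveries` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class VehicleAgent:
    name: str
    position: Point
    delay_factor: float = 1.0  # values above 1.0 model bad road or weather conditions

    def bid(self, destination: Point) -> float:
        """Estimated cost: Manhattan distance scaled by local conditions."""
        dx = abs(self.position[0] - destination[0])
        dy = abs(self.position[1] - destination[1])
        return (dx + dy) * self.delay_factor


def assign_deliveries(
    vehicles: List[VehicleAgent], deliveries: Dict[str, Point]
) -> Dict[str, str]:
    """Give each delivery to the lowest bidder, then move that vehicle to the destination."""
    assignment: Dict[str, str] = {}
    for delivery_id, destination in deliveries.items():
        winner = min(vehicles, key=lambda vehicle: vehicle.bid(destination))
        assignment[delivery_id] = winner.name
        winner.position = destination  # the winner's next bid starts from here
    return assignment


fleet = [
    VehicleAgent("truck-A", (0.0, 0.0)),
    VehicleAgent("truck-B", (10.0, 10.0), delay_factor=1.5),
]
plan = assign_deliveries(fleet, {"order-1": (1.0, 1.0), "order-2": (9.0, 9.0)})
```

Here the environment enters through `delay_factor` and the collaboration through the shared bidding round; a production scheduler would need far richer models of both.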

The third point: Patent strategy for various business innovations derived from embedding “AI Agents” in hardware

As large models become increasingly intelligent and require less computational power, integrating “AI Agents” into edge computing devices becomes practical, such as embedding “AI Agents” in “humanoid robots”, “smartphones”, and “smart TVs”. For example, a “humanoid robot” integrated with an “AI Agent” can serve as a psychological counseling assistant for elderly people, helping to prevent depression. The “AI Agent” can also call external APIs to control smart appliances at home (e.g., TV, air conditioner, washing machine, vacuum cleaner, air purifier). In an emergency, it can directly call a hospital or the elderly person's relatives for help and take some first aid measures in advance (based on a medical assistance knowledge base + large model + hospital records), slowing the deterioration of the person's condition. There are many other business innovations derived from embedding “AI Agents” in hardware, and future technological innovations in this area will also be a key focus of patent strategy.

In summary

We have examined “AI Agents” at four levels (the concept of “AI Agents”; their emergent “creativity”; their application scenarios; and their future development trends), and we have outlined a clear patent strategy for technological innovations based on “AI Agents” (patent strategy around “AI Agents” + application scenarios; patent strategy for technological innovations in multi-agent business collaboration; and patent strategy for various business innovations derived from embedding “AI Agents” in hardware). Together, these provide significant support for the company’s future commercial competition.

END
