AI Roadmap for Robotics: Leading Scholars Collaborate to Outline the Development Blueprint for the Next Decade

Introduction by Long Ge: When AI meets robotics, is it a celebration of technology or an ethical challenge? Leading scholars from around the world have collaborated to outline a development blueprint for the next decade, inviting us to explore the infinite possibilities of AI in robotics!

The original paper information is as follows:
Paper Title: A roadmap for AI in robotics
Publication Date: July 2025
Authors: Aude Billard, Alin Albu-Schaeffer, Michael Beetz, Wolfram Burgard, Peter Corke, Matei Ciocarlie, Ravinder Dahiya, Danica Kragic, Ken Goldberg, Yukie Nagai, Davide Scaramuzza
Affiliations: École Polytechnique Fédérale de Lausanne, DLR-German Aerospace Center, University of Bremen, University of Technology Nuremberg, Queensland University of Technology, Columbia University, Northeastern University, Royal Institute of Technology, University of California Berkeley, The University of Tokyo, University of Zurich
Original Link: https://www.nature.com/articles/s42256-025-01050-6

Imagine a robot that can learn new skills like a human, adapt to changing environments, and even collaborate with other robots to complete complex tasks. This sounds like a plot from a science fiction movie, but according to the latest roadmap from leading roboticists, this could be a reality in the next decade.

A team of experts from 11 top research institutions, including the École Polytechnique Fédérale de Lausanne, the German Aerospace Center, and Columbia University, has published an important paper in Nature Machine Intelligence, detailing a roadmap for the future development of AI in the field of robotics.

What type of AI is suitable for robotics? Challenges posed by the safe, ethical, and sustainable deployment of human-compatible robots

While deep learning and large language models have made significant breakthroughs in many fields, directly applying these technologies to robotic operations in the physical world faces unique challenges. The complexity of acting and perceiving in the real world far exceeds the difficulties of processing isolated data.

Significant differences in state space: The state space of the physical world is much larger than that of games or text generation, making training data difficult to obtain and generate, while safety and reliability are uncompromising requirements.

Safety criticality: In scenarios involving human interaction, every action of a robot can have real consequences, requiring AI systems to possess interpretability and transparency, not just performance optimization.

The paper points out that robotics needs AI algorithms specifically designed to address the challenges of the physical world, which must be general enough to apply to a wide range of applications while being easily transferable to various robotic platforms.

Potential new applications and commercial deployment: How AI technology drives robots into real life

Many AI technologies originating from academic research have begun to find practical applications in the commercial sector. AI-driven robots are widely deployed in e-commerce warehouses, capable of sorting packages of various sizes.

In pick-and-place tasks on assembly lines, learning capabilities enable online adaptation. Robots can now adjust their trajectories to cope with misalignment, unexpected changes in shape or weight, and other situations.

Breakthroughs in soft robotics: Particularly interesting is the application of AI in the field of soft robotics. The flexibility and continuous characteristics of soft robotic bodies, along with their complex interactions with the environment, make sensor data processing, state estimation, and control particularly challenging.

The natural compliance of soft robots may make it easier to use robots in areas requiring direct interaction with humans, and biodegradable designs may help address global issues.

A notable example is the recent application of convolutional neural networks to interpret the vast amounts of data flowing from soft robotic gloves, achieving real-time recognition and control of object grasping.

Short-term and mid-term challenges: From dataset creation to breakthroughs in simulation technology

The potential of reinforcement learning, imitation learning, and other AI technologies in robotics is just beginning to be explored. In the short and medium term, many challenges lie ahead, from software and hardware development to theoretical and algorithmic advancements.

Figure 1: Short-term and long-term challenges identified for advancing AI in robotics, toward robots that learn continuously and autonomously while interacting with humans and other robots. The challenges are arranged in increasing order of complexity but need not be solved sequentially; research along many of these directions is conducted in parallel.

From simulation to reality and back: When it is not possible to create sufficiently large datasets, simulation provides a partial solution. Multiple robot simulators are available to the robotics community and have long been used to test and improve classical model-based control algorithms before applying them to real robots.

However, overcoming the reality gap (the difference in performance between robots in the real world and in simulated environments) remains a challenge. This gap may result from multiple factors: the models of the simulator may be overly simplified compared to actual physical robots; the variability of environmental conditions may be too great to be captured by the model; and physics simulators may not accurately capture the physical characteristics of the real world, especially when it comes to contact forces and deformable surfaces.

There are many techniques to bridge the reality gap. For example, a small amount of real-world data can be collected to enhance the realism of the simulator; this approach has enabled quadrupedal robots to adapt in real time to changing terrain, loads, and wear and tear.

Creating and maintaining representative datasets: Why is this crucial for robotics?

Compared to other AI application areas, a fundamental limitation of robotic learning is the lack of readily available, easily accessible large datasets for training artificial neural networks on sensing and control tasks, contrasting sharply with the vast resources of images and text that can be downloaded from the internet for training image recognition or text generation algorithms.

Generating enough iterations of robotic tasks from scratch to train artificial neural networks can be extremely costly and time-consuming, or even impossible. Too many robots may be destroyed while attempting the task, and in some cases (such as autonomous flying robots), the attempts can pose risks to humans.

For certain tasks, reference databases can be created, but this requires organized and multi-center efforts. For example, in the case of visual imitation learning, efforts are underway to create a dataset for grasping and manipulation similar to ImageNet.

The Dex-Net research project has developed code, datasets, and algorithms for generating parallel-jaw gripper grasps and physics-based grasp robustness metrics, helping researchers find robust grasps and train machine learning models that generate rich grasping strategies.

In addition to visual data, robotic learning also requires datasets of robotic action data in the form of trajectories and interaction profiles related to various tasks. While there are datasets for specific robotic bodies and tasks, they are often too narrow to be suitable for large-scale machine learning. Combining datasets from different embodiments and different robotic tasks may be a solution to achieve the required scale.

From simulation to reality and feedback: How to bridge the gap between “simulation and reality”

Simulation technology provides a partial solution for robotic learning, especially when it is not possible to create sufficiently large real datasets. Existing robot simulators such as Gazebo and MuJoCo can simulate movement on complex terrain and object manipulation in home environments; as the accuracy of their physics engines improves, less real-world training time is required.

However, the reality gap remains a core challenge. This gap may stem from the oversimplification of simulation models, the unpredictability of environmental variability, or inaccuracies in the physics engine's simulation of contact forces and deformable surfaces.

To address this issue, researchers employ various techniques, such as collecting a small amount of real data to enhance the realism of simulations, which has enabled quadrupedal robots to adapt online to changing terrains and loads. Closing the loop by feeding real-world data back into the simulator has enormous potential, yet this reality-to-simulation direction currently receives far less attention than the simulation-to-reality direction.
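
One simple way to picture "a small amount of real data enhancing the realism of simulations" is parameter calibration: adjust a simulator parameter until simulated outcomes match a few real measurements. The sketch below is only an illustration under stated assumptions, not the method of the paper or of any particular simulator. The sliding-block model, the friction coefficient `mu`, and the function names are all hypothetical, and the "real" measurements are synthesized because no robot data is available here.

```python
# Hypothetical example: calibrate one simulator parameter (a friction
# coefficient) from a handful of "real" measurements.

G = 9.81  # gravitational acceleration, m/s^2


def simulate_stop_distance(v0, mu):
    """Distance a block sliding at speed v0 travels before stopping (Coulomb friction)."""
    return v0 ** 2 / (2.0 * mu * G)


def fit_friction(samples, candidates):
    """Pick the candidate mu that minimizes squared error against the measurements.

    samples: list of (initial_speed, measured_stop_distance) pairs.
    """
    def squared_error(mu):
        return sum((simulate_stop_distance(v0, mu) - d) ** 2 for v0, d in samples)

    return min(candidates, key=squared_error)


# Stand-in for real measurements: generated with mu = 0.30 (an assumption),
# since this sketch has no access to a physical robot.
real_samples = [(v0, simulate_stop_distance(v0, 0.30)) for v0 in (0.5, 1.0, 1.5, 2.0)]

candidates = [0.10 + 0.01 * i for i in range(41)]  # grid from 0.10 to 0.50
mu_hat = fit_friction(real_samples, candidates)    # calibrated simulator parameter
```

Four measurements are enough here because the toy model has a single unknown; real simulators have many more parameters, and contact and deformation effects are much harder to identify.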

Utilizing large generative models in robotics: Applications of language models and visual models

The current excitement in the AI field centers around generative AI and large language models (LLMs), which have shown excellent performance in text processing and are expected to drive transformative changes in robot control. LLMs have multi-faceted appeal in robotics, such as supporting human-robot interaction based on natural language, allowing robots to be controlled through written or spoken instructions and respond in human language.

Additionally, language-vision models can generate synthetic images and videos from text prompts by training on text-image pairs or annotated videos from the internet, which can improve object recognition and task specification in robotics. The next generation of large visual models designed for robotics is trained on image datasets from actual navigation environments rather than solely from internet sources, generating reliable predictions for home or office space environments.

The latest advancement is the language-vision-action model, which expresses robot actions as text tokens by fine-tuning visual-language models and incorporating robot trajectory data, allowing the model to output actions similar to the text generated by LLMs. Initial results are promising, but providing suitable datasets, effectively mapping vision to action, and endowing systems with reasoning capabilities to predict the consequences of actions remain core research focuses for the coming years.
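
To make "expressing robot actions as text tokens" concrete, the following minimal sketch discretizes each dimension of a continuous action into a fixed number of bins and writes the bin indices as a string, which is the kind of output a language model can emit. It illustrates the general idea only; the number of bins, the action bounds, and the function names are assumptions rather than details taken from the paper.

```python
# Hypothetical sketch: map continuous robot actions to discrete tokens and back.

NUM_BINS = 256  # assumed number of bins per action dimension


def action_to_tokens(action, low, high, num_bins=NUM_BINS):
    """Discretize each action dimension into a bin index in [0, num_bins - 1]."""
    tokens = []
    for a, lo, hi in zip(action, low, high):
        a = min(max(a, lo), hi)           # clip to the valid range
        frac = (a - lo) / (hi - lo)       # position within the range, 0..1
        tokens.append(min(int(frac * num_bins), num_bins - 1))
    return tokens


def tokens_to_action(tokens, low, high, num_bins=NUM_BINS):
    """Decode bin indices back to the center value of each bin."""
    return [lo + (t + 0.5) * (hi - lo) / num_bins
            for t, lo, hi in zip(tokens, low, high)]


low = [-1.0, -1.0, 0.0]     # assumed bounds: dx, dy, gripper opening
high = [1.0, 1.0, 1.0]
action = [0.25, -0.5, 1.0]

tokens = action_to_tokens(action, low, high)
text = " ".join(str(t) for t in tokens)  # the string a language model would emit
decoded = tokens_to_action([int(t) for t in text.split()], low, high)
```

Decoding recovers each value only to within half a bin width, so the choice of bin count trades vocabulary size against control precision.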

Prior knowledge and combining AI with control methods: How to achieve safer and more efficient robotic operations

For physical robots, integrating prior knowledge about robot and environmental dynamics with control methods that have provable guarantees is wiser than relying on purely bottom-up, knowledge-agnostic learning. For example, in aerial robotics, neither learning alone nor aerodynamics-based control alone can approach the agility of bird flight; doing so requires coupling perception with whole-body dynamics, so that drones can respond instantly, exploit wind, and combine flapping and gliding to save energy.

Another reason for combining model and formal knowledge with machine learning is that purely learning-based systems are prone to unpredictable or unexplainable failures, such as the “hallucinations” seen in LLMs. Many current deep learning models are inherently unexplainable, which is more critical when AI is applied to robotics, as future robots are expected to operate in safety-critical scenarios, such as autonomous navigation or close interaction with humans, where regulators require predictable behavior and performance guarantees.

AI-driven robots need clear models of the actions they are about to perform, and these models must be explicitly represented to reason about consequences. For instance, a robot designed for a chemical laboratory needs to know what happens when acids and bases are mixed; when interacting with humans, robots need to have a practical theory of mind and excellent human behavior prediction capabilities.

Numerous efforts are dedicated to integrating control theory and machine learning, and they show that doing so can accelerate learning and enhance the robustness and safety of learned models. Examples include modifying standard machine learning optimization to include penalties for violating theoretical constraints, and enhancing deep reinforcement learning training with reference motions generated by control models to ensure stable trajectories or optimize grasping.
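
The idea of adding "penalties for violating theoretical constraints" to a learning objective can be sketched on a toy problem. Below, a scalar command `u` is optimized to track a target while a quadratic penalty discourages exceeding an actuator limit. This is a simplified illustration, not the formulation used in any cited work; the objective, the limit `u_max`, the weight `lam`, and the step size are all assumed values.

```python
# Hypothetical sketch: gradient descent on a task loss plus a constraint penalty.


def loss_gradient(u, u_target, u_max, lam):
    """Gradient of (u - u_target)^2 + lam * max(0, |u| - u_max)^2 with respect to u."""
    grad = 2.0 * (u - u_target)
    violation = abs(u) - u_max
    if violation > 0.0:                      # penalty is active only outside the limit
        sign = 1.0 if u > 0.0 else -1.0
        grad += 2.0 * lam * violation * sign
    return grad


def minimize(u_target, u_max, lam, lr=0.004, steps=2000):
    """Plain gradient descent on the penalized objective, starting from u = 0."""
    u = 0.0
    for _ in range(steps):
        u -= lr * loss_gradient(u, u_target, u_max, lam)
    return u


u_free = minimize(u_target=2.0, u_max=1.0, lam=0.0)         # penalty switched off
u_penalized = minimize(u_target=2.0, u_max=1.0, lam=100.0)  # strong penalty
```

With these assumed values the unpenalized solution settles near the target of 2.0, while the penalized one settles just above the limit, at about 1.01. A soft penalty reduces violations without eliminating them, which is one reason control methods with provable guarantees remain attractive.
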
Figure 2: The ability to transfer learning across robotic bodies, tasks, and environments is fundamental to achieving collaboration among different robots.

Long-term challenges: Future directions for lifelong learning and transfer learning

The most exciting yet challenging long-term promise of AI in robotics is enabling robots to continuously acquire new knowledge, a dream dating back to the 1990s. Achieving this requires two elements: lifelong learning and transfer learning.

Lifelong learning: Endowing robots with the ability to learn continuously faces significant technical and regulatory challenges. It requires a new paradigm based on incremental learning that transforms input-output learning into structured knowledge, combining the power of learning with expert system paradigms. This necessitates the parallel operation of learning modules and suitable hardware, leading to difficult questions such as how to ensure system performance, test learning in unknown situations, and select what to forget to learn new things.

The main challenge of lifelong learning is scaling current methods. Many robots do not remain unchanged throughout their operational lifespan; after five to eight years, they may need to install different grippers or motors, and the objects and environments they operate in may change, at which point the acquired knowledge may not automatically transfer to slightly modified platforms.

Transfer learning: Future robots need to be able to transfer what they have learned: from task to task, environment to environment, and robot to robot. Human intelligence relies on the ability to apply knowledge from one domain to new domains; similarly, robots need to analyze and use data-driven methods to learn skills from human demonstrations and transfer those skills to new tasks or different robots and environments.
Figure 3: Transferring packages from drones to humanoid robots or single-arm manipulators requires coordinating distinctly different perceptions (from different perspectives and sensors) and distinct robotic actions (from single-arm to dual-arm actions, inspired by the 2023 EuRobin Hackathon).

Transferable robotic learning needs algorithms that answer three questions: a) What to transfer? Criteria need to be developed for selecting which prior knowledge to transfer and which to ignore; b) How to transfer? Algorithms need to be designed to use transferred knowledge while filling gaps with additional targeted searches or by autonomously seeking new context-specific knowledge; c) When to transfer? Algorithms need to be developed to recognize similarities between environments, objects, and task constraints to determine whether knowledge transfer is possible.
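
As a toy illustration of the "when to transfer" question, the sketch below compares hand-written task descriptors with cosine similarity and allows transfer only above a threshold. Everything in it is hypothetical: the descriptor features, the numbers, and the threshold are invented for illustration, and recognizing task similarity in practice is an open research problem rather than a one-line formula.

```python
import math

# Hypothetical sketch: decide whether a learned skill is worth transferring
# by comparing simple numeric task descriptors.

SIMILARITY_THRESHOLD = 0.9  # assumed cut-off


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def should_transfer(source_task, target_task, threshold=SIMILARITY_THRESHOLD):
    """Return True when the two task descriptors are similar enough."""
    return cosine_similarity(source_task, target_task) >= threshold


# Assumed descriptor layout: [object mass (kg), object width (m), needs gripper (0/1)]
learned_task = [0.5, 0.10, 1.0]
similar_task = [0.6, 0.12, 1.0]
different_task = [5.0, 0.80, 0.0]

transfer_to_similar = should_transfer(learned_task, similar_task)
transfer_to_different = should_transfer(learned_task, different_task)
```

Here the slightly heavier object passes the check and the much larger one that needs no gripper does not; a real system would have to learn both the descriptors and the decision rule.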

Safe exploration: Ensuring reliability and sustainability of robots in complex environments

Many important real-world applications are characterized by high-dimensional state spaces and partial observability, so robots must explore in real time, given the limits of what can be transferred from simulation to reality. Investment is needed in methods that allow robots to explore their environments effectively while keeping both the robots and their surroundings safe. As mentioned earlier, this exploration may need to span the robot's entire lifecycle.

Safe exploration must combine hardware and software AI technologies, such as “near-sensor” computing, in which sensor data is scaled and processed by AI on distributed sensing hardware to reduce computational load and energy consumption. Hardware AI must also be distributed, large-area, and flexible or adaptive to suit various environments.

Sustainability is also crucial: robotic controllers and hardware should be designed sustainably, promoting energy-efficient computing, data and algorithm reusability, and biodegradable hardware.
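
A common way to keep exploration safe is to pass every exploratory action through a filter that modifies it only when it would leave a known safe region. The sketch below shows this on a one-dimensional position with fixed limits. It is a minimal illustration under assumed values (workspace bounds, control period, random exploration policy), not a method described in the paper.

```python
import random

# Hypothetical 1-D workspace limits (metres) and control period (seconds).
LOWER, UPPER, DT = -1.0, 1.0, 0.1


def safe_velocity(position, proposed_velocity):
    """Return the velocity closest to the proposal that keeps the next position in bounds."""
    next_position = position + proposed_velocity * DT
    clipped = min(max(next_position, LOWER), UPPER)
    return (clipped - position) / DT


def explore(steps, seed=0):
    """Random exploration in which every proposed action passes through the safety filter."""
    rng = random.Random(seed)
    position = 0.0
    visited = [position]
    for _ in range(steps):
        proposal = rng.uniform(-5.0, 5.0)   # exploratory action, possibly unsafe
        position += safe_velocity(position, proposal) * DT
        visited.append(position)
    return visited


visited = explore(200)  # positions reached during filtered exploration
```

The filter leaves safe proposals untouched and shortens only those that would cross a limit, so the robot still explores freely inside the allowed region.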

Outlook: Future challenges and opportunities, from ethics to sustainability

The large-scale deployment of AI and robotics has become a more tangible goal, potentially achievable within the next decade. AI has the potential to significantly expand the capabilities and applications of robots, addressing the challenges of acting in a rapidly evolving world with only a partial understanding of their surroundings and humans.

Transfer learning likely holds one of the few keys to fundamental issues of task and application scalability. However, for large-scale deployment, from manufacturing to logistics and homes, robots need to learn new tasks quickly and effectively. A team of robotic experts collecting data for weeks and running hyperparameter tuning for any new task is clearly not a viable path.

Instead, robots that can leverage prior knowledge and supplement learning through occasional, minimal, and intuitive guidance (such as verbal instructions combined with visual demonstrations) are more likely to ensure continuous and rapid skill acquisition and adaptation.

Designing for sparse and efficient use of AI and data will also be necessary, especially when data comes from high-resolution sensors. Ensuring the transparency and interpretability of robot actions is essential to guarantee accountability, prevent bias and misuse, and protect privacy in robotic data processing, all of which are critical for social acceptance.

Finally, if robotics can accelerate decarbonization through automated battery recycling, solar panel deployment, and home renovations, it is equally important that robotic controllers and hardware are designed sustainably, promoting energy-efficient computing, data and algorithm reusability, and biodegradable hardware.

Three Questions from Long Ge

Below, Long Ge answers some questions you may have:

What is the reality gap? The reality gap refers to the difference in performance of robots in simulated environments versus the real world, which may arise from simplified simulation models, environmental variability, or inaccuracies in the physics engine, and is one of the main challenges in robotic learning.

How are LLMs applied in robotics? LLMs (large language models) can support human-robot interaction based on natural language, allowing robots to be controlled through verbal or text instructions. They can also be used for semantic inference in robot navigation, improving adaptability in new environments.

What are the challenges of lifelong learning in robots? Lifelong learning requires robots to continuously acquire new knowledge. It faces technical challenges such as incremental learning and knowledge structuring, as well as regulatory issues such as performance guarantees and maintaining safety standards, especially when knowledge transfer is difficult due to platform or environmental changes.

If you have any other questions, feel free to leave a comment or join the discussion in the comments section.