Envisioning the Robotics Industry: The Evolution of Automation, Artificial Intelligence, and Web3 Integration

Author: Jacob Zhao @IOSG (Twitter: @IOSGVC), Columnist at Web3Caff

Original Title: IOSG Weekly Brief｜Envisioning the Robotics Industry: The Evolution of Automation, Artificial Intelligence, and Web3 Integration #301

This independent research report is supported by IOSG Ventures. Thanks to Hans (RoboCup Asia-Pacific), Nichanan Kesonpat (1kx), Robert Koschig (1kx), Amanda Young (Collab+Currency), Jonathan Victor (Ansa Research), Lex Sokolin (Generative Ventures), Jay Yu (Pantera Capital), Jeffrey Hu (Hashkey Capital) for their valuable suggestions on this article. During the writing process, feedback was also solicited from teams such as OpenMind, BitRobot, peaq, Auki Labs, XMAQUINA, GAIB, Vader, Gradient, Tashi Network, and CodecFlow. This article strives to be objective and accurate, but some viewpoints involve subjective judgment and may contain bias; we ask for readers’ understanding.

This article was assisted by AI tools such as ChatGPT-5 and Deepseek during the writing process. The author has made efforts to proofread and ensure the information is true and accurate, but there may still be omissions; your understanding is appreciated. It should be particularly noted that the cryptocurrency market generally exhibits a divergence between project fundamentals and secondary market price performance. The content of this article is for information integration and academic/research exchange only, does not constitute any investment advice, and should not be regarded as a recommendation for buying or selling any tokens.

Robotics Panorama: From Industrial Automation to Humanoid Intelligence

The traditional robotics industry chain forms a complete bottom-up hierarchy with four major links: core components, intermediate control systems, complete machine manufacturing, and application integration. Core components (controllers, servos, reducers, sensors, batteries, etc.) carry the highest technical barriers and determine the performance and cost limits of the complete machine. The control system is the “brain and cerebellum” of the robot, responsible for decision-making, planning, and motion control. Complete machine manufacturing reflects the ability to integrate the supply chain. System integration and application are becoming the new core of value, because they determine the depth of commercialization.

By application scenario and form, robots worldwide are evolving along the path of “industrial automation → scenario intelligence → general intelligence”, forming five major types: industrial robots, mobile robots, service robots, special robots, and humanoid robots.

Industrial Robots

Industrial robots are currently the only fully mature track, widely used in manufacturing processes such as welding, assembly, painting, and handling. The industry has formed a standardized supply chain system, with stable gross margins and clear ROI. Within this category, collaborative robots (cobots), which emphasize human-machine collaboration and lightweight deployment, are growing the fastest.

Representative Companies: ABB, Fanuc, Yaskawa, KUKA, Universal Robots, JAKA, AUBO.

Mobile Robots

This category includes AGVs (Automated Guided Vehicles) and AMRs (Autonomous Mobile Robots). They are widely deployed in logistics warehousing, e-commerce delivery, and manufacturing transportation, and have become the most mature category for enterprise (B2B) customers.

Representative Companies: Amazon Robotics, Geek+, Quicktron, Locus Robotics.

Service Robots

Service robots target industries such as cleaning, dining, hospitality, and education, and are the fastest-growing area on the consumer side. Cleaning products now follow consumer-electronics logic, and medical and commercial delivery robots are accelerating commercialization. In addition, a batch of more general manipulation robots is emerging (such as Dyna’s dual-arm system), which are more flexible than task-specific products but do not yet reach the generality of humanoid robots.

Representative Companies: Ecovacs, Roborock, PuduTech, Qianlong Intelligent, iRobot, Dyna, etc.

Special Robots

Special robots mainly serve medical, military, construction, marine, and aerospace scenarios. The market is limited in size but has high profit margins and strong barriers, relies on government and enterprise orders, and is growing segment by segment in vertical niches. Typical examples include Intuitive Surgical, Boston Dynamics, ANYbotics, and NASA Valkyrie.

Humanoid Robots

Humanoid robots are seen as the future “general labor platform”.

Representative Companies: Tesla (Optimus), Figure AI (Figure 01), Sanctuary AI (Phoenix), Agility Robotics (Digit), Apptronik (Apollo), 1X Robotics, Neura Robotics, Unitree, UBTECH, Zhiyuan Robotics, etc.

Humanoid robots are currently the most closely watched frontier. Their core value is that a humanoid structure fits existing human spaces, which is why they are regarded as the key form on the path to a “general labor platform”. Unlike industrial robots, which pursue extreme efficiency, humanoid robots emphasize general adaptability and task transferability, and can enter factories, homes, and public spaces without modifying the environment.

Currently, most humanoid robots are still in the technical demonstration stage, mainly verifying dynamic balance, walking, and manipulation capabilities. Some projects have begun small-scale deployment in highly controlled factory scenarios (such as Figure × BMW and Agility Digit), and more manufacturers (such as 1X) are expected to enter early distribution starting in 2026, but these remain limited “narrow scenario, single task” applications rather than truly general labor. Overall, large-scale commercialization is still several years away. Core bottlenecks include: multi-degree-of-freedom coordination and real-time dynamic balance control; energy consumption and endurance, limited by battery energy density and drive efficiency; instability in open environments and perception-decision pipelines that are hard to generalize; significant data gaps that make it difficult to train general policies; unsolved cross-embodiment transfer; and hardware supply chain and cost curves (especially outside China) that remain real barriers to large-scale, low-cost deployment.

The future commercialization path is expected to go through three stages: short-term focusing on Demo-as-a-Service, relying on pilots and subsidies; mid-term evolving into Robotics-as-a-Service (RaaS), building task and skill ecosystems; and long-term focusing on labor cloud and intelligent subscription services, shifting the value focus from hardware manufacturing to software and service networks. Overall, humanoid robots are in a critical transition period from demonstration to self-learning, and whether they can overcome the triple barriers of control, cost, and algorithms will determine whether they can truly achieve embodied intelligence.

AI × Robotics: The Dawn of the Era of Embodied Intelligence

Traditional automation mainly relies on pre-programming and pipeline-style control (such as the perception-planning-control DSOP architecture) and operates reliably only in structured environments. The real world is far more complex and variable, so the new generation of embodied intelligence (Embodied AI) follows a different paradigm: large models and unified representation learning give robots cross-scenario “understanding-prediction-action” capabilities. Embodied intelligence emphasizes the dynamic coupling of body (hardware), brain (model), and environment (interaction): the robot is the carrier, and intelligence is the core.

Generative AI is intelligence in the language world, adept at understanding symbols and semantics; embodied intelligence (Embodied AI) is intelligence in the real world, mastering perception and action. The two correspond to “brain” and “body” and represent two parallel main lines of AI evolution. In terms of intelligence hierarchy, embodied intelligence sits at a higher level than generative AI, but its maturity still lags significantly. LLMs rely on massive internet corpora and form a clear “data → computing power → deployment” closed loop. Robotic intelligence, by contrast, requires first-person, multi-modal data that is tightly coupled to actions, including teleoperation trajectories, first-person videos, spatial maps, and operation sequences. Such data does not exist naturally and must be generated through real interaction or high-fidelity simulation, which makes it scarcer and more expensive. Simulation and synthetic data help, but they cannot replace real sensorimotor experience, which is why Tesla, Figure, and others must build their own teleoperation data factories, and why third-party data annotation factories have emerged in Southeast Asia. In short: LLMs learn from existing data, while robots must “create” data through interaction with the physical world. Over the next 5–10 years, the two will integrate deeply in Vision–Language–Action models and Embodied Agent architectures: LLMs will handle high-level cognition and planning, while robots will execute in the real world, forming a bidirectional closed loop of data and action that jointly pushes AI from “language intelligence” toward true general intelligence (AGI).

The core technology system of embodied intelligence can be viewed as a bottom-up intelligence stack: VLA (perception fusion), RL/IL/SSL (intelligent learning), Sim2Real (reality transfer), World Model (cognitive modeling), and multi-agent collaboration and memory reasoning (Swarm & Reasoning). Among them, VLA and RL/IL/SSL are the “engine” of embodied intelligence and determine its deployment and commercialization; Sim2Real and World Model are the key technologies connecting virtual training with real execution; multi-agent collaboration and memory reasoning represent higher-level collective and metacognitive evolution.

Perception Understanding: Vision–Language–Action Model (VLA)

The VLA model integrates three channels, vision (Vision), language (Language), and action (Action), enabling robots to understand intent from human language and translate it into concrete operations. Its execution process includes semantic parsing, target recognition (locating target objects from visual input), and path planning and action execution, closing the loop of “understanding semantics, perceiving the world, completing tasks”; this is one of the key breakthroughs of embodied intelligence. Current representative projects include Google RT-X, Meta Ego-Exo, and Figure Helix, which showcase cutting-edge directions such as cross-modal understanding, immersive perception, and language-driven control.
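To make the three stages named above concrete, the following is a minimal sketch that wires semantic parsing, target recognition, and path planning together as plain Python functions. It is not a learned VLA model: real systems replace each stage (or the whole chain) with neural networks, and every name here (`Detection`, `parse_instruction`, `locate_target`, `plan_path`, the scene contents) is a hypothetical stand-in chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str        # object name a vision module would output (assumed format)
    position: tuple   # (x, y) in metres in the robot frame (assumed convention)


def parse_instruction(text):
    """Semantic parsing stand-in: split 'pick up the red cup' into verb and object."""
    words = text.lower().strip(". ").split()
    if "the" not in words:
        raise ValueError("expected an instruction like 'pick up the <object>'")
    i = words.index("the")
    return " ".join(words[:i]), " ".join(words[i + 1:])


def locate_target(target, detections):
    """Target recognition stand-in: match the object phrase against detections."""
    for det in detections:
        if det.label == target:
            return det.position
    raise LookupError(f"target {target!r} is not in view")


def plan_path(start, goal, steps=4):
    """Path planning stand-in: evenly spaced waypoints on a straight line."""
    return [
        (start[0] + (goal[0] - start[0]) * k / steps,
         start[1] + (goal[1] - start[1]) * k / steps)
        for k in range(1, steps + 1)
    ]


def run_instruction(text, detections, gripper_xy=(0.0, 0.0)):
    """Chain the three stages: language -> target -> waypoints for the controller."""
    verb, target = parse_instruction(text)
    goal = locate_target(target, detections)
    return {"verb": verb, "target": target, "waypoints": plan_path(gripper_xy, goal)}


scene = [Detection("blue plate", (0.2, 0.6)), Detection("red cup", (0.4, 0.8))]
result = run_instruction("Pick up the red cup.", scene)
```

The sketch also shows where the bottlenecks discussed in this section bite: an instruction that does not fit the template fails in `parse_instruction`, and any error in the detected position is passed straight into the planned waypoints.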

Currently, VLA is still in its early stages, facing four core bottlenecks:

Semantic ambiguity and weak task generalization: Models struggle to understand vague, open-ended instructions;
Unstable alignment between vision and action: Perception errors are amplified in path planning and execution;
Scarcity and lack of unified standards for multi-modal data: High costs of collection and annotation make it difficult to form a scalable data flywheel;
Time and space challenges in long-horizon tasks: Long task spans expose insufficient planning and memory capabilities, while large spatial ranges require models to reason about things “beyond the field of view”; current VLA lacks stable world models and cross-space reasoning capabilities.

These issues collectively limit the cross-scenario generalization ability and the scaling process of VLA.

Intelligent Learning: Self-Supervised Learning (SSL), Imitation Learning (IL), and Reinforcement Learning (RL)

Self-Supervised Learning (SSL): Automatically extracting semantic features from perception data, allowing robots to “understand the world”. Equivalent to teaching machines to observe and represent.
Imitation Learning (IL): Quickly mastering basic skills by mimicking human demonstrations or expert examples. Equivalent to teaching machines to act like humans.
Reinforcement Learning (RL): Optimizing action strategies through a “reward-punishment” mechanism, allowing robots to learn and grow through trial and error.

In embodied intelligence (Embodied AI), self-supervised learning (SSL) aims to enable robots to predict state changes and physical laws through perception data, thus understanding the causal structure of the world; reinforcement learning (RL) is the core engine of intelligence formation, driving robots to master complex behaviors such as walking, grasping, and obstacle avoidance through interaction with the environment and trial-and-error optimization based on reward signals; imitation learning (IL) accelerates this process through human demonstrations, allowing robots to quickly acquire action priors. The current mainstream direction is to combine the three to build a hierarchical learning framework: SSL provides the representation foundation, IL imparts human priors, and RL drives strategy optimization, balancing efficiency and stability, together forming the core mechanism of embodied intelligence from understanding to action.
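As a rough illustration of how an imitation prior and reinforcement learning can be combined, the sketch below uses a five-cell corridor with a reward at the right end. A demonstration seeds the value table (the IL step), and tabular Q-learning then refines it by trial and error (the RL step). The environment, the bonus value, and all hyperparameters are assumptions made for this example; the SSL representation stage is omitted because the state here is already a clean integer.

```python
import random

N_STATES, GOAL = 5, 4          # corridor cells 0..4, reward at the right end
LEFT, RIGHT = 0, 1


def step(state, action):
    """Deterministic toy corridor: move one cell, clamped at the left wall."""
    nxt = max(0, state - 1) if action == LEFT else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def imitation_prior(demos, bonus=0.1):
    """IL stand-in: give demonstrated (state, action) pairs a small head start."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for state, action in demos:
        q[state][action] = bonus
    return q


def q_learning(q, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """RL stage: epsilon-greedy tabular Q-learning refines the prior."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state = 0
        for _ in range(100):   # step cap so an unlucky episode still terminates
            explore = rng.random() < epsilon
            if explore or q[state][LEFT] == q[state][RIGHT]:
                action = rng.choice((LEFT, RIGHT))
            else:
                action = LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT
            nxt, reward, done = step(state, action)
            target = reward if done else gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


demos = [(s, RIGHT) for s in range(GOAL)]   # a hypothetical "always go right" demo
q = q_learning(imitation_prior(demos))
policy = [RIGHT if q[s][RIGHT] > q[s][LEFT] else LEFT for s in range(GOAL)]
```

The prior only biases early exploration; the reward signal is what ultimately fixes the values, which mirrors the division of labor described above (IL for fast priors, RL for optimization).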

Reality Transfer: Sim2Real—Crossing from Simulation to Reality

Sim2Real (Simulation to Reality) allows robots to complete training in virtual environments and then transfer to the real world. It generates large-scale interaction data through high-fidelity simulation environments (such as NVIDIA Isaac Sim & Omniverse, DeepMind MuJoCo), significantly reducing training costs and hardware wear. The core lies in narrowing the simulation-to-reality gap (the “Reality Gap”), with main methods including:

Domain Randomization: Randomly adjusting parameters such as lighting, friction, and noise in simulations to improve model generalization;
Physical Consistency Calibration: Using real sensor data to calibrate the simulation engine, enhancing physical realism;
Adaptive Fine-tuning: Rapid retraining in real environments to achieve stable transfer.
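Domain randomization, the first method in the list above, can be sketched in a few lines: before each training episode, the simulator's physical and visual parameters are resampled so the policy never sees exactly the same world twice. The parameter names and ranges below are assumptions for illustration and do not correspond to any particular simulator's API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    light_intensity: float    # relative brightness, 1.0 = nominal
    friction: float           # floor friction coefficient
    sensor_noise_std: float   # standard deviation of additive sensor noise


# Hypothetical ranges; real ones would be tuned per robot and task.
RANGES = {
    "light_intensity": (0.5, 1.5),
    "friction": (0.4, 1.0),
    "sensor_noise_std": (0.0, 0.05),
}


def sample_params(rng):
    """Draw one randomized simulator configuration, uniform within each range."""
    return SimParams(**{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()})


def noisy_reading(true_value, params, rng):
    """Apply the sampled sensor noise to a ground-truth simulator value."""
    return true_value + rng.gauss(0.0, params.sensor_noise_std)


rng = random.Random(42)
episodes = [sample_params(rng) for _ in range(1000)]   # one configuration per episode
```

In a training loop, each `SimParams` instance would be pushed into the simulator before the episode starts; physical consistency calibration then amounts to narrowing or shifting these ranges based on real sensor data.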

Sim2Real is the central link in deploying embodied intelligence, enabling AI models to learn the closed loop of “perception-decision-control” in a safe, low-cost virtual world. Simulation training itself has matured (such as NVIDIA Isaac Sim and MuJoCo), but transfer to reality is still limited by the Reality Gap, high computing power and annotation costs, and insufficient generalization and safety in open environments. Nevertheless, Simulation-as-a-Service (SimaaS) is becoming the lightest yet most strategically valuable infrastructure in the era of embodied intelligence, with business models including platform subscriptions (PaaS), data generation (DaaS), and security verification (VaaS).

Cognitive Modeling: World Model—The “Inner World” of Robots

World Model is the “inner brain” of embodied intelligence, allowing robots to internally simulate environments and the consequences of actions, achieving prediction and reasoning. It constructs predictable internal representations by learning the dynamic laws of the environment, enabling agents to “rehearse” outcomes before execution, evolving from passive executors to active reasoners. Representative projects include DeepMind Dreamer, Google Gemini + RT-2, Tesla FSD V12, NVIDIA WorldSim, etc. Typical technical paths include:

Latent Dynamics Modeling: Compressing high-dimensional perception into latent state space;
Imagination-based Planning: Virtual trial-and-error and path prediction within the model;
Model-based RL: Replacing the real environment with the world model to reduce training costs.
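The idea of imagination-based planning can be sketched with a toy example: the agent tries every short action sequence inside its model, scores the imagined outcomes, and executes only the first action of the best plan before replanning. Here `world_model` is a hand-written stand-in for a learned dynamics model, and the one-dimensional task, horizon, and cost function are assumptions made for illustration.

```python
from itertools import product

ACTIONS = (-1, 0, 1)
GOAL = 5


def world_model(state, action):
    """Stand-in for a learned dynamics model: predicts the next state."""
    return state + action


def imagined_cost(state, plan):
    """Roll a candidate plan forward inside the model and sum distance to goal."""
    cost = 0
    for action in plan:
        state = world_model(state, action)
        cost += abs(state - GOAL)
    return cost


def plan_next_action(state, horizon=3):
    """Try every action sequence in imagination; return the best first action."""
    best = min(product(ACTIONS, repeat=horizon), key=lambda p: imagined_cost(state, p))
    return best[0]


def real_environment(state, action):
    """The 'real' system; in this toy case it matches the model exactly."""
    return state + action


state, trajectory = 0, [0]
for _ in range(7):
    state = real_environment(state, plan_next_action(state))
    trajectory.append(state)
# trajectory is [0, 1, 2, 3, 4, 5, 5, 5]: the agent reaches the goal and stays
```

Exhaustive search is only feasible because the action set and horizon are tiny. With a learned model, prediction errors compound over the rollout, which is the "unstable long-term predictions" problem noted in this section.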

World Model is at the theoretical forefront of embodied intelligence, representing the core path for robots to evolve from “reactive” to “predictive” intelligence, but it is still limited by challenges such as modeling complexity, unstable long-term predictions, and lack of unified standards.

Collective Intelligence and Memory Reasoning: From Individual Action to Collaborative Cognition

Multi-Agent Systems and Memory & Reasoning represent two important directions for the evolution of embodied intelligence from “individual intelligence” to “collective intelligence” and “cognitive intelligence”. Together, they support the collaborative learning and long-term adaptation capabilities of intelligent systems.

Multi-Agent Collaboration (Swarm / Cooperative RL):

Multi-agent collaboration refers to multiple agents achieving collaborative decision-making and task allocation in a shared environment through distributed or cooperative reinforcement learning. This direction has a solid research foundation: the OpenAI Hide-and-Seek experiment demonstrated spontaneous cooperation and strategy emergence among multiple agents, and algorithms such as QMIX and MADDPG provide cooperative frameworks for centralized training with decentralized execution. Such methods have been validated in scenarios such as warehouse robot scheduling, inspection, and swarm control.

Memory and Reasoning (Memory & Reasoning):

This direction focuses on giving agents long-term memory, situational understanding, and causal reasoning capabilities, which are key to cross-task transfer and self-planning. Typical research includes DeepMind Gato (a unified multi-task agent spanning perception, language, and control), the DeepMind Dreamer series (imagination-based planning built on world models), and open-ended embodied agents such as Voyager, which achieve continual learning through external memory and self-evolution. These systems lay the foundation for robots to “remember the past and predict the future”.

Global Landscape of the Embodied Intelligence Industry: Cooperation and Competition Coexist

The global robotics industry is currently in a period of “cooperation dominance and deepening competition”. China’s supply chain efficiency, the US’s AI capabilities, Japan’s component precision, and Europe’s industrial standards collectively shape the long-term landscape of the global robotics industry.

The United States maintains a lead in cutting-edge AI models and software (DeepMind, OpenAI, NVIDIA), but this advantage has not extended to robotics hardware, where Chinese manufacturers have the edge in iteration speed and real-world performance. The US is promoting the reshoring of industry through the CHIPS Act and the Inflation Reduction Act.
China has formed a leading advantage in components, automated factories, and humanoid robots through large-scale manufacturing, vertical integration, and policy-driven initiatives, with outstanding hardware and supply chain capabilities. Companies like Unitree and UBTECH have achieved mass production and are extending towards intelligent decision-making layers. However, there is still a significant gap in algorithms and simulation training compared to the US.
Japan has long monopolized high-precision components and motion control technology, with a robust industrial system, but the integration of AI models is still in the early stages, and the pace of innovation is relatively stable.
South Korea stands out in the popularization of consumer-grade robots—led by companies like LG and NAVER Labs—and has a mature and strong service robot ecosystem.
Europe has a well-established engineering system and safety standards, with companies like 1X Robotics remaining active in R&D, but some manufacturing processes have migrated abroad, and the focus of innovation is shifting towards collaboration and standardization.

Robotics × AI × Web3: Narrative Vision and Real Pathways

In 2025, a new narrative integrating Web3, robotics, and AI has emerged. Although Web3 is seen as the underlying protocol for a decentralized machine economy, its value and feasibility in combination with robotics still differ significantly from layer to layer:

Hardware manufacturing and service layer is capital-intensive, with weak data closed loops; Web3 can currently only play a supporting role in marginal areas such as supply chain finance or equipment leasing;
Simulation and software ecosystem layer has a higher degree of fit: simulation data and training tasks can be recorded on-chain to establish ownership, and intelligent agents and skill modules can be turned into assets through NFTs or Agent Tokens;
Platform layer, where decentralized labor and collaboration networks show the greatest potential—Web3 can gradually build a credible “machine labor market” through integrated mechanisms of identity, incentives, and governance, laying the institutional foundation for the future machine economy.

From a long-term vision perspective, the collaboration and platform layer is the most valuable direction for the integration of Web3 with robotics and AI. As robots gradually acquire perception, language, and learning capabilities, they are evolving into intelligent individuals capable of autonomous decision-making, collaboration, and creating economic value. These “intelligent laborers” will truly participate in the economic system, but they still need to overcome four core thresholds of identity, trust, incentives, and governance.

In the identity layer, machines need to have verifiable and traceable digital identities. Through Machine DID, each robot, sensor, or drone can generate a unique verifiable “ID card” on-chain, binding its ownership, behavior records, and permission scope, enabling secure interactions and responsibility delineation.
In the trust layer, the key is to make “machine labor” verifiable, measurable, and priceable. Smart contracts, oracles, and auditing mechanisms, combined with Proof of Physical Work (PoPW), trusted execution environments (TEE), and zero-knowledge proofs (ZKP), can ensure the authenticity and traceability of task execution and give machine behavior a value that can be accounted for economically.
In the incentive layer, Web3 achieves automatic settlement and value transfer between machines through token incentive systems, account abstraction, and state channels. Robots can complete computing power leasing and data sharing through micropayments, and ensure task fulfillment through staking and penalty mechanisms; with the help of smart contracts and oracles, a decentralized “machine collaboration market” can also be formed without manual scheduling.
In the governance layer, once machines possess long-term autonomous capabilities, Web3 provides a transparent and programmable governance framework: using DAO governance to jointly decide system parameters, and maintaining security and order through multi-signature and reputation mechanisms. In the long run, this will push the machine society towards the stage of “algorithmic governance”—where humans set goals and boundaries, and machines maintain incentives and balance through contracts.
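The identity, trust, and incentive layers described above can be sketched together as a small in-memory model: a machine derives an identifier from its public key, locks a stake when it accepts a task, and either receives stake plus reward or is slashed depending on whether its proof of work is accepted. This is a plain Python illustration, not a smart contract or any project's actual protocol. The `did:example:` prefix, the `Ledger` class, and the boolean `proof_ok` flag (standing in for PoPW, TEE, or ZKP verification) are all assumptions.

```python
import hashlib
from dataclasses import dataclass, field


def machine_did(public_key: bytes) -> str:
    """Hypothetical identifier: a short hash of the machine's public key."""
    return "did:example:" + hashlib.sha256(public_key).hexdigest()[:16]


@dataclass
class Ledger:
    balances: dict = field(default_factory=dict)   # did -> integer token units
    escrow: dict = field(default_factory=dict)     # task_id -> (did, stake, reward)

    def deposit(self, did, amount):
        self.balances[did] = self.balances.get(did, 0) + amount

    def accept_task(self, task_id, did, stake, reward):
        """Lock the machine's stake until the task result is verified."""
        if self.balances.get(did, 0) < stake:
            raise ValueError("insufficient balance to stake")
        self.balances[did] -= stake
        self.escrow[task_id] = (did, stake, reward)

    def settle(self, task_id, proof_ok: bool):
        """Return stake plus reward on a valid proof; slash the stake otherwise.

        For simplicity the reward is credited from nowhere; a real system
        would debit the task requester.
        """
        did, stake, reward = self.escrow.pop(task_id)
        if proof_ok:
            self.balances[did] += stake + reward
        return self.balances[did]


robot = machine_did(b"robot-7 public key bytes")
ledger = Ledger()
ledger.deposit(robot, 100)

ledger.accept_task("t1", robot, stake=20, reward=5)
after_success = ledger.settle("t1", proof_ok=True)    # 100 - 20 + 25 = 105

ledger.accept_task("t2", robot, stake=20, reward=5)
after_failure = ledger.settle("t2", proof_ok=False)   # 105 - 20 = 85
```

The hard part in practice is not the bookkeeping but producing `proof_ok` honestly, that is, verifying that physical work actually happened, which is why the trust layer leans on oracles and hardware attestation.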

The ultimate vision for the integration of Web3 and robotics has two parts. The first is a real-world evaluation network: a “real-world reasoning engine” composed of distributed robots that continuously test and benchmark model capabilities in diverse and complex physical scenarios. The second is a robot labor market: robots execute verifiable real-world tasks globally, earn income through on-chain settlement, and reinvest that value into computing power or hardware upgrades.

From a practical perspective, the combination of embodied intelligence and Web3 is still in early exploration, and decentralized machine intelligence economies remain largely at the narrative and community-driven level. The feasible points of combination today fall mainly into the following three areas:

(1) Data Crowdsourcing and Rights Confirmation: Web3 encourages contributors to upload real-world data through on-chain incentives and traceability;
(2) Global Long-Tail Participation: cross-border micropayments and micro-incentive mechanisms effectively reduce data collection and distribution costs;
(3) Financialization and Collaborative Innovation: the DAO model can promote the assetization of robots, yield certificates, and settlement mechanisms between machines.

Overall, in the short term, the focus is mainly on data collection and incentive layers; in the medium term, breakthroughs are expected in “stablecoin payments + long-tail data aggregation” and RaaS assetization and settlement layers; in the long term, if humanoid robots achieve large-scale popularization, Web3 may become the institutional foundation for machine ownership, income distribution, and governance, promoting the formation of a truly decentralized machine economy.

Web3 Robotics Ecosystem Map and Selected Cases

Based on three criteria, “verifiable progress, technical openness, and industry relevance”, we have surveyed the current representative Web3 × Robotics projects and grouped them into a five-layer architecture: Model Intelligence Layer, Machine Economy Layer, Data Collection Layer, Perception and Simulation Infrastructure Layer, and Robot Asset Yield Layer. To maintain objectivity, we have excluded projects that are clearly riding the hype or lack sufficient public information; if there are omissions, please feel free to point them out.

Model Intelligence Layer

OpenMind – Building Android for Robots (https://openmind.org/)

OpenMind is an open-source operating system (Robot OS) for embodied intelligence (Embodied AI) and robot control, aiming to build the world’s first decentralized robot operating environment and development platform. The core of the project includes two major components:

OM1: A modular open-source AI runtime (AI Runtime Layer) built on ROS2, used to orchestrate perception, planning, and action pipelines, serving both digital and physical robots;
FABRIC: A distributed coordination layer (Fabric Coordination Layer) that connects cloud computing power, models, and real robots, allowing developers to control and train robots in a unified environment.

The core of OpenMind is to act as an intelligent intermediary layer between LLMs (large language models) and the world of robots, allowing language intelligence to truly transform into embodied intelligence (Embodied Intelligence), building an intelligent skeleton from understanding (Language → Action) to alignment (Blockchain → Rules).

OpenMind’s multi-layer system forms a complete collaborative closed loop: humans provide feedback and annotations (RLHF data) through the OpenMind App; the Fabric Network handles identity verification, task allocation, and settlement coordination; and OM1 robots execute tasks and follow the on-chain “robot constitution” for behavior audits and payments. The result is a decentralized machine collaboration network that runs from human feedback → task collaboration → on-chain settlement.

Project Progress and Reality Assessment

OpenMind is at the early stage of “technology running, business not yet launched”. The core system, OM1 Runtime, has been open-sourced on GitHub, runs on multiple platforms, and supports multi-modal input; it achieves task understanding from language to action through a natural language data bus (NLDB), which is highly original but still experimental. The Fabric network and on-chain settlement have only completed interface-layer design.

In terms of ecosystem, the project has collaborated with open hardware platforms such as Unitree, UBTECH, and TurtleBot, and with institutions such as Stanford, Oxford, and Seoul Robotics, mainly for educational and research validation, with no industrial deployment yet. The app has launched a beta version, but the incentive and task functions are still at an early stage.

In terms of business model, OpenMind has built a three-layer ecosystem of OM1 (open-source system) + Fabric (settlement protocol) + Skill Marketplace (incentive layer). It currently has no revenue and relies on about $20 million in early financing (Pantera, Coinbase Ventures, DCG). Overall, the technology is leading, but commercialization and the ecosystem are still at an initial stage. If Fabric launches successfully, it could become the “Android of the embodied intelligence era”, but the cycle is long, the risks are high, and the project depends heavily on hardware.

CodecFlow – The Execution Engine for Robotics (https://codecflow.ai)

CodecFlow is a decentralized execution layer protocol based on the Solana network, aiming to provide on-demand operating environments for AI agents and robotic systems, allowing each intelligent agent to have an “Instant Machine”. The core of the project consists of three major modules:

Fabric: A cross-cloud computing aggregation layer (Weaver + Shuttle + Gauge), capable of generating secure virtual machines, GPU containers, or robot control nodes for AI tasks in seconds;
optr SDK: An agent execution framework (Python interface) for creating operable desktops, simulations, or real robots’ “Operators”;
Token Incentives: An on-chain incentive and payment layer connecting computing providers, intelligent agent developers, and automated task users, forming a decentralized computing power and task market.

The core goal of CodecFlow is to build a “decentralized execution base for AI and robot operators”, allowing any intelligent agent to run safely in any environment (Windows / Linux / ROS / MuJoCo / robot controllers), achieving a universal execution architecture from computing power scheduling (Fabric) → system environment (System Layer) → perception and action (VLA Operator).

Project Progress and Reality Assessment

Early versions of the Fabric framework (Go) and the optr SDK (Python) have been released and can launch isolated computing instances from web or command-line environments. The Operator market is expected to launch by the end of 2025. The project positions itself as a decentralized execution layer for AI computing, primarily serving AI developers, robotics research teams, and automation operation companies.

Machine Economy Layer

BitRobot – The World’s Open Robotics Lab (https://bitrobot.ai)

BitRobot is a decentralized research and collaboration network for embodied intelligence (Embodied AI) and robotics development, initiated by FrodoBots Labs and Protocol Labs. Its core vision is an open architecture of “subnets + incentive mechanisms + verifiable work (VRW)”, with core functions including:

Defining and verifying the true contribution of each robotic task through VRW (Verifiable Robotic Work) standards;
Endowing robots with on-chain identities and economic responsibilities through ENT (Embodied Node Token);
Organizing cross-regional collaboration of research, computing power, equipment, and operators through Subnets;
Achieving incentive decision-making and research governance of “human-machine co-governance” through Senate + Gandalf AI.

Since the release of its white paper in 2025, BitRobot has operated multiple subnets (such as SN/01 ET Fugi, SN/05 SeeSaw by Virtuals Protocol), achieving decentralized remote control and real-world data collection, and launched the $5M Grand Challenges fund to promote global model development research competitions.

peaq – The Economy of Things (https://www.peaq.network)

peaq is a Layer-1 blockchain designed for the machine economy, providing millions of robots and devices with machine identities, on-chain wallets, access control, and nanosecond-level time synchronization (Universal Machine Time) capabilities. Its Robotics SDK enables developers to make robots “machine economy ready” with minimal code, achieving interoperability and interaction across manufacturers and systems.

peaq has launched the world’s first tokenized robot farm and supports over 60 real-world machine applications. Its tokenization framework helps robotics companies raise funds for capital-intensive hardware and broadens participation from traditional B2B/B2C to the wider community. A protocol-level incentive pool funded by network fees lets peaq subsidize the onboarding of new devices and support developers, forming an economic flywheel that accelerates the expansion of robotics and physical AI projects.

Data Collection Layer

This layer aims to solve the scarcity and high cost of high-quality real-world data for embodied intelligence training. It collects and generates human-machine interaction data through several paths, including remote control (PrismaX, BitRobot Network), first-person video and motion capture (Mecka, BitRobot Network, Sapien, Vader, NRN), and simulation and synthetic data (BitRobot Network), providing a scalable and generalizable training foundation for robot models.

It should be noted that Web3 is not good at “producing data”: in hardware, algorithms, and collection efficiency, Web2 giants far exceed any DePIN project. Its real value lies in reshaping data distribution and incentive mechanisms. Built on “stablecoin payment networks + crowdsourcing models”, it enables low-cost micropayments, contribution traceability, and automatic profit sharing through permissionless incentive systems and on-chain rights confirmation. However, open crowdsourcing still struggles to close the loop on quality and demand: data quality varies, and both effective verification and stable buyers are lacking.
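To make the “automatic profit sharing” idea concrete, here is a minimal sketch that splits one buyer payment among contributors in proportion to their accepted work. The function name, the integer weights (for example, seconds of data that passed verification), and the rule for distributing rounding leftovers are all assumptions made for illustration; they do not describe any specific project’s settlement logic.

```python
def split_payout(payment_units: int, accepted_work: dict[str, int]) -> dict[str, int]:
    """Split an integer payment pro rata by each contributor's accepted work.

    payment_units: amount in the smallest unit of a (hypothetical) stablecoin.
    accepted_work: contributor id -> non-negative integer weight, for example
                   seconds of data that passed quality checks.
    """
    if payment_units < 0:
        raise ValueError("payment must be non-negative")
    if any(weight < 0 for weight in accepted_work.values()):
        raise ValueError("weights must be non-negative")
    total = sum(accepted_work.values())
    if total == 0:
        raise ValueError("no accepted work to reward")

    # Integer floor division avoids floating-point rounding drift.
    shares = {who: payment_units * w // total for who, w in accepted_work.items()}

    # Flooring can leave a few units undistributed; give one unit each to the
    # largest contributors so the shares always sum to the full payment.
    leftover = payment_units - sum(shares.values())
    ranked = sorted(accepted_work, key=accepted_work.get, reverse=True)
    for who in ranked[:leftover]:
        shares[who] += 1
    return shares
```

Calling `split_payout(1000, {"operator_a": 3, "operator_b": 1})` assigns three quarters of the payment to `operator_a`. Working in integer units mirrors how token amounts are usually represented and guarantees that no value is created or lost in the split.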

PrismaX (https://gateway.prismax.ai)

PrismaX is a decentralized remote control and data economy network for embodied intelligence (Embodied AI), aiming to build a “global robot labor market” where human operators, robotic devices, and AI models co-evolve through an on-chain incentive system. The core of the project includes two major components:

Teleoperation Stack—a remote control system (browser/VR interface + SDK) connecting global robotic arms and service robots, enabling real-time human control and data collection;
Eval Engine—a data evaluation and verification engine (CLIP + DINOv2 + optical flow semantic scoring), generating quality scores for each operation trajectory and settling on-chain.
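As a rough sketch of how several evaluation signals could be folded into one quality score per trajectory, the example below takes three already-normalized sub-scores and returns their weighted average. It does not run CLIP, DINOv2, or optical flow; the field names, the [0, 1] normalization, and the weights are hypothetical and are not taken from PrismaX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrajectorySignals:
    """Per-trajectory sub-scores, each assumed to be normalized to [0, 1]."""
    semantic_match: float      # e.g. text-video agreement from a CLIP-style model
    visual_consistency: float  # e.g. frame-feature stability from a DINOv2-style model
    motion_smoothness: float   # e.g. a statistic derived from optical flow


# Hypothetical weights chosen for illustration; they sum to 1.
WEIGHTS = {
    "semantic_match": 0.4,
    "visual_consistency": 0.3,
    "motion_smoothness": 0.3,
}


def quality_score(signals: TrajectorySignals) -> float:
    """Return the weighted average of the sub-scores, in [0, 1]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = getattr(signals, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        total += weight * value
    return total
```

A score like this could then be compared against an acceptance threshold before a trajectory is rewarded; where that threshold sits is a policy choice outside the scope of this sketch.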

PrismaX transforms human operational behavior into machine learning data through a decentralized incentive mechanism, constructing a complete closed loop from remote control → data collection → model training → on-chain settlement, achieving a circular economy of “human labor as data assets”.

Project Progress and Reality Assessment

PrismaX launched a beta version (gateway.prismax.ai) in August 2025, allowing users to remotely control robotic arms to perform grasping experiments and generate training data. The Eval Engine is already running internally. Overall, PrismaX shows a high degree of technical realization and clear positioning, serving as a key intermediary layer connecting “human operation × AI model × blockchain settlement”. In the long term it could become a “decentralized labor and data protocol of the embodied intelligence era”, but in the short term it still faces scaling challenges.

BitRobot Network (https://bitrobot.ai/)

BitRobot Network collects multi-source data such as video, remote control, and simulation through its subnets. SN/01 ET Fugi lets users remotely control robots to complete tasks, collecting navigation and perception data through a “real-life Pokémon Go-style” interaction. This approach produced the FrodoBots-2K dataset, one of the largest open-source human-robot navigation datasets to date, which is used by institutions such as UC Berkeley RAIL and Google DeepMind. SN/05 SeeSaw (Virtuals Protocol) crowdsources first-person video data at scale in real environments through iPhones. Other announced subnets, such as RoboCap and Rayvo, focus on collecting first-person video data using low-cost physical devices.

Mecka (https://www.mecka.ai)

Mecka is a robotics data company that crowdsources first-person video, human motion data, and task demonstrations through gamified mobile collection and custom hardware devices to build large-scale multi-modal datasets supporting the training of embodied intelligence models.

Sapien (https://www.sapien.io/)

Sapien is a crowdsourcing platform centered on “human motion data driving robot intelligence”, collecting human action, posture, and interaction data through wearable devices and mobile applications for training embodied intelligence models. The project aims to build the world’s largest human motion data network, making human natural behavior the foundational data source for robot learning and generalization.

Vader (https://www.vaderai.ai)

Vader collects first-person video and task demonstrations through its real-world MMO application EgoPlay: users record daily activities from a first-person perspective and receive VADER rewards. Its ORN data pipeline converts raw POV footage into privacy-processed structured datasets, including action labels and semantic descriptions, which can be used directly for humanoid robot policy training.

NRN Agents (https://www.nrnagents.ai/)

NRN Agents is a gamified embodied RL data platform that crowdsources human demonstration data through browser-based robot control and simulation competitions. NRN generates long-tail behavior trajectories through “competitive” tasks for imitation learning and continual reinforcement learning, serving as scalable data primitives that support sim-to-real policy training.

Comparison of Embodied Intelligence Data Collection Layer Projects

Perception and Simulation (Middleware & Simulation)

The perception and simulation layer provides the core infrastructure connecting the physical world with intelligent decision-making, including capabilities for localization, communication, spatial modeling, and simulation training, forming the “intermediate skeletal structure” for building large-scale embodied intelligence systems. Currently, this field is still in the early exploration stage, with various projects forming differentiated layouts in high-precision localization, shared spatial computing, protocol standardization, and distributed simulation, without a unified standard or interoperable ecosystem emerging yet.

Middleware and Spatial Infrastructure (Middleware & Spatial Infra)

The core capabilities of robots—navigation, localization, connectivity, and spatial modeling—constitute the key bridge connecting the physical world with intelligent decision-making. Although broader DePIN projects (Silencio, WeatherXM, DIMO) have begun to mention “robots”, the following projects are most directly related to embodied intelligence.

RoboStack – Cloud-Native Robot Operating Stack (https://robostack.io)

RoboStack is a cloud-native robot middleware that achieves real-time scheduling, remote control, and cross-platform interoperability of robot tasks through RCP (Robot Context Protocol), providing cloud simulation, workflow orchestration, and agent access capabilities.

GEODNET – Decentralized GNSS Network (https://geodnet.com)

GEODNET is a global decentralized GNSS network providing centimeter-level RTK high-precision positioning. Through distributed base stations and on-chain incentives, it offers real-time “geographic reference layers” for drones, autonomous driving, and robots.

Auki – Posemesh for Spatial Computing (https://www.auki.com)

Auki has built a decentralized Posemesh spatial computing network, generating real-time 3D environmental maps through crowdsourced sensors and computing nodes, providing shared spatial references for AR, robot navigation, and multi-device collaboration. It is a key infrastructure connecting virtual spaces with real scenes, promoting the integration of AR × Robotics.

Tashi Network — Robot Real-Time Mesh Collaboration Network (https://tashi.network)

Tashi Network is a decentralized real-time mesh network that achieves sub-30ms consensus, low-latency sensor exchange, and multi-robot state synchronization. Its MeshNet SDK supports shared SLAM, group collaboration, and robust map updates, providing a high-performance real-time collaboration layer for embodied AI.

Staex — Decentralized Connectivity and Telemetry Network (https://www.staex.io)

Staex is a decentralized connectivity layer originating from the German telecommunications R&D department. It provides secure communication, trusted telemetry, and device-to-cloud routing, enabling robot fleets to exchange data and collaborate reliably across different operators.

Simulation and Training Systems (Distributed Simulation & Learning)

Gradient – Towards Open Intelligence (https://gradient.network/)

Gradient is an AI laboratory building “Open Intelligence”, dedicated to distributed training, inference, validation, and simulation on decentralized infrastructure. Its current tech stack includes Parallax (distributed inference), Echo (distributed reinforcement learning and multi-agent training), and Gradient Cloud (AI solutions for enterprises). On the robotics side, the Mirage platform provides distributed simulation, dynamic interactive environments, and large-scale parallel learning for embodied intelligence training, accelerating the deployment of world models and general-purpose policies. Mirage is exploring a potential collaboration with NVIDIA around its Newton engine.

Robot Asset Yield Layer (RobotFi / RWAiFi)

This layer focuses on transforming robots from “productive tools” into “financializable assets”, constructing the financial infrastructure of the machine economy through asset tokenization, yield distribution, and decentralized governance. Representative projects include:

XmaquinaDAO – Physical AI DAO (https://www.xmaquina.io)

XMAQUINA is a decentralized ecosystem that gives global users liquid exposure to leading humanoid robotics and embodied intelligence companies, bringing on-chain opportunities that were previously available only to venture capital institutions. Its token DEUS serves as both a liquidity index asset and a governance vehicle for coordinating treasury allocation and ecosystem development. Through the DAO Portal and Machine Economy Launchpad, the community can take part in the tokenization of machine assets and in structured on-chain participation, jointly owning and supporting emerging Physical AI projects.

GAIB – The Economic Layer for AI Infrastructure (https://gaib.ai)

GAIB aims to provide a unified economic layer for GPU and robotic entities, connecting decentralized capital with real AI infrastructure assets, constructing a verifiable, composable, and yield-generating intelligent economic system.

On the robotics side, GAIB does not “sell robot tokens”. Instead, it brings robot devices and operational contracts (RaaS, data collection, remote operation, etc.) on-chain as financial assets, transforming real cash flows into on-chain composable yield assets. This system covers hardware financing (finance leasing/staking), operational cash flow (RaaS/data services), and data flow revenue (licensing/contracts), making robot assets and their cash flows measurable, priceable, and tradable.

GAIB uses AID/sAID as its settlement and yield vehicle and relies on structured risk control mechanisms (over-collateralization, reserves, and insurance) to keep returns robust. Over the long term it plans to connect to DeFi derivatives and liquidity markets, forming a financial closed loop from “robot assets” to “composable yield assets”. The goal is to become the economic backbone of intelligence.
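Over-collateralization, one of the risk controls named above, reduces to simple arithmetic: the value backing an asset must exceed the amount issued against it by a minimum ratio. The sketch below illustrates that check in integer units. The function names and the use of basis points are assumptions for illustration only, and the example ratio is not a GAIB parameter.

```python
BPS = 10_000  # basis points per 100%


def is_overcollateralized(collateral_value: int, liability: int, min_ratio_bps: int) -> bool:
    """True if collateral / liability >= min_ratio_bps / 10_000 (integer-safe)."""
    if collateral_value < 0 or liability < 0:
        raise ValueError("values must be non-negative")
    # Cross-multiplied to avoid division and floating-point error.
    return collateral_value * BPS >= liability * min_ratio_bps


def max_issuance(collateral_value: int, min_ratio_bps: int) -> int:
    """Largest liability that still satisfies the minimum collateral ratio."""
    if collateral_value < 0:
        raise ValueError("collateral must be non-negative")
    if min_ratio_bps <= BPS:
        raise ValueError("over-collateralization requires a ratio above 100%")
    return collateral_value * BPS // min_ratio_bps
```

With a hypothetical minimum ratio of 150% (`15_000` basis points), collateral worth 1,500,000 units supports at most 1,000,000 units of issued claims.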

Summary and Outlook: Real Challenges and Long-Term Opportunities

From a long-term vision perspective, the integration of robotics × AI × Web3 aims to construct a decentralized machine economy system (DeRobot Economy), promoting embodied intelligence from “standalone automation” to “verifiable, settleable, and governable” networked collaboration. Its core logic is to form a self-circulating mechanism through “Token → Deployment → Data → Value Redistribution”, enabling robots, sensors, and computing nodes to achieve rights confirmation, transactions, and profit sharing.

However, from the current stage, this model is still in the early exploration phase, far from forming stable cash flows and scalable business closed loops. Most projects remain at the narrative level, with limited actual deployment. Robotics manufacturing and operation are capital-intensive industries, and relying solely on token incentives is insufficient to support infrastructure expansion; while on-chain financial designs have composability, they have yet to solve the risk pricing and yield realization issues of real assets. Therefore, the so-called “machine network self-circulation” remains idealized, and its business model requires real-world validation.

The Model Intelligence Layer (Model & Intelligence Layer) is currently the most valuable long-term direction. Open-source robot operating systems, represented by OpenMind, attempt to break closed ecosystems and unify multi-robot collaboration and language-to-action interfaces. The technical vision is clear and the system design is complete, but the engineering workload is enormous, the verification cycle is long, and industry-level positive feedback has yet to form.
The Machine Economy Layer is still in the preparatory stage. The number of robots deployed in the real world is limited, so DID identity and incentive networks still struggle to form a self-sustaining cycle, and a “machine labor economy” remains far off. Only when embodied intelligence reaches large-scale deployment will the economic effects of on-chain identity, settlement, and collaboration networks truly manifest.
The Data Collection Layer (Data Layer) has the lowest barrier to entry and is currently the closest to commercial viability. Data collection for embodied intelligence requires high spatiotemporal continuity and accurate action semantics, which determine the data’s quality and reusability. Balancing “crowdsourcing scale” against “data reliability” is the core challenge of the industry. PrismaX’s approach of first locking in B-end demand and then distributing tasks for collection and verification offers a partly replicable template, but ecosystem scale and data trading will take time to build.
The Perception and Simulation Layer (Middleware & Simulation Layer) is still in the technical verification phase. It lacks unified standards and interfaces and has yet to form an interoperable ecosystem. Sim2Real efficiency is limited, and simulation results are difficult to transfer to real environments in a standardized way.
In the Asset Yield Layer (RobotFi / RWAiFi), Web3 mainly plays a supporting role in supply chain finance, equipment leasing, and investment governance, improving transparency and settlement efficiency rather than reshaping industrial logic.

Even so, we believe that the intersection of robotics × AI × Web3 represents the starting point of the next generation of intelligent economic systems. It is not only a fusion of technical paradigms but also an opportunity to reconstruct production relations: when machines possess identity, incentives, and governance mechanisms, human-machine collaboration will shift from localized automation to networked autonomy. In the short term, this direction remains primarily narrative and experimental, but the institutional and incentive frameworks it establishes are paving the way for the economic order of future machine societies. In the long term, the combination of embodied intelligence and Web3 will reshape the boundaries of value creation, making intelligent agents truly verifiable, collaborative, and yield-generating economic entities.

Disclaimer: As a blockchain information platform, the articles published on this site represent only the personal views of the authors and guests and do not reflect the position of Web3Caff. The information in the articles is for reference only and does not constitute any investment advice or offer. Please comply with the relevant laws and regulations of your country or region.
