Reflections and Practices on Physical AI Simulation and the Application of Robots in Industrial Scenarios

In the current wave of transformation driven by general intelligence, science, technology, productivity, and development models are undergoing comprehensive and fundamental changes. At the forefront of technological diffusion, entrepreneurship is one of the most efficient paths to drive innovation.

We will periodically interview young founders accelerated by Qiji Entrepreneurship Camp to understand the worldview of technology entrepreneurs, their engineering practice experiences, product innovation attempts, and explorations of business applications from their perspectives. More importantly, we will share their stories and experiences during the early stages of their entrepreneurial journeys.

In this article, we chatted with Nie Kaixuan, the founder of Songying Technology, an alumnus of the Qiji 2024 Autumn Entrepreneurship Camp. He systematically shared his thoughts and practices on “how AI can truly land in the physical world,” including the trend of industrial automation, the underlying logic of AI landing in the physical world, the limitations of the VLA route, and how to choose the right entry point and survive crises when resources are limited.

The outline of this article:

  • Choosing the direction of entrepreneurship: Is NVIDIA Omniverse a real opportunity or just storytelling?

  • The first step in vision practice: Choosing scenarios, from “what can be done” to “what is desired”

  • Physical simulation and VLA: Speculations on the path of AI entering the physical world

  • Robot simulation landing: Multidimensional challenges of computational accuracy, perception, and rendering

  • Choices in crises: Shedding noise and returning to faith after traversing the darkest moments

  • Advice for young technology entrepreneurs

If you enjoy this series and would like to see more cutting-edge entrepreneurship content, please let us know (tap the little heart at the end of the article).

On March 10, 2023, Nie Kaixuan hit the darkest moment since founding his company. Silicon Valley Bank collapsed, and Songying Technology, established only a year and a half earlier, had several million dollars of funding frozen. Just the day before, the company had opened a new office in Shenzhen and announced expansion plans.

“Can the team still hold on? Can the company keep operating? Where do we find money?” A string of questions followed. After failing to get help from existing investors, Nie Kaixuan went three days and nights without sleep, and even began to rehearse the worst-case scenario: shutting the company down.

Choosing to exit at that point would have been understandable. But one thought, “I can’t let go; I need an answer,” prompted him, usually reluctant to ask for money, to call two friends and raise enough “lifeline money” to last six months. Fortunately, over the following two months, the Federal Reserve and the U.S. Treasury took over Silicon Valley Bank and eventually returned all deposits.

Nie Kaixuan is not the gifted entrepreneur people often imagine: he is not a star researcher from a top laboratory, has no papers at prestigious conferences, and did not become famous for breaking through the security defenses of major companies’ models. Talking with him, one senses a rare temperament: calm, restrained, meticulous in his thinking, reasoning layer by layer like an engineer and building a rigorous, clear logical system in front of you.
This temperament comes not only from those three days of facing “life and death” but also from over a decade of honing in the industry. Starting with “marginal” technology at foreign companies, then joining Huawei, he went from a junior R&D engineer and architect to deputy commander and general manager, and from obscurity to participating in and leading multiple 0-to-1 product projects such as Huawei Cloud Phone, Cloud Gaming, and GPU products. Nie Kaixuan says frankly that his years at Huawei taught him for the first time how a core technology team learns, grows, and drives R&D and iteration through the market. They also made him truly believe that “whatever the U.S. can do, a Chinese team can do too.”

That belief is the starting point of Songying and of his determination to build a Chinese counterpart to the Omniverse physical AI simulation system, and it has supported his strategic resolve through crises, external pessimism, and even internal doubt.

In his view, the trend toward industrial automation and an intelligent physical world is irreversible. The future will require a new kind of infrastructure to plan, simulate, control, and manage a vast physical system composed of countless AIs and robots. He believes that for AI to truly enter workshops, fly at low altitudes, and integrate into daily life, it must first learn the “language of physics”: accurately simulating complex laws such as gravity, collision, material deformation, and sensor feedback, then improving solver accuracy to achieve “physical correctness” and optimizing computational efficiency.

To realize this vision, Songying’s choice is to build a robot “playground” out of mathematics and code, letting robots learn to couple with the physical environment in a high-precision simulation and translate what they learn directly into action, rather than relying on language descriptions to understand tasks.
This path is difficult and still in its early stages, but Nie Kaixuan says, “It’s worth a try.” Songying’s self-developed physical AI simulation system, ORCA, has already taken a key step down this path, and the company has established deep cooperation with dozens of central state-owned enterprises, research institutes, embodied-intelligence model companies, and leading GPU manufacturers.

At the end of the interview, we asked him for a piece of advice for young technology entrepreneurs. He smiled and said, “Eric (co-founder and CFO Guo Siqiang) brought me to Qiji, and I was shocked to see so many young faces. Their courage and knowledge are greater than mine; the only thing I may have over them is experience in ‘managing upward.’ Opportunities always belong to the younger generation.” He paused and added, “Some problems, if faced earlier, might not be problems today.”

Having worked for ten years before starting a business, Nie Kaixuan feels he was “not brave enough.” But looking back at his choices: facing doubt, risk, and worst-case scenarios, he never flinched. Isn’t that a form of courage?

Below is the dialogue between Qiji and Nie Kaixuan, edited:

Choosing the direction of entrepreneurship:

Is NVIDIA Omniverse a real opportunity or just storytelling?

Qiji: From the beginning, Songying Technology aimed to create a Chinese Omniverse, which is a grand goal.

Nie Kaixuan: When Jensen Huang first proposed the macro concept of intelligence in the physical world, it did feel very ethereal, so much so that people thought it could only be storytelling. The physical world Omniverse aims to construct spans everything from autonomous driving to warehousing and logistics, processes that will become highly intelligent in the service of humanity. At the time, my project at Huawei had been interrupted by supply-chain issues. Unable to work on hardware chips, I wanted to see whether the upper-layer software could be built, so I spent a long time studying Omniverse, including its strategy, product architecture, and the relationships between its modules.

How do you build such a complex system? Today there are many ways to describe the physical world, and various algorithms simulate its phenomena: different lighting, acceleration, physical properties, temperature, humidity, and so on. These data come from different teams and different software and hardware, with no unified standard. The first step Jensen Huang proposed was to unify the data, using USD (Universal Scene Description) files to represent the physical world and unify the discrete software, hardware, and management systems that describe it.

Qiji: The concept of USD was also quite fresh at that time. How did you reason it through?

Nie Kaixuan: USD is the key reason I believed Jensen Huang was not just talking. After careful study, I found that USD has the potential to become the language for precisely describing the 3D physical world. Its design is very clever, built on a layered rather than packaged concept.

For example, to describe a pair of headphones, one might say they are gray, oval-shaped, and can be opened; these descriptions can be layered on top of one another. As the scene expands, the representations of other objects in the environment can also be layered without interference. It is like how the Qin state unified the country by first unifying the script: no matter how large the scene, it can ultimately be described clearly. NVIDIA also did a lot of work at the time, collaborating with major platform providers and industrial design-software companies such as Apple, Pixar, and Autodesk to establish this data standard.

So under this grand vision, Huang described a path, and with USD as the starting point we were willing to give it a try, deepening our understanding and exploration along this direction. Even if there were many technical challenges to solve, and even if we and NVIDIA might both end up on the wrong path, it at least represented a seemingly feasible technical route. That period was actually a relatively happy one for us, because we had a benchmark to guide us across the river.

Qiji: Following the Omniverse vision, what opportunities did you see that truly drove you to start a business in this area?

Nie Kaixuan: The opportunity to push AI from the internet into the physical world is immense. NVIDIA’s Omniverse aims to extend its reach into territory it has never entered: the physical industry. Those places have never bought NVIDIA GPUs; a large factory might have an IT department of two people, two servers, and two switches.
A factory can generate several hundred million in output annually, yet its IT equipment may not be replaced for five years. It has therefore been difficult for IT to create value in the industrial sector, and Jensen Huang could only sell to the internet.

The internet has indeed made him a lot of money, and large-model applications still live in the internet space: chatbots, writing articles, generating images. But the physical industry is twenty to thirty times larger than the internet, and it has a long-term trend toward automation.

In the early days, factory assembly lines and processes were designed for humans to operate and refine. Now, in many factories, humans are a rare element, and some factories have almost no humans at all. The problem is that such a vast industrial system, and the physical world at large, ultimately relies on machines to complete tasks. Who will handle planning, design, simulation, scheduling, management, and optimization? In the past it was factory managers and workshop directors; in the future it will be physical AI systems. Who will play this role? At the time there was no one, and Jensen Huang saw this market gap, a market and direction I also agree with.

Qiji: But the process from digitalization and intelligence to automation is a long one.

Nie Kaixuan: It is indeed a long, gradual process, but it is irreversible. Suppose a production line has 600 robotic arms plus hundreds of transport robots, over a thousand machines in total. Such a complex, precise system cannot be programmed and flexibly scheduled by a single computer team; an AI system must take over.

When we talk with manufacturing business owners, they are all thinking about efficiency, and their assembly lines are highly automated. In gradually validating the direction, we have indeed discovered many real needs from the industry.
From a human perspective, workers on the production line face enormous pressure, doing repetitive, high-intensity work every day. Omniverse and ORCA are essentially doing the same thing: gradually transforming originally discrete industrial systems, including human-machine hybrid systems, into a production logic controlled entirely by systems and robots. At least for now, that direction will not be overturned.

Many factories today are designed around humans, leaving a lot of shared space for walking, loading, and unloading; the future may not require this. From an efficiency perspective, the entire factory will need to be redesigned; the future may not be a long assembly line but a multi-story building. That building can be seen as one large robot that adjusts its layout and defines many products as needed. Even if it is still an assembly line, it may run vertically rather than horizontally. It may take another ten to twenty years to get there.
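The layered, non-destructive composition Nie credits to USD earlier can be caricatured in a few lines of plain Python. This is purely an illustrative sketch of the layering idea, not the actual pxr/USD API: each layer contributes partial opinions about an object, and stronger layers override weaker ones attribute by attribute without destroying them.

```python
# Toy sketch of layer-based composition, in the spirit of USD's "layered,
# not packaged" design (illustration only; real USD is NVIDIA/Pixar's pxr.Usd).
def compose(*layers):
    """Later (stronger) layers override earlier ones, attribute by attribute."""
    scene = {}
    for layer in layers:
        for prim, attrs in layer.items():
            scene.setdefault(prim, {}).update(attrs)
    return scene

# Each team contributes its own layer without touching the others' data.
base      = {"headphones": {"shape": "oval", "color": "black"}}
art_dept  = {"headphones": {"color": "gray"}}   # overrides the color only
animation = {"headphones": {"state": "open"}}   # adds a new attribute

print(compose(base, art_dept, animation))
# {'headphones': {'shape': 'oval', 'color': 'gray', 'state': 'open'}}
```

The point of the sketch is that no layer ever rewrites another: the art department's gray headphone opinion sits on top of the base description, and removing that layer restores the original, which is what makes collaboration without data loss possible.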

The first step in vision practice:

Choosing scenarios, from “what can be done” to “what is desired”

Qiji: After aiming at the grand vision of Omniverse, how did you choose the first scenario to validate?

Nie Kaixuan: The system is large, with many modules, and has to be built step by step. The first thing we developed was high-definition rendering plus 3D collaborative data design, based on the USD data language, achieving high-speed interoperability between multiple software packages.

For example, within game, film, and animation teams there are different roles: artists, engine developers, motion-capture specialists. Collaboration among them is hard because they use different software and systems, which often leads to data loss on import and, in turn, rework. In industrial digital system design, many companies along the industry chain are trying to solve this problem, but the processes are so long that unification is difficult, and a lot of cost is consumed by frequent rework. If teams avoid the rework by simply deleting the lost data, product quality drops and user feedback suffers.

Our USD-based data collaboration pipeline aimed to connect the various kinds of industrial software, scheduling systems, and data sources. Different roles can then work at the same level without losing data, which is very valuable to them.

Qiji: Two years ago, the business scenario shifted to robots. What changes did you experience in between?

Nie Kaixuan: Initially, revenue from the data collaboration business was not high, but our thinking was that if we could not raise funds, we needed a scenario that could sustain us, bring in enough income to support the team’s R&D, and let us aim for higher goals. We prepared contingency plans and developed several clients, receiving positive feedback. Without such a guaranteed scenario, team stability and investor confidence would suffer.

In fact, every company faces three circles: “what it wants to do,” “what it should do,” and “what it can do.” Often, what you have and what you can do matters most. In June 2023, we first applied the system to robot-related business, because the system was not yet complete at the time. During that period, if we had not found a profitable business, what would have happened if investor funds ran out, or if financing visibly bottomed out within the next 6 to 12 months?

Qiji: How did you expand into this new scenario with robots?

Nie Kaixuan: At the time of the transition two years ago, humanoid robots were not yet popular. We were working on simulations for factory and warehouse-logistics AGVs, AMRs, and collaborative robotic arms. The rise of humanoid robots in 2024 was indeed a variable; the development of AI, especially large models, raised everyone’s expectations. Otherwise, according to our plan, we would have moved gradually from AGVs and AMRs to single arms, then dual-arm collaboration, and possibly multi-arm collaboration in the future.

After humanoid robots became popular, everyone began asking, “Can your system be used for humanoid robots? Can you create a demo for me to see?” People believed humanoid robots were the most challenging because of their high degrees of freedom and many joints, so demos of other industrial robots would not have showcased our technical level. In reality, industrial systems are more complex; we opened only a small module of the entire system to humanoids, and that was what familiarized the market with our technology.

Qiji: From your perspective, how do you understand that industrial systems are more complex than humanoid robots?

Nie Kaixuan: A factory is essentially a massive systems-engineering project. Producing a complex product involves very intricate processes, and large factories like BYD, Foxconn, and TSMC run on a whole body of systems engineering. Robots may be only a very small part of that system, and they ultimately have to return to it for scheduling and management. Robots are engineering too, but they have far fewer components than cars. Even though technologies such as integrated die-casting have reduced the number of components in cars, a car used to have thousands of parts, while a robot has only 200-odd components, including screws, motors, reducers, and joints.

The systems-engineering theory proposed by Qian Xuesen is about organizing and managing raw materials, equipment, and personnel so that they form an organic system. The output of one system can serve as input to the next, processed and coordinated layer by layer, ultimately forming a large, efficient whole that achieves complex goals. Building a nuclear power plant, for example, involves tens of thousands of modules; assembling them in an orderly way so they work together is crucial. Many systems are tightly coupled, and a problem in any one link can affect the whole.

Teams capable of handling large systems-engineering projects are extremely rare. Many people can develop applications or even algorithms, but can they create a HarmonyOS? So far only Huawei has done it in China; everyone else is merely doing secondary development on Android.

Physical Simulation and VLA:

Speculations on the Path of AI Entering the Physical World

Qiji: Is there a different requirement for simulation technology when shifting from game scenarios to robots?

Nie Kaixuan: One of the most challenging aspects is the audience. In game development we focused on human-machine interaction, centered on humans. In simulation training for robots, or in industrial simulation systems, the audience is machines: it is essentially a system established through communication between machines, and the architecture and logic of machine-to-machine interaction differ from those of human-machine interaction.

So we often say this system is essentially a game designed for robots. They learn to perceive the environment through their peripherals, to understand, think, and acquire new skills through interaction, and then iterate repeatedly. In other words, we are turning the training of humans into the training of machines. This game does not need a screen; it suffices that the sensors can see.

Since the audience has changed so much, the thinking must shift too. Previously, the human-machine interface had to be simple enough, with clear use cases, and the images humans saw had to be realistic enough to fool the naked eye. But now, how do we “fool” robots equipped with LiDAR, cameras, and other sensors? The technical solutions and methods have changed. It is a systemic change, and it poses significant challenges for the team.

Qiji: When switching to the robot scenario for simulation, what had to be rethought from the ground up?

Nie Kaixuan: Unlike games, where basic interaction and good graphics can sell, robots do not require these; instead, we need to assess the “realism” of the scene.
This realism is reflected in accuracy: whether there are tables, chairs, and benches in the environment, and how precise their dimensions and positions are.

Robots carry sensors such as LiDAR to model and depict the environment; they do not need to see a world identical to the one humans see. In fact, what humans see is not the real world either. Most objects do not emit light; what we see is the result of reflection, projection, and processing by our sensory systems. Seeing is not necessarily believing; it depends on the light-sensing elements used. Humans perceive light through the lens and retina, while robots use different light-sensing devices, so the world they “see” will naturally differ, and the information and data they receive will change accordingly.

Our task is to serve robots by building a physical world they can understand and perceive through their sensors. From this they generate a “portrait” of the world, which will inevitably differ from what humans see.

However, we are not yet at this stage; many existing technologies still follow past paths, merely adding new means. For example, the visual system is still needed: people capture images through the high-definition RGB cameras on robots and then use that data to train VLA models, which essentially requires machines to translate images into language before comprehending them. But the final robots may not recognize the world this way, so VLA cannot yet be considered the ultimate solution for robot perception.

Qiji: What is your view on VLA, and why do you not consider it the ultimate solution for robot perception of the world?

Nie Kaixuan: VLA itself is inefficient. If you have seen VLA experiments, the actions are very slow. Why is there an L (language) in the middle?
It is because current language models are doing well, so an extra layer is added to translate what is seen into language and then into actions. Humans do not do this; we recognize objects at a glance without translation.

The key issue is that we have not truly understood what robots need and how they should perceive the world. We tend to imagine based on human experience: “humans understand the world through vision,” so robots should too. But in reality we still cannot make vision translate directly into action, so we describe visual content through language, which is then converted into action commands for robots to execute.

This is essentially a compromise made when technology cannot reach the ideal goal, which is why paths like VLA exist. In my view, VLA is not an ideal solution: it is not only slow, but language itself is not a precise way to describe the physical world. Humans can construct an imagined scene, even simulate future scenarios, from very rough language descriptions; human imagination is very strong, but robots and today’s AI models do not yet possess such capabilities.

Qiji: Do you think a simulator can achieve direct translation from vision to action?

Nie Kaixuan: The reason we are pushing to build a multi-physics simulator is to use mathematical, solver-based methods to give robots a simulated capability to couple with the environment, a capability that can then translate directly into physical action rather than passing through an intermediate layer of language description. This is still a work in progress; I have only thought of a pathway that may not be correct, and we may not achieve what NVIDIA or others have accomplished.

I studied physics myself, so I tend to think more in physical terms; by contrast, my understanding of language models is not deep.
Generative large models are built on probabilistic solving, while the physical world, especially industrial production, is a high-precision computational process that cannot tolerate hallucinations. Approaches like VLA take clear, precise information, convert it into a model that hallucinates, and then use that model to manipulate something that requires precision. In my view this is a dimensional reduction rather than an elevation; it not only fails to meet the expected goals but may lead us further astray. Of course, science is never a straight line; it always involves detours and even wrong directions, and that is necessary. At least it lets us glimpse the embryonic form of a future self-operating environment.

Qiji: AI entering the physical world has not yet converged on a technical route. What is your view on the potential of the different routes?

Nie Kaixuan: The technical paths have indeed not converged, and there are opportunities in each. Differentiable simulation, for example: Dr. Zhou’s Genesis project is attempting this new approach. Judging from the publicly released articles and code, although there is still a distance to the target effect, it does describe a path achieved through generative and differentiable methods.

Each technical path has its own characteristics, but they share one constraint: they cannot deviate from the laws of physics. Whatever the approach, whether differentiable, generative, pure computational simulation, or pure image-based simulation like Sora, the first task is to obey the laws of physics. The next is to pursue higher solver accuracy, i.e., physical correctness; the one after that is efficiency, improving computational performance. Only on the basis of these three points can a new technology have a chance of being applied in the physical world; otherwise, artificial intelligence will find it difficult to enter the physical world.
This poses a significant challenge to the underlying architecture.
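Nie’s point that language is a lossy intermediate can be caricatured in a few lines. This is a toy sketch, with all function names and values invented for illustration (it is not any real VLA system): routing a percept through a language description discards the numeric precision that a direct vision-to-action coupling retains.

```python
# Toy contrast between two pipelines (invented example, not a real VLA system):
#   A: vision -> language description -> action  (the VLA-style detour)
#   B: vision -> action directly                 (the coupling Nie argues for)

def perceive():
    # A simulated sensor reading with precise coordinates.
    return {"object": "cup", "x": 0.42, "y": 0.17}

def describe(percept):
    # The language layer: precise numeric state collapses into vague words.
    return f"a {percept['object']} roughly near the table edge"

def act_from_language(text):
    # The controller must re-guess coordinates from the vague description.
    return ("reach", 0.4, 0.2)  # coarse estimate reconstructed from words

def act_from_percept(percept):
    # Direct coupling keeps the full numeric state end to end.
    return ("reach", percept["x"], percept["y"])

p = perceive()
print(act_from_language(describe(p)))  # ('reach', 0.4, 0.2)  - precision lost
print(act_from_percept(p))             # ('reach', 0.42, 0.17)
```

The extra `describe` hop also adds a full model inference to every action cycle, which is one way to read the slowness Nie observes in VLA demonstrations.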

Robot Simulation Landing:

Multidimensional Challenges of Computational Accuracy, Perception, and Rendering

Qiji: The challenge of sim-to-real is often mentioned. What are the difficulties?

Nie Kaixuan: Essentially, each controller and machine has its own solving efficiency and accuracy. The causes of the sim-to-real problem are numerous; it is not just that the simulation’s solving is unrealistic. For example, if the robot’s gait is set to A in simulation, the actual walking may deviate. The deviation may stem from the robot’s CPU being too slow, so the solving speed is inadequate. Even replacing it with a high-frequency CPU may not help, because the motor’s response may also be too slow, or the motor may overheat when driving heavy loads at high speed, degrading its performance and reducing its speed.

Essentially this is a multi-physics coupling relationship. To eliminate the gap, the simulation must try to model all the elements in the robot’s body. As fine-tuning and data accuracy improve, the reference dimensions increase and the gap gradually narrows. Simulation cannot completely replace reality; it is a very good efficiency boost and supplement, but final fine-tuning still has to be tested on the physical robot. That does not make simulation useless: the system is too large to observe the desired changes manually, so it must be solved by computation. As computing power grows, solving becomes faster and higher-dimensional, and simulation becomes increasingly realistic, which is closely tied to the underlying architecture.

Qiji: Of the issues just mentioned, what do you think are the top three challenges the field needs to address?

Nie Kaixuan: There are indeed many problems to solve. The first is computational accuracy. We currently use high-precision simulation.
For example, NVIDIA’s Omniverse and its subsystem Isaac Sim initially used NVIDIA’s own PhysX engine. PhysX is built mainly for games and film animation, pursuing visual plausibility rather than absolute accuracy, so it runs efficiently but its solving accuracy is weak. Because Isaac’s functionality is so comprehensive, however, everyone keeps using it, switching to other solvers only for specific links in the chain.

NVIDIA is also working hard on this problem, for example by integrating MuJoCo so that customers can do everything in one piece of software without exporting to another for simulation. They released the integrated version in March this year, but it has not yet gone live, whereas we completed our integration last August, nearly a year ahead of NVIDIA. That won us many customers, because others could only do rough simulation while we could do precise simulation.

The second is sensors. Sensor accuracy is not high; even cameras from major suppliers have inherent deviations. A camera’s imaging depends on each company’s algorithms, and optimizing corner pixels is its own set of algorithms, so accuracy issues arise there too.

Then there are batteries and motors. A motor’s nominal and actual performance differ. When you buy an electric vehicle, the theoretical range may be 680 kilometers while the actual range is only 450. Motor parameters likewise have deviations. There are many problems to solve, but they can all be addressed.

Qiji (interview observer): What is the maximum scale or complexity that Songying’s simulation can currently achieve? In scaling up, has the team hit any unexpected issues?

Nie Kaixuan: There are indeed many challenges here; we have been stepping into pits and filling them in all along the way. In terms of scale, we can now build a complete digital factory.
In September 2023, we released a factory demo for the first time: about 200 robots collaborating in the same factory, including robotic arms, AGVs, packing vehicles, and transport equipment, over an area of about 20,000 square meters. We have not pushed scale further since then; last year we invested a lot of people and energy in the humanoid-robot track.

We are now receiving more demand from manufacturing, including electrical equipment, components, automotive factories, and nuclear-power equipment factories. The project files are larger than before and the processes more complex, with over 100 processes to simulate. Graphics is a big part of this, because visual perception relies heavily on image realism. In graphics, however, solving triangle meshes places a heavy burden on rendering engines. Although everyone is pushing ray tracing and advanced features like Unreal Engine’s Lumen and Nanite, these are essentially simplified computations: they are designed so that the human eye sees no simplification, while in the underlying computation up to 90% of the triangles are simplified away. For machines, that simplification may not be acceptable; and if the data volume grows too large, it may become unmanageable.

Qiji (interview observer): Is the challenge of expanding simulation scale mainly rendering?

Nie Kaixuan: Yes, rendering is currently the main issue. Physical solving can already run in parallel across multiple GPUs, but multi-GPU parallel rendering is still very immature. I believe this can be a direction; many have tried it before. But including our own distributed-computing efforts, nothing has been commercialized; we have only done demos in the lab.
If this can be broken through, the scale of future simulations will no longer be capped. The current limit is triangle-processing capacity: the larger the model an engine can handle, the larger the simulation that can be run.
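The "invisible" simplification Nie describes, where a mesh looks unchanged to the human eye while most of its triangles are collapsed away, is the classic level-of-detail idea. Below is a minimal, illustrative sketch of one such technique, vertex-clustering decimation; it is not Nanite's or Songying's actual algorithm, and the mesh and `cell` parameter are made up for the example.

```python
def simplify(vertices, triangles, cell):
    """Vertex-clustering decimation: snap each vertex to a coarse grid of
    spacing `cell`, merge vertices that land in the same grid cell, and
    drop triangles that collapse to a line or a point."""
    cluster_of = {}           # grid cell -> index of its merged vertex
    new_verts, remap = [], []
    for (x, y, z) in vertices:
        key = (round(x / cell), round(y / cell), round(z / cell))
        if key not in cluster_of:
            cluster_of[key] = len(new_verts)
            new_verts.append((key[0] * cell, key[1] * cell, key[2] * cell))
        remap.append(cluster_of[key])
    new_tris = []
    for (a, b, c) in triangles:
        a, b, c = remap[a], remap[b], remap[c]
        if a != b and b != c and a != c:   # keep non-degenerate faces only
            new_tris.append((a, b, c))
    return new_verts, new_tris

# Dense test mesh: a unit square split into a 40x40 grid of quads,
# i.e. 3200 triangles -- a stand-in for an over-tessellated CAD part.
n = 40
verts = [(i / n, j / n, 0.0) for i in range(n + 1) for j in range(n + 1)]
tris = []
for i in range(n):
    for j in range(n):
        a = i * (n + 1) + j
        b, c = a + 1, a + (n + 1)
        tris += [(a, b, c + 1), (a, c + 1, c)]

small_verts, small_tris = simplify(verts, tris, cell=0.1)
```

On this flat mesh the triangle count drops by well over 90% while the silhouette is unchanged, which is exactly why a rendered image can fool the eye. A robot comparing measured depth or geometry against the mesh, however, would see the missing detail, which is the mismatch Nie is pointing at.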

Choices in Crises:

Shed Noise and Return to Faith After Traversing the Darkest Moments

Qiji: You mentioned that humanoid robots are a variable. Setting aside external factors like the environment and the current hype, what scenario would you most like to build for?

Nie Kaixuan: I still prefer industrial and manufacturing scenarios, and I would define the form of the robot from the needs of that sector. It does not have to be humanoid, with two legs and two arms; being overly anthropomorphic may not work in industrial settings. Shipbuilding, for example, is a non-standard scenario: unlike an automotive assembly line that can stretch a kilometer with standardized production at each stage, ships are built one at a time, from large modular sections that cannot move down a line, and workers have to crawl inside the hull to work. The home appliance and textile industries likewise differ from automotive.

So I have always believed we should first decide which scenarios and industries to serve, and only then decide what form the robot should take, rather than building humanoid robots first and then looking for something for them to do. Across industries, the forms of machines (or of humans) will certainly vary: some machines may generalize, others may not. Everyone is currently chasing a kind of commercial standardization, but reality is not like that.

My ultimate goal is to make people less tired; generation after generation has been exhausted by competition in school and at work. Robots should produce all the necessities of life so that we can do what we actually want. I am raising my child to play with all kinds of different things, because I believe that by the time he grows up, robots will produce everything, and he won't have to live the way we did, where studying was the only thing that mattered.

Qiji: What has been the most difficult choice you have faced since starting the company?

Nie Kaixuan: The first was the collapse of Silicon Valley Bank in March 2023.
At that time we had been in business for just over a year, and all the funds we had raised were in Silicon Valley Bank. I never imagined the company could die because a bank went bankrupt.

In theory, if you felt entrepreneurship was exhausting or hopeless at that point, you could have shut the company down and no one would have blamed you; it was force majeure. I struggled for a long time and felt helpless, because when I called shareholders I learned they had taken heavy losses themselves and could not help. On such short notice, I had no legitimate way to raise a large sum of money to restart.

The news broke on a Thursday night. I was silent all day Friday and did not dare go into the office, because I was in a bad emotional state and afraid my colleagues would see it. I couldn't sleep over the weekend; the next day two former colleagues took me hiking for the day. They were good friends from my time working overseas and had no idea what had happened. After the outing I still couldn't sleep; I went three days and nights without closing my eyes.

In the end, though, I felt I could not let this cause die unfinished and unexplained; I owed myself and the team an answer.

Qiji: After deciding to persist, what did you do?

Nie Kaixuan: I had to find money. I reached out to two friends in Shenzhen, and fortunately both were willing to lend to me. They run businesses themselves and understand that companies face all kinds of risks; we knew each other well, and they believed I was a good person, so everyone was willing to help.

When I met the first friend, I told him the company was out of money, and he asked how long it would take to ship a product. I said about six months, at a cost of one million per month.
He said, "Okay, I'll cover you for six months." But I worried about staff turnover or other setbacks; if there were delays I would certainly need more money. So I went to a second friend, explained the company's situation, and secured a second sum.

That day was an emotional rollercoaster. When I left home in the morning the sky was overcast and I felt the world was about to collapse; after meeting them and securing the funds, the world turned bright again. It was an important turning point, because going through something that painful made me more certain of my motivation: whether I was pursuing my ideas and mission, or simply trying to make money. Knowing what you want to do, and being willing to invest in it and take real risk, is a very valuable trait our team gained from that experience.

Qiji: They lent you a significant amount of money.

Nie Kaixuan: My feeling at the time was that real human kindness exists, and that the reputation you build over time is crucial; credibility is the most expensive thing there is. How you treat people and how you approach your work is something everyone can feel. It was also a reminder to me and the team that we must be upright in how we operate; at critical moments, it can save your life.

Advice for Young Technology Entrepreneurs

Qiji: What advice do you have for young technology entrepreneurs?

Nie Kaixuan: While you are young, be a bit bolder and pursue what you want to do. Success or failure, it counts as success, because the experience will not be wasted. My one regret is starting my company a bit late; I should have done it earlier, when the cost of trial and error was much lower.

Qiji (interview observer): What is your view on graphics research? As an incoming PhD student, how should one choose a direction in graphics?

Nie Kaixuan: There are two options in graphics. One is to keep pursuing height: making hard breakthroughs on existing techniques to reach a higher dimension. The other is to leverage its existing strengths and find scenarios for application. The biggest applications of graphics used to be games and film animation; but much as NVIDIA did with Omniverse, we have applied graphics to robotics and the industrial track through ORCA.

The first option, pursuing height, may take time. For example, Professor Chen at Peking University, Professor Bao Hujun at Zhejiang University, and Academician Hu Shimin at Tsinghua University are all working on "height-seeking" projects, trying to break through bottlenecks in graphics. In 2022 the industry concluded that the future of imaging might be shared between neural methods, represented by NeRF, and traditional graphics. Yet less than a year later 3D Gaussian Splatting (3DGS) emerged and NeRF lost much of its appeal; the technology changes fast. Even so, graphics has held its distinct advantages for many years, while other fields still seem to be searching for breakthroughs and paths. So you can consider combining these two perspectives.

Participants in this interview and content creation:

Interviewed by: Gao Tianhong (interview observer), Li Xingge, Shen Xiao

Written by: Liao Xinyi, Zhang Luocheng

Edited by: Shen Xiao

Formatted by: Wenwen

(End of the full text)
