Empowering Smart Logistics with Open Source IoT Big Data Platform

In the era of smart logistics, data plays a crucial role in both logistics equipment and logistics systems. Database software is the foundation of software systems such as WMS and WCS. The logistics equipment industry urgently needs database software that reads faster, computes more efficiently, and is open source and modifiable. For universities, open source code will certainly play a significant role in cultivating algorithm talent. In this regard, TaoSi Data offers a valuable industry reference and a foundation for cooperation among industry, academia, and research.

Recently, the Ministry of Industry and Information Technology and eight other departments jointly issued the “Three-Year Action Plan for the Construction of New IoT Infrastructure (2021-2023)”, which states that by the end of 2023 new IoT infrastructure will be preliminarily in place in major cities across the country, with the number of IoT connections exceeding 2 billion. That figure implies a far larger scale of data. Similarly, as the logistics industry develops, both the scale of the Internet of Vehicles and the number of smart devices connected within logistics centers are increasing significantly. With data this massive, time-sensitive, and generated in real time, efficient storage and processing are crucial.

In response, IoT data platforms covering data collection, storage, querying, analysis, and computation have continued to emerge. Beijing TaoSi Data Technology Co., Ltd. (hereinafter “TaoSi Data”) is one of them. TaoSi Data has not followed the conventional path, however. It abandoned the traditional construction model based on the Hadoop ecosystem and launched a time-series database platform designed specifically for time-series spatial big data. By taking an open-source approach, it broke with traditional industry thinking and has become a benchmark enterprise in the field of time-series data.

The vibrant TaoSi Data team

Strategizing to Become a “Small Brick” in the IoT World

The first meeting with TaoSi Data’s founder, Tao Jianhui, took place right after a heavy snowfall. The sudden snowstorm wiped away Beijing’s colorful autumn overnight, but the clear blue sky and bare branches after the snow, set against the distinctive buildings of the Wangjing CBD where TaoSi Data is located, had an unusual beauty. Nature is always full of mysterious changes, and for Tao Jianhui, who comes from the software industry, change is what he knows best.

One of the fastest-changing industries is software, where the average lifespan of an app is only about ten months. The idea of creating a “long-lived” foundational software product has been growing in Tao Jianhui’s mind for a long time. Ultimately, two factors solidified his belief.

The first is the booming market demand. With the development of the internet, and especially the sharp decline in communication costs, data of all kinds is collected and sent to the cloud, and data volumes are exploding. “Ten years ago, it was hard to imagine that almost every vehicle and device would continuously generate data. Now it is happening, and in the future this trend will accelerate. In the industrial sector, big data analysis technologies, especially artificial intelligence, have created enormous commercial value from the collected data, giving rise to an unprecedented market,” he stated.

The second is the relative technological lag. He further shared, “Compared to the rapid growth of data volume, the technology for data processing lags behind. Although there are already relatively complete big data processing frameworks in the market, including various free and open-source systems, they require a vast amount of storage space and computing resources. An operator alone needs thousands of servers just to store six months of browsing records, and continual expansion is necessary. Therefore, the growth of massive data poses greater challenges to technology and provides us tech geeks with a tremendous opportunity.”

How to close these technological gaps and meet the enormous market demand became Tao Jianhui’s next focus. After researching the entire IoT and big data ecosystem, he found that general-purpose big data solutions typically assemble open-source software such as Kafka, Redis, HBase, MongoDB, and Cassandra and use clusters to process massive data. Because multiple systems are involved, problems such as low development efficiency, poor operational efficiency, complex maintenance, and slow rollout of applications occur frequently. For industries such as IoT and the industrial internet, which collect data on a massive scale, these general-purpose solutions are increasingly unsustainable, whereas a storage structure optimized for such data can improve performance significantly. The processing of massive time-series spatial data, from collection and storage to querying, computation, and analysis, is therefore a huge opportunity. Against this backdrop, TaoSi Data was officially established in June 2017, and subsequently the IoT big data platform TDengine was born.

“As a foundational software, TDengine is like a small brick; no matter how the entire software world changes and iterates, it can still provide value years later, and I am very satisfied with that,” Tao Jianhui added with a smile. Although TaoSi Data was founded over four years ago, when recalling the company’s founding process, his face still carries the passion and vigor that seems characteristic of early-stage entrepreneurs.

TaoSi Data has won numerous awards since its establishment in 2017

Leading the Way in Open Source Software in the Industry

TDengine is positioned as an IoT big data platform and a time-series data processing platform. Its core is to transparently combine real-time data and historical data operations, while also featuring caching, data subscription, stream computing, and message queue capabilities, providing a full-stack solution for IoT data processing.

TDengine Time-Series Data Processing Platform

Time-series data is data that carries timestamps, that is, data recorded in chronological order. Through research on IoT and industrial internet data, Tao Jianhui summarized ten characteristics of this type of data:

(1) All collected data is time-series;

(2) The data is structured;

(3) The data source of a collection point is unique;

(4) There are rarely updates or deletions of data;

(5) Data is generally deleted based on expiration dates;

(6) Data is primarily written, with reading as a secondary operation;

(7) Data flow is stable and can be calculated relatively accurately;

(8) Data undergoes real-time calculations such as statistics and aggregation;

(9) Data is always searched within a specified time period and region;

(10) The data volume is enormous, with daily data exceeding 10 billion entries.
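
Characteristics (7) and (10) lend themselves to a quick back-of-envelope calculation: if the data flow is stable, a daily row count translates directly into an average write rate. The Python sketch below works this out for the 10 billion entries per day mentioned above. The function names and the sampling intervals (one reading every 1 or 10 seconds per collection point) are assumptions chosen for illustration and do not come from TaoSi Data.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rows_per_second(rows_per_day: int) -> float:
    """Average write rate implied by a steady daily row count."""
    return rows_per_day / SECONDS_PER_DAY


def collection_points_needed(rows_per_day: int, seconds_between_samples: int) -> int:
    """Number of collection points that would produce at least rows_per_day
    if each one reports once every seconds_between_samples (hypothetical)."""
    rows_per_point_per_day = SECONDS_PER_DAY // seconds_between_samples
    return -(-rows_per_day // rows_per_point_per_day)  # ceiling division


DAILY_ROWS = 10_000_000_000  # "daily data exceeding 10 billion entries"

print(round(average_rows_per_second(DAILY_ROWS)))  # about 115,741 rows per second
print(collection_points_needed(DAILY_ROWS, 1))     # one reading per second
print(collection_points_needed(DAILY_ROWS, 10))    # one reading every 10 seconds
```

Under these assumptions, 10 billion rows per day corresponds to a sustained average of roughly 116,000 writes per second, which is why write throughput dominates the design of a platform for this kind of data.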

In response to these characteristics, TDengine defined an innovative time-series data storage structure. Through lock-free design and multi-core technology, it makes data insertion and reading more than ten times faster than existing general-purpose databases. TaoSi Data has also given TDengine two core technological innovations, “one data collection point, one table” and the “super table,” which keep insertion and query efficiency high while facilitating aggregate queries and multi-dimensional analysis.
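
The two ideas named above, one table per data collection point and a super table that groups such tables, can be illustrated with a small conceptual model. The Python sketch below is an in-memory toy, not TDengine’s implementation or API: the class names `DeviceTable` and `SuperTable`, the `site` tag, and the battery readings are all hypothetical. It assumes that each collection point writes only to its own table in time order, so an insert is a plain append and a time-range query is a binary search, and that the super table aggregates across tables selected by their tags.

```python
from bisect import bisect_left, bisect_right
from typing import Optional


class DeviceTable:
    """One table per data collection point; rows arrive in time order."""

    def __init__(self, tags: dict) -> None:
        self.tags = tags        # static labels, e.g. {"site": "hub_a"}
        self.timestamps = []    # sorted, because each source writes in order
        self.values = []

    def insert(self, ts: int, value: float) -> None:
        # A single writer per table means an append is enough.
        if self.timestamps and ts <= self.timestamps[-1]:
            raise ValueError("rows must arrive in increasing time order")
        self.timestamps.append(ts)
        self.values.append(value)

    def time_range(self, start: int, end: int) -> list:
        # Binary search is valid because timestamps are sorted.
        lo = bisect_left(self.timestamps, start)
        hi = bisect_right(self.timestamps, end)
        return self.values[lo:hi]


class SuperTable:
    """Groups device tables that share one schema and one set of tag names."""

    def __init__(self) -> None:
        self.tables = {}

    def create_table(self, name: str, **tags) -> DeviceTable:
        table = DeviceTable(tags)
        self.tables[name] = table
        return table

    def average(self, start: int, end: int, **tag_filter) -> Optional[float]:
        # Aggregate over every device table whose tags match the filter.
        selected = []
        for table in self.tables.values():
            if all(table.tags.get(k) == v for k, v in tag_filter.items()):
                selected.extend(table.time_range(start, end))
        return sum(selected) / len(selected) if selected else None


# Hypothetical battery levels reported by three mobile robots.
battery = SuperTable()
agv1 = battery.create_table("agv_001", site="hub_a")
agv2 = battery.create_table("agv_002", site="hub_a")
agv3 = battery.create_table("agv_003", site="hub_b")
for ts, level in [(1, 90.0), (2, 88.0), (3, 86.0)]:
    agv1.insert(ts, level)
agv2.insert(2, 72.0)
agv3.insert(2, 50.0)

print(battery.average(2, 3, site="hub_a"))  # 82.0
```

In this model the tag filter narrows the query to the relevant tables before any rows are read, which is one way to see how aggregate queries across many collection points could stay cheap.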

Beyond continuous improvements in product performance and cost-effectiveness, TDengine’s broader impact also stems from a bold decision by Tao Jianhui: going open source. Such a move is not surprising in foreign software markets, but it made a strong impression in the relatively conservative domestic market. TDengine was officially open-sourced in July 2019, and its cluster version followed in August 2020. Since then its results have been impressive: it has earned 17k stars on GitHub, the world’s largest code hosting platform. TaoSi Data has also attracted significant attention in the capital market, securing nearly $10 million in Pre-A financing, over $10 million in Series A financing, and $47 million in Series B financing over the past two years.

In Tao Jianhui’s view, open source is the best shortcut for Chinese software to go global. Beyond the current achievements, he holds a firm goal in his heart—becoming the global leader in the field of time-series data. “In the foundational software field, whether it’s operating systems, databases, software development tools, or now big data processing platforms, it is almost entirely dominated by American companies. Having a seat at the table in foundational software is a dream for all IT professionals and a symbol of a country’s technological strength.” He expressed some regret about the current state of Chinese software but quickly regained confidence about future developments, stating, “China has the world’s largest data market, and the amount and variety of data collected have already surpassed that of the United States. Successful big data products in the Chinese market will certainly be accepted by the global market.”

Open source has not only brought TaoSi Data immense success and set a strong example for the industry; it also has very positive implications for the cultivation of software talent in China.

Tao Jianhui noted that university students often encounter source code only when working on projects with their mentors, and that the open-source software available to them on open platforms is very limited. With TDengine open-sourced, students can freely access a large body of source code, which offers excellent learning opportunities and resources and contributes to raising the overall standard of software in China. To promote university students’ understanding and recognition of open source, Tao Jianhui has shared TDengine’s core technologies and the thinking behind its open-sourcing at over 20 universities, including Tsinghua University, Fudan University, Chongqing University, Southwest University, Peking University, Renmin University of China, Beijing University of Posts and Telecommunications, and the University of Science and Technology of China, encouraging students to participate actively in open-source initiatives.

It is worth mentioning that programmers are often the silent changers of people’s lives. With the open-sourcing of software, they have stepped out from behind the products and begun to interact and communicate with a broader audience. This mutual promotion not only further enhances their professional abilities but also allows them to create their own “business cards” through code, which is also beneficial for the overall improvement of the industry level. In Tao Jianhui’s eyes, these are more meaningful and valuable than corporate success.

Empowering Logistics and Accelerating Industry Digital Transformation

The processing of time-series data is the foundation of all digitalization and intelligence. Among the extensive service areas of TDengine, logistics is a very important part. Furthermore, with the rapid development of the logistics industry, upgrades in technologies such as autonomous driving, and increasingly widespread applications, the demand for time-series data processing is continuously expanding.

“Typical applications in the logistics field include real-time location and operating trajectory data of logistics vehicles, as well as data on logistics robots, shuttles, and other mobile equipment, including their locations, battery statuses, path planning, environmental monitoring, and trajectory tracking. Through TDengine, we can store massive amounts of data in less space while retrieving and analyzing the data users need in the shortest time, providing support for many other functions,” he explained regarding TDengine’s application in the logistics field.

As an example, he cited the internet technology company of a major domestic express delivery giant. It turned to TDengine because its previous time-series database, OpenTSDB, performed poorly, occupied too much storage space, and did not adequately support frequent queries over long time spans. After the company migrated its big data monitoring platform to TDengine, the number of servers required dropped from 21 to 3. TDengine has also shown significant advantages in deployment, writing speed, query speed, storage efficiency, caching, and stream computing. In industries such as tobacco, the number of metrics that must be monitored during production has grown with the business from tens of thousands to hundreds of thousands or even millions. There, TDengine helps enterprises improve data access efficiency, break down traditional data silos, and raise data utilization.

“Technology is the foundation of TaoSi Data. Our direction is to create value through technological innovation and the meticulous research and development of exceptional products.” Throughout the conversation, his occasional hearty laughter revealed his immense confidence in the future of TaoSi Data, and his passionate demeanor and smiling face reflected his love for his work. “Generally speaking, a programmer’s golden age is between 25 and 35; a programmer like me who is still coding at 50 is a rare breed,” Tao Jianhui joked. “But I will continue to write code for a lifetime.” He stood by the large floor-to-ceiling window, the view outside broad and the sunlight just right.

Tao Jianhui’s Profile

Founder of TaoSi Data.

Recipient of the 2020 China Open Source Outstanding Contributor Award.

Studied in the United States from 1994, and from 1997, worked at Motorola and 3Com in Chicago, engaging in wireless internet R&D. In early 2008, returned to Beijing to start HeXin, which was later acquired by MediaTek. In early 2013, founded Happy Mommy, which was later acquired by Pacific Network.

Founded TaoSi Data in May 2017, focusing on IoT big data processing. After the open-sourcing of the product TDengine, it ranked first on GitHub’s global trend list for several days. TaoSi Data has received nearly $70 million in investments from several institutions including Sequoia, GGV, Matrix Partners, and Ming Shi Capital.
