Introduction: Today we introduce a flavor of machine learning you may not have tried before: TinyML. ML is familiar to all of us, while TinyML is short for Tiny Machine Learning, a machine learning technology aimed at a particular class of embedded products. Which embedded products, exactly?
Let’s start with the term TinyML itself, which may sound unfamiliar. In China, awareness of TinyML lags somewhat, with only scattered introductions available. In the industry at large, however, TinyML is already an established term with a specific meaning, not a conceptual gimmick coined by one book or one company.
I looked into this area specifically and found that industrial development around TinyML is already vibrant. A dedicated TinyML industry committee has been established, and TinyML summits are held regularly; at this year’s summit, heavyweights such as NVIDIA, ARM, Qualcomm, Google, Microsoft, and Samsung all made appearances, a measure of the field’s scale and influence. On the academic side, a classic book on TinyML has been published, and some well-known universities have begun offering related courses.
However, this raises a question. We all know what machine learning is, but what exactly counts as “Tiny”? And why carve a separate category called TinyML out of an already flourishing field of machine learning? This brings us to the pain points of deep learning.
Although TinyML carries “ML” in its name, in practice it mainly uses DL, that is, deep learning models. We can let that slide for now; after all, deep learning is what’s currently in vogue, though naming the field after DL alone would seem a bit narrow. Who knows, a different ML algorithm might emerge tomorrow and turn the tables?
In short, the current star of TinyML is indeed the deep learning model, but here comes the twist: deep learning models have a significant problem, and TinyML cannot use them directly; they must undergo some processing first.
So, what is the major problem with deep learning?
The problem is that deep learning models are built on neural network structures, which can be very large and keep getting larger. Initially, the large scale of deep learning models wasn’t a problem; in fact, it was an advantage. Recall that after deep learning took off, a common question was: why is deep learning more effective than other machine learning models?
This is a good question. Many beginners tend to assume that machine learning, like smartphones, must improve with each new version, and since deep learning is the latest development in machine learning technology, it should naturally outperform other models.
This contains several misunderstandings. First, deep learning is not exactly “new”. It has certainly risen to prominence in recent years, but it did not appear out of nowhere. Machine learning has several classic families of algorithms, and neural networks are one of them. Saying that deep learning is merely an evolution of neural networks is putting it mildly: deep learning is essentially a deeper neural network behind a new facade. In my view, it’s an old tree blossoming anew.
Which model is better or worse is a much-discussed topic, but the question was answered long ago. There is a famous result in machine learning known as the NFL theorem (No Free Lunch Theorem). Its proof involves some fairly involved mathematics, but put simply, it gives a very clear conclusion: how well a machine learning model performs is determined not by the model itself but by the specific data at hand. In plainer terms, models in themselves are neither good nor bad.
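For reference, one common way the conclusion is written down is the following rough paraphrase of Wolpert’s formulation (the notation here is mine, not the article’s): summed over all possible target functions f, any two learning algorithms A_1 and A_2 have the same expected off-training-set error.

```latex
% Rough paraphrase of the No Free Lunch theorem: summed over all
% possible target functions f, algorithms A_1 and A_2 perform alike.
\[
  \sum_{f} E\bigl[\mathrm{error} \mid f,\, A_1\bigr]
  \;=\;
  \sum_{f} E\bigl[\mathrm{error} \mid f,\, A_2\bigr]
\]
```

No algorithm wins on average over all problems; it can only win on particular distributions of data, which is exactly the article’s point.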
Given this, what is the advantage of deep learning, and why has it surged in recent years?
The most common explanation is that hardware has advanced greatly in recent years and high-performance hardware has become cheap. That answer isn’t wrong, but it overlooks the main character. Cheaper high-performance hardware benefits all algorithms equally, so why did only deep learning rise to prominence? The answer is simple: deep learning models have a characteristic that has now proven very favorable: model scale and performance are positively correlated. As a deep learning model grows in scale, its predictive performance can keep improving in step. Other machine learning models lack this property; they hit performance bottlenecks easily and need considerable effort to find other routes to improvement.
Why is this characteristic favorable only now? Because the larger the model, the greater the computational load, demanding steeply increasing computing resources. In the past, hardware was expensive and compute was scarce; even a cup of autumn milk tea was a purchase to be weighed carefully. Under those conditions, deep learning, or more precisely the neural network models of the day, showed little advantage. But times have changed: hardware is now cheap and compute is practically unlimited, which makes deep learning’s promise of “the bigger the model, the better the results” very appealing, and the aesthetic of “great efforts yield miracles” has become the mainstream research direction in academia.
However, this overwhelming advantage of deep learning becomes a significant disadvantage in the realm of TinyML.
As mentioned earlier, TinyML is aimed at specific embedded products. We generally have an impression of embedded products: they are very sensitive to power consumption. In TinyML, the power consumption requirement becomes even stricter: it must be below 1 milliwatt.
This is a very stringent requirement. The powerful GPUs essential for running deep models draw hundreds of watts, which disqualifies them immediately. Even many embedded devices known for their energy efficiency blow well past the limit. Take the Raspberry Pi, the famous geek’s toy prized for low power consumption, low enough to be negligible on an electricity bill, which is why many people leave one running indefinitely unless it faults or the power goes out. Yet even a Raspberry Pi draws several hundred milliwatts, far above the TinyML standard.
Why does TinyML set such stringent standards? This is closely related to the application scenarios of TinyML, which have two main characteristics: first, there are tasks that require intelligent technology to accomplish, such as image recognition and audio recognition; second, they need to remain in standby for long periods, necessitating very low power consumption. Many researchers believe that these two characteristics will become the basic requirements for various devices in the future, given the current development of smart devices, IoT devices, and embedded devices.
A typical example is the wake-up process of smart devices. There are many smart assistants now, Apple’s Siri being the best known. These assistants stay in sleep mode but activate immediately upon hearing specific wake-up words, such as “Hey Siri”. This wake-up process is a typical TinyML application scenario, and it generally involves three steps: first, running the deep learning network’s audio recognition task; second, continuously feeding the model with ambient audio; and third, activating the smart assistant once the wake-up word is recognized. A rough sketch of this loop follows.
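To make the three steps concrete, here is a minimal, hypothetical sketch of such an always-on loop. Everything in it is assumed for illustration: the helper names (CaptureAudioWindow, RunWakeWordModel, WakeAssistant), the window size, and the threshold are not any real device’s API, and stubs stand in for the microphone and the model.

```cpp
// Hypothetical always-on wake-word loop; the helpers are stubs, not a real API.
#include <cstdint>
#include <cstdio>

constexpr int   kSampleCount   = 16000;  // assume one second of 16 kHz audio
constexpr float kWakeThreshold = 0.8f;   // assumed confidence threshold

// Stub: a real device would fill `samples` from a microphone sensor.
int CaptureAudioWindow(int16_t* samples, int max_count) {
  (void)samples;
  return max_count;
}

// Stub: a real device would run the audio recognition model here and
// return its confidence that the wake word was heard.
float RunWakeWordModel(const int16_t* samples, int count) {
  (void)samples;
  (void)count;
  return 0.0f;
}

void WakeAssistant() { std::puts("wake word detected, waking assistant"); }

int main() {
  static int16_t samples[kSampleCount];
  while (true) {                                              // the device never leaves this loop
    int count = CaptureAudioWindow(samples, kSampleCount);    // step 2: capture ambient audio
    if (RunWakeWordModel(samples, count) > kWakeThreshold) {  // step 1: always-on recognition
      WakeAssistant();                                        // step 3: hand off to the assistant
    }
  }
}
```

The whole point of TinyML is that this loop must run continuously within the sub-milliwatt budget, which is why the model inside RunWakeWordModel has to be tiny.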
So what devices does TinyML actually run on? Hearing “embedded”, you might instinctively think of the Raspberry Pi and assume TinyML means developing machine learning on a Pi, but that’s not the case: the Raspberry Pi exceeds the power budget and cannot be used for TinyML. What can? Generally, ultra-low-power microcontrollers, which is where the “Tiny” comes from. A microcontroller only performs computation; peripheral functions such as audio capture require dedicated sensor components, so microcontrollers are typically paired with development boards, often another popular embedded product: the Arduino.
The most challenging yet fascinating part is here: how can we deploy deep learning models on microcontrollers with less than 1 milliwatt power consumption?
First, the general process for deploying a deep learning model. To use one, the first step is naturally to train it, which breaks down into three major tasks: collecting data, building the model, and training the model on the data. These three tasks are the crucial steps of both machine learning and deep learning, and they take up the majority of the time; each is substantial enough to fill a book and remains a popular research direction.
For TinyML, though, these three tasks, important as they are, count as routine; the real focus is on what comes next. But here lies a problem. Collecting data and building the model are manageable, but training the model on data is heavy labor, especially for deep learning models. We said earlier that deep learning models have a favorable property, scale positively correlated with performance, but it cuts both ways: as the scale grows, so does the computational load. Hardware may be cheap now and compute practically unlimited, but that compute still has to be powered, and this chain of consequences has made deep learning notorious for energy consumption; training GPT-3, by some estimates, cost as much as $13 million. TinyML, meanwhile, is capped below 1 milliwatt, which strictly limits not only energy but also hardware resources such as memory. Even a small deep learning model is massive by TinyML’s standards, so training a model on such low-power devices is out of the question.
So what can be done? The approach taken by TinyML is separation: the heavy lifting of model training is still performed on GPUs, where the 1 milliwatt constraint does not apply, allowing for unrestricted processing; as long as the trained model can be provided at the end, that’s sufficient. Thus, for TinyML, the first step is loading the model.
Loading the model itself is simple, just a few lines of code, but the preparation can be extensive. Our model is trained on a general-purpose platform yet must run on a microcontroller, so the operating environments differ. The best-known tool for embedded deep learning today is Google’s TensorFlow Lite, the embedded version of TensorFlow, one of the two dominant deep learning frameworks, optimized specifically for embedded platforms. Even TensorFlow Lite is hard to support fully on microcontrollers, however, so Google released TensorFlow Lite for Microcontrollers, the microcontroller edition of TensorFlow Lite.
To load a TensorFlow-trained model with TensorFlow Lite for Microcontrollers, one must mind the differences between platforms: running a model requires every operator it uses to be available, yet TensorFlow Lite supports only a subset of TensorFlow’s operators, so additional processing may be needed before the model runs correctly. A sketch of what loading looks like is shown below.
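Here is a minimal sketch of the loading step, following the general shape of the TensorFlow Lite for Microcontrollers API; exact headers and constructor signatures vary across versions, so treat it as an outline rather than copy-paste code. The model array g_model is assumed to be the converted .tflite file dumped to a C array (e.g. with xxd -i), and the three registered operators are illustrative.

```cpp
#include <cstdint>

#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

// Assumed: the trained .tflite model converted to a C array,
// e.g. `xxd -i model.tflite > model_data.cc`.
extern const unsigned char g_model[];

// Working memory for tensors; the right size is model-dependent and is
// usually found by trial and error on the target device.
constexpr int kTensorArenaSize = 10 * 1024;
static uint8_t tensor_arena[kTensorArenaSize];

bool LoadModel() {
  // Map the flatbuffer model; it runs in place, with no copy into RAM.
  const tflite::Model* model = tflite::GetModel(g_model);

  // Register only the operators this model actually uses -- this is where
  // the reduced operator set matters. The ops listed are illustrative.
  static tflite::MicroMutableOpResolver<3> resolver;
  resolver.AddConv2D();
  resolver.AddFullyConnected();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter interpreter(model, resolver, tensor_arena,
                                              kTensorArenaSize);
  // Carve input/output tensors out of the arena; fails if the arena is too small.
  return interpreter.AllocateTensors() == kTfLiteOk;
}
```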
After the model is loaded, the remaining steps follow the process described earlier: connect the microcontroller and sensors, feed external data to the model, and read out the results. Because TinyML devices have very limited hardware resources, memory especially, further adjustments may also be necessary. Continuing the sketch above, a single inference pass looks roughly like the fragment below.
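This fragment continues the hypothetical sketch above: fill the input tensor from whatever sensor feeds the model (ReadSensor is a made-up placeholder), invoke the interpreter, and read the output tensor; an int8-quantized model is assumed.

```cpp
// Continuation of the sketch above; ReadSensor is a hypothetical placeholder
// for the sensor feeding the model, and int8 quantization is assumed.
int8_t ReadSensor(size_t i);  // hypothetical: one quantized sample per call

int8_t RunOnce(tflite::MicroInterpreter& interpreter) {
  // Copy sensor data into the model's input tensor.
  TfLiteTensor* input = interpreter.input(0);
  for (size_t i = 0; i < input->bytes; ++i) {
    input->data.int8[i] = ReadSensor(i);
  }

  // Run the model; on a microcontroller a failure usually just means retry.
  if (interpreter.Invoke() != kTfLiteOk) {
    return -128;  // sentinel for "inference failed"
  }

  // Read back the result, e.g. a quantized wake-word confidence score.
  TfLiteTensor* output = interpreter.output(0);
  return output->data.int8[0];
}
```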
TinyML is likely the next wave of breakthroughs in machine learning, and of course there are many technical details beyond this overview. For a more systematic understanding, I recommend reading “TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers”. It is the only book I have seen devoted specifically to TinyML, and it is a recent release: the English edition was published in 2019 and the Chinese edition followed in 2020, a quick turnaround by the publisher, Jiguang Huazhang, for which I give a thumbs up! It is the best-known and most authoritative book in the field. The lead author, Pete Warden, is a core member of the TensorFlow Lite development team and offers many insightful perspectives on TinyML and on building TinyML products. The book progresses from simple to complex and includes many hands-on cases, making it particularly suitable as an introduction to TinyML and an entry point into the field.
“TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers”
Authors: Pete Warden and Daniel Situnayake
Translators: Wei Lan, Bu Jie, Wang Tiezhen
ISBN: 978-7-111-66422-2
Highlights:
1) Strong content. Written by founding members of the Google TensorFlow team, with translations and reviews by Google engineers;
2) Endorsements from industry leaders. Jointly recommended by the co-founder of Arduino and the vice president of Arm;
3) Innovative subject matter. A milestone work in TinyML, teaching you step by step how to deploy ML on Arduino and microcontrollers;
4) Comprehensive knowledge system. Suitable for students and professionals at all levels; no prior knowledge of machine learning or microcontroller development is required.
Recommendation:
Written by founding members of the Google TensorFlow team, translated and reviewed by Google engineers, and jointly recommended by the co-founder of Arduino and the vice president of Arm! A milestone work in TinyML, teaching you step by step how to deploy ML on Arduino and microcontrollers. This book is a bestseller in the embedded systems category on Amazon.
