Understanding AI Chips: Distinguishing from Traditional Embedded Processors

Yesterday I saw an article titled “IEEE Association held a seminar in Beijing for the first time, Wang Feiyue said there are no AI chips,” and the title immediately caught my eye!

I couldn’t wait to open the article and found the key paragraph, which said:

Finally, when the topic of “AI chips” came up, Professor Wang Feiyue stated bluntly, “I don’t think there is any such thing as an artificial intelligence chip right now.” Professor Andreas Nuernberger of Magdeburg University in Germany added, “I agree that there is no such thing as an artificial intelligence chip. The current development of chips has accelerated the process of deep learning. Deep learning was already being applied to images before; now you call these chips artificial intelligence chips, but they are products built for specific purposes. I think the Internet of Things can make the process more reliable, ensuring that these networks and hardware respond faster, more reliably, and more dynamically. I think these are the realities of smart hardware, but they come at a high cost, because you need more complex infrastructure and more technology than before.”

Interestingly, at our editorial meeting just a few days ago we had discussed “AI chips,” and Editor-in-Chief He Limin argued that they are well worth paying attention to! Professor He has his own views on the question:

“AI chips” do exist; the concept is not only already in wide use, it cannot be replaced by existing concepts such as MCU or MPU.

Wang Feiyue’s claim that AI chips do not exist may stem from the view that they are merely functional extensions of existing embedded processors. What this view misses is that, as the functions of embedded processors have kept extending, quantitative change has turned into qualitative change. Since artificial intelligence entered the deep learning era, the hardware acceleration of traditional MCUs can no longer meet the demand for high-speed, massive numerical computation or for cloud-based interaction with big data. This has given rise to two fundamentally different groups of chips in the embedded field: MCUs and AI chips. The former meet the intelligent control needs of tools (control-oriented); the latter meet the deep learning computation needs of intelligent machines (computation-oriented). In the future, the field of artificial intelligence will gradually split into two major areas: intelligent tools and intelligent machines. The field of intelligent tools has already matured, while the field of intelligent machines is gradually advancing toward strong artificial intelligence with the support of AI chips, neural networks, deep learning, and cloud interaction. AI chips currently come in many forms and are still at an early stage of development. Although some of the concepts remain debatable, the established concepts of the various embedded processors can no longer cover them, and “AI chip” may well become the conventional term.

Taking facial recognition as an example, real-time facial recognition used for access control may be achieved with an MCU + graphics accelerator solution. However, to identify a specific face in a crowd in real-time, deep learning must be introduced to continuously improve recognition capabilities; to compare with many faces, it must also interact with cloud-based big data. No matter how advanced an MCU is, it cannot bear such a heavy responsibility. Perhaps “deep learning” and “cloud interaction” are the two significant features of AI chips.

Competition in the AI chip field is currently fierce, and a unified architectural system is unlikely to emerge in the short term. Gradually, however, the AI chip will establish itself as another newcomer in the embedded field, complementing the MCU in artificial intelligence, with each playing its own role. They cannot replace each other, their technologies are developing in different directions, and MCUs and AI processors serve different fields; both have enormous development potential.

Whenever a new concept emerges, it inevitably passes through a phase of skepticism; that is an objective law. Faced with these two completely different viewpoints, how do you, our embedded friends, see the issue? Feel free to leave a message~

If you haven’t seen the article titled “IEEE Association held a seminar in Beijing for the first time, Wang Feiyue said there are no AI chips,” just click the title to read!

Also attached is a report from the First Financial Daily in March: How much do people know about AI chips?

AI chips are currently a hot topic of interest in the technology industry and society, and they are also a crucial link in the development of AI technology. Regardless of how good the AI algorithms are, they must ultimately be implemented through chips.

“AI chips face two real problems: first, we do not have an architecture that covers all algorithms, which requires building a deep learning engine into the chip that can adapt as new algorithms are introduced; second, the architecture must be reconfigurable, with efficient capability for architectural transformation. Current approaches such as CPU plus software and CPU plus FPGA require us to explore architectural innovations,” Wei Shaojun, head of the Department of Micro-Nano Electronics at Tsinghua University, said in a public speech at the GTIC 2018 Global AI Chip Innovation Summit organized by Zhixiaoshuo.

“Diverse” AI chips

AI is a relatively broad concept. Although many consumer electronics manufacturers print AI-related terms on their promotional pages, some of them also realize that the development of AI products must go through several different stages and are therefore quite cautious.

The 352 air purifier adds a laser detection module to measure the ambient PM2.5 level and uses a self-developed intelligent control algorithm so that the purifier runs automatically according to that level. Yet Zhang Yi, a partner at 352 Environmental Technology, describes this only as “enhancing the intelligent experience,” without particularly emphasizing AI.

“The true form of intelligence is certainly not just the Internet of Things and remote control and voice input; these are still just some means and scattered manifestations. I believe the ultimate goal of intelligence is to reduce user intervention, gain insight into user psychology, make internal adjustments at any time, and enhance the product’s learning ability to allow the product to think and improve, gradually rising to emphasize human emotional needs, ultimately allowing people to meet their highest emotional needs through self-service of the product,” Zhang Yi told First Financial reporters.

To talk about AI chips, we must first define AI.

In the view of Chen Yingren, senior business development manager for Lattice Semiconductor Asia Pacific, an “AI neural network” is not simply a type of product but a new design methodology. “Traditional algorithms are based on rules and logic, while a neural network is the result of training on data.”

It is a bit like traveling to a designated place. With traditional “rule-based” design, you set the rules (the logic) in advance, such as choosing a mode of transportation and planning the transfer points, and you eventually arrive by following them. With a neural network, you start from known inputs and outputs, such as the starting point and the destination, and once there are enough samples (training data) the system can yield a new algorithm; this is where AI chips are needed.
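
As a loose illustration of the distinction Chen draws (a hypothetical sketch, not code from the article), the snippet below contrasts a hand-written rule with a mapping fitted purely from input/output samples:

```python
import numpy as np

# Rule-based design: the engineer writes the logic explicitly.
def travel_time_rule(distance_km: float) -> float:
    """Hand-coded rule: walk short trips, otherwise take the metro."""
    if distance_km < 2.0:
        return distance_km / 5.0 * 60        # walking at 5 km/h, in minutes
    return 10.0 + distance_km / 35.0 * 60    # 10 min overhead + metro at 35 km/h

# Data-driven design: only known inputs and outputs are given,
# and the mapping is "trained" from those samples.
distances = np.array([1.0, 3.0, 8.0, 15.0, 30.0])       # known inputs
times = np.array([12.0, 15.0, 24.0, 36.0, 61.0])        # known outputs
slope, intercept = np.polyfit(distances, times, deg=1)  # fit the parameters

def travel_time_learned(distance_km: float) -> float:
    """Learned mapping: no explicit rules, only fitted parameters."""
    return slope * distance_km + intercept

print(travel_time_rule(10.0), travel_time_learned(10.0))
```

The point is not the toy problem but the methodology: in the second case the “algorithm” is whatever the data yields, and that training-and-inference workload is what AI chips are built to accelerate.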

In the chip development process, there are both traditional old manufacturers and new technology companies. Will there be a general-purpose AI processor that exists independently like a general-purpose CPU?

In fact, the various technical routes are each exploring the balance between generality and optimization along their own paths.

Bitmain, for example, which got its start in cryptocurrency mining chips, aims to iterate its AI chips every nine months, a race against Moore’s Law (an upgrade every 18 to 24 months) and also, using ASIC technology, a challenge to dark silicon.

“Dark silicon” refers to the fact that, because of power constraints, only a small portion of the gates in a processor can be active at any given moment; the rest must sit idle, contributing nothing to the computation at that instant.

ASICs are integrated circuits designed for specific applications, which have advantages in power consumption, reliability, and size, especially in high-performance, low-power mobile devices. Bitmain is not alone on this path; there are also Google’s TPU leading the way and a number of startups following in vertical fields like machine vision.

“Compared to the iteration speed of traditional chips, AI algorithm iterations are faster. We quickly implement the latest algorithm requirements and the common foundation of neural network algorithms onto the chip,” said Tang Weiwei, product strategy director at Bitmain.

Bitmain launched its first AI chip in November 2017; it is now in full mass production and handles both training and inference, though it focuses primarily on inference. Tang believes training and inference should be two different platforms, and Bitmain will continue to focus on inference. “High-performance computing involves many fields, so we decided to enter the AI field, especially deep learning, at the end of 2015. We built on our existing high-performance computing chips, hardware, and software algorithms, and we also recruited many professionals in the AI field,” Tang Weiwei said.

However, Bitmain has not considered making chips for end devices; the chips it supplies will be used in servers.

Currently, the most widely used AI chips in the industry are GPUs, because they suit single-instruction, multiple-data processing and can be used for both training and inference in deep learning. Simon See, CTO of the NVIDIA AI Technology Center Asia Pacific, told First Financial in an interview that NVIDIA aims to make general-purpose chips. “Generality is our advantage; ASICs target a single field, while GPUs can be applied not only to AI training but also to image rendering and more.”

He stated that new algorithms are constantly emerging, and to adapt to the new algorithms, new chips need to be developed. NVIDIA will collect customer feedback and improve but will not adjust based on the so-called shift in trends, such as making chips specifically for mining. “Making chips is very risky; having so many companies involved is a good thing; perhaps new companies will produce excellent products. Our chip performance is reflected not only in the chip’s inherent performance (raw performance) but also in software performance,” Simon said.

Additionally, although it receives less public attention, the FPGA is also expected to seize the AI chip opportunity. FPGAs suit multiple-instruction, single-data style processing and are often used in the inference phase; because they dispense with the storing and fetching of instructions, they hold certain advantages in efficiency and power consumption, but their drawback is limited raw compute.

“AI is a very good entry point for FPGAs, and also a chance for a reshuffle. Parallel algorithms and designs for FPGAs are not easy to write, because human logic tends to run in a single direction; FPGA design requires thinking from multiple angles at once, which is not so easy and usually calls for special design methods,” Chen Yingren told First Financial.

In summary, finished chips can be divided by whether they are programmable. CPUs, GPUs, and FPGAs are programmable: different instructions can be issued for different computations. ASICs, by contrast, are non-programmable, customized chips. The difference is roughly that between buying off-the-rack clothing and bespoke tailoring: off-the-rack serves a relatively broad customer base, while a bespoke garment that cannot be altered does not easily become a standard product.

Programmability means generality, while customization means optimizing some aspects at the expense of others. Generality and optimization pull in opposite directions, and chip makers are all searching for the best balance point.

Real-world deployment is the ultimate challenge

Since there is still no AI chip suited to all general algorithms, settling on an application field has become an important premise for development. Unfortunately, the killer application for AI has yet to appear, and existing applications have not yet created hard demand. Even so, AI chips are springing up in great variety.

The field of machine vision has become a “battleground” for AI chips, with many startups like SenseTime, Megvii, and Horizon entering the fray. Yanqing Technology’s founder and CEO Zhu Jizhi is also one of them.

In solving practical problems, chips are not Yanqing Technology’s only answer; the company offers a range of solutions, from IP licensing, modules, and chips through to industry-specific customization, depending on the circumstances of each industry. Yanqing Technology believes the era in which general-purpose chips represented by the CPU monopolized the market has passed and that the AI industry has produced new demands; it focuses on the image-capture stage, addressing the low image quality and poor algorithm recognition rates caused by weak light, reflections, and backlighting.

Zhu Jizhi’s AI chip journey begins upstream, with an innovative imaging architecture designed to meet the new needs of the AI market. “There are two routes in video imaging technology. One provides images for back-end analysis, as SenseTime does. How the images are produced in the first place is our business: they must be processed in real time at the front end, without delay or error, like an assembly line. The two technical routes are different,” Zhu Jizhi said.

Similarly, Fengshi Interactive, which outputs business application solutions based on artificial intelligence, focuses on the human-machine interaction field, providing gesture recognition, facial recognition, posture recognition, and various AI-based solutions. Fengshi Interactive CEO Liu Zhe told reporters, “Artificial intelligence will inevitably be segmented into industries, presenting a trend of diversified development. As technology matures, it will inevitably launch dedicated chips for specific human-machine interaction scenarios to reduce costs and power consumption while significantly improving performance.”

Building a more efficient and imaginative general AI ecosystem that allows humans and machines to communicate naturally in various environments is also attracting investors’ attention.

“Currently, there are two groups working on AI chips: one represented by Cambrian, which originally made chips and has experience in computer architecture and chip design, and the other represented by Horizon, which previously focused on software algorithms and is now making chips. The former is more likely to produce a usable and reliable product, while the latter tends to provide overall solutions, using software to compensate for hardware deficiencies,” said Chen Yu, managing director of Yunqi Capital, predicting that the two will have differentiated paths.

The high cost of a chip lies in the design and development stage. Once designed, it must go through expensive tape-out verification before mass production, and without large customers the upfront costs cannot be amortized. Even if development succeeds, mass production still faces upstream capacity constraints.

“Bitmain has rich experience in chip design; their mining chips have strong demand due to the explosion of the cryptocurrency market, but their capacity is still constrained by upstream chip foundries,” Chen Yu said. According to Tang Weiwei, Bitmain is expected to become TSMC’s fifth-largest customer globally this year.

Because of the long development cycles and high costs of chips, Wang Gang, public relations director at Hard Egg, told First Financial that the company will consider providing general-purpose AI modules in the future. “This year we saw the opportunity in AIoT, the combination of artificial intelligence and the Internet of Things. Hard Egg will connect upstream AI partners, such as Baidu and Yunzhisheng, with IoT projects on the Hard Egg platform to launch universal AI modules.”

Undoubtedly, the domestic semiconductor industry is thriving. There are reports that the national integrated circuit industry investment fund (referred to as the “big fund”) is currently raising funds for its second phase, with a fundraising scale exceeding the first phase, around 150 billion to 200 billion yuan. According to a leverage ratio of 1:3, the scale of social funds leveraged will be around 450 billion to 600 billion yuan.

President Ding Wenwu of the National Integrated Circuit Industry Investment Fund Co., Ltd. stated in an interview with China Electronics News last October that the original plan for the first phase was to raise 120 billion yuan, but through the efforts of all parties, the actual fundraising reached 138.72 billion yuan. After three years of operation, as of September 20, 2017, the big fund has cumulatively decided to invest in 55 projects involving 40 integrated circuit companies, committing a total of 100.3 billion yuan, accounting for 72% of the first phase’s raised funds, with actual contributions reaching 65.3 billion yuan, nearly half of the first phase’s raised funds.

“To make AI chips truly competitive, there must be a moat, and that goes far beyond the chip itself. Just as Alibaba and Tencent compete for entry traffic, chip makers are moving toward the application layer to better understand the real needs of end users and better define their chips, which must offer strong energy efficiency and a suitable AI processing architecture. Without such an architecture, everything is empty talk,” said Yao Song, co-founder and CEO of Deep Insight Technology.

Meanwhile, Wei Shaojun does not shy away from saying, “The current development is too heated, and even the media has played a role in fanning the flames.” He mentioned that the development of AI chips is likely to encounter a setback period in the next two to three years. Today’s AI chips, which mainly meet specific applications, need to consider their future direction, and many of today’s entrepreneurs will become martyrs in this technological change.

If so, “without a doubt, this will be the most admirable and touching great practice in the development of AI,” Wei Shaojun said.

The editor also came across another article, pasted here for reference, in which a TensorFlow team member argues that the future of deep learning lies in microcontrollers.

Indeed, Pete Warden, the head of TensorFlow Mobile, is still focused on portable devices.

Pete Warden is a member of Google’s TensorFlow team and the head of TensorFlow Mobile; he has been immersed in deep learning for years.

He has also written several well-known books on the subject. Now Pete has a new idea to share: he firmly believes that, in the future, deep learning will run freely on tiny, low-power chips. In other words, microcontrollers (MCUs) will one day become the most fertile ground for deep learning. The reasoning takes a few turns, but it seems to make sense.

Why microcontrollers?

Microcontrollers are everywhere

According to Pete’s estimation, about 40 billion microcontrollers (MCUs) will be sold worldwide this year.

An MCU contains a small CPU with only a few kilobytes of RAM, yet MCUs are built into medical devices, automotive equipment, industrial machines, and consumer electronics. They require very little power and are cheap, costing less than 50 cents apiece. The reason they attract so little attention is that, in general, MCUs are used to replace old electromechanical systems (as in washing machines and remote controls); the logic used to control the machine has not changed much.

Energy consumption is the limiting factor

Any device that requires mains electricity has significant limitations. After all, no matter where you go, you need to find a place to plug in, and even mobile phones and PCs need to be charged regularly.

However, for smart products, being usable anywhere without frequent maintenance is the key. So, let’s take a look at how quickly various components of smartphones consume power —

· Display 400 mW

· Radio 800 mW

· Bluetooth 100 mW

· Accelerometer 21 mW

· Gyroscope 130 mW

· GPS 176 mW

In contrast, an MCU needs only 1 mW or even less. A button cell, however, holds about 2,000 joules of energy, so even a 1 mW device can only last about a month. Of course, most devices now use duty cycling to avoid keeping every component active all the time, but even so, the power budget remains very tight.
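
To make that arithmetic concrete, here is a back-of-the-envelope sketch (the 2,000 J button cell and 1 mW draw are the figures quoted above; the 10% duty cycle is an illustrative assumption, not a value from the article):

```python
COIN_CELL_ENERGY_J = 2000.0   # energy in a typical button cell, as quoted above
ALWAYS_ON_POWER_W = 1e-3      # a 1 mW MCU that never sleeps

def lifetime_days(energy_j: float, avg_power_w: float) -> float:
    """Battery lifetime in days at a constant average power draw."""
    return energy_j / avg_power_w / 86400.0

print(f"Always-on 1 mW: {lifetime_days(COIN_CELL_ENERGY_J, ALWAYS_ON_POWER_W):.0f} days")

# Duty cycling reduces the average power by the fraction of time spent active.
duty_cycle = 0.10   # hypothetical 10% active time
avg_power_w = ALWAYS_ON_POWER_W * duty_cycle
print(f"10% duty cycle: {lifetime_days(COIN_CELL_ENERGY_J, avg_power_w):.0f} days")
```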

CPUs and sensors are not very power-hungry

CPU and sensor power consumption can be reduced to the microwatt level, like Qualcomm’s Glance vision chip.

In contrast, displays and radios consume significantly more power. Even Wi-Fi and Bluetooth require at least several tens of milliwatts.

This is because the energy required for data transmission appears to be proportional to the transmission distance: CPUs and sensors only move data a few millimeters, while radios transmit over distances measured in meters, which costs far more.

Where does the sensor data go?

Sensors can acquire much more data than people can use. Pete once talked to a developer of micro-satellite imaging.

They basically use smartphone cameras to shoot high-definition video. The problem is that the satellites have limited data storage and very limited transmission bandwidth, so they can only send a tiny fraction of it down to Earth every hour.

Even leaving space aside, many sensors down on Earth face a similar predicament.

A very interesting example comes from a good friend of Pete’s, who noticed that every December his home internet traffic exploded. He later discovered that the colorful Christmas lights were hurting his video compression ratio, so many more frames had to be transmitted.

What does this have to do with deep learning?

If the above sounds reasonable, then there is a vast market waiting for technology to tap into. What we need is a solution that runs on microcontrollers, consumes very little power, relies on computation rather than on the radio, and puts to use the sensor data that would otherwise go to waste. This is the gap that machine learning, and deep learning in particular, needs to cross.

A perfect match

Deep learning is compute-bound, so it can run comfortably on existing MCUs. This is important because many other applications are limited instead by how quickly they can access large amounts of memory, which small chips cannot provide.

Most of the time, by contrast, neural networks are multiplying large matrices together, reusing the same numbers over and over in different combinations. That kind of computation consumes far less energy than reading large volumes of values out of DRAM, and when the amount of data needed is modest, it can be kept in low-power SRAM. In this sense deep learning is a natural fit for MCUs, especially once 8-bit calculations can replace floating-point ones.
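
As a rough illustration of that last point (a generic quantization sketch, not code from TensorFlow or from the article), weights and activations can be stored as 8-bit integers, multiplied with integer arithmetic, and rescaled to floating point only once at the end:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Map a float array to int8 with a single per-tensor scale."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)   # toy weight matrix
a = rng.standard_normal(8).astype(np.float32)        # toy activation vector

wq, w_scale = quantize_int8(w)
aq, a_scale = quantize_int8(a)

# Integer multiply-accumulate (what a small MCU does cheaply), then one rescale.
y_int = wq.astype(np.int32) @ aq.astype(np.int32)
y_approx = y_int * (w_scale * a_scale)

print(np.max(np.abs(y_approx - w @ a)))   # small quantization error
```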

Deep learning is very low-carbon

Pete has spent a lot of time thinking about how many picojoules each operation costs. For example, the smallest configuration of the MobileNetV2 image classification network takes about 22 million operations per frame. At 5 picojoules per operation and one frame per second, the network would draw about 110 microwatts, letting it run for nearly a year on a button cell.
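
The arithmetic behind that estimate, as a quick sketch (the operation count and energy per operation are the figures quoted above; the battery capacities are illustrative assumptions, and the resulting lifetime depends heavily on which cell you pick):

```python
OPS_PER_FRAME = 22e6        # MobileNetV2, smallest configuration, as quoted above
ENERGY_PER_OP_J = 5e-12     # 5 picojoules per operation
FRAMES_PER_SECOND = 1.0

# Average power = ops per frame * energy per op * frames per second.
power_w = OPS_PER_FRAME * ENERGY_PER_OP_J * FRAMES_PER_SECOND
print(f"Average power: {power_w * 1e6:.0f} microwatts")   # about 110 µW

def lifetime_days(battery_energy_j: float) -> float:
    """How long a cell of the given capacity sustains this average draw."""
    return battery_energy_j / power_w / 86400.0

print(f"{lifetime_days(2000.0):.0f} days on a 2,000 J cell")
print(f"{lifetime_days(10000.0):.0f} days on a larger 10,000 J cell")
```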

Friendly to sensors as well

In recent years, people have used neural networks to process noisy signals such as images, audio, and accelerometer data.

If neural networks can run on MCUs, far more sensor data can be processed rather than wasted. By then, functions such as voice interaction and image recognition will become much more lightweight.

For now, though, this remains just a vision.

Original article link: https://petewarden.com/2018/06/11/why-the-future-of-machine-learning-is-tiny/
