Artificial Intelligence Chips: Concepts and Importance


In the foreseeable future, artificial intelligence will play a significant role in national and international security, and the U.S. government is therefore considering how to control the flow of artificial intelligence-related information and technology. Because general-purpose artificial intelligence software, datasets, and algorithms are difficult to control effectively, attention naturally turns to the computer hardware that modern intelligent systems require. Leading-edge, specialized “artificial intelligence chips” are essential for applying artificial intelligence economically and efficiently at scale. Against this backdrop, the Center for Security and Emerging Technology (CSET) at Georgetown University has released a report titled “Artificial Intelligence Chips: Concepts and Importance,” which explains what artificial intelligence chips are, why they are indispensable for developing and deploying artificial intelligence at scale, and what they imply for national competitiveness.


1. Industry Development Favors Artificial Intelligence Chips Over General Chips


(1) The Law of Chip Innovation

All computer chips, including general-purpose central processing units (CPUs) and specialized chips such as artificial intelligence chips, benefit from smaller transistors, which switch faster and consume less power than larger ones. For at least the first decade of the 21st century, transistors shrank quickly enough to deliver large gains in speed and efficiency across the board, so specialized chip designs offered little additional value and general-purpose CPUs dominated the market.

As the technology for shrinking transistors has advanced, transistor density on chips has kept increasing. Moore’s Law, formulated in the 1960s, observed that the number of transistors on a chip doubles approximately every two years, and CPU speeds improved enormously while it held. Greater transistor density improves speed primarily through “frequency scaling”: transistors switch between on (1) and off (0) states more quickly, allowing a given execution unit to perform more calculations per second. In addition, smaller transistors each consume less power, significantly improving chip efficiency.
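As a rough back-of-envelope illustration of that doubling, the growth factor over a span of years can be computed directly (a minimal sketch; the time spans below are illustrative and not figures from the report):

```python
# Back-of-envelope sketch of Moore's Law as stated above: transistor count
# doubling roughly every two years. Time spans are illustrative only.

def transistor_multiplier(years: float, doubling_period_years: float = 2.0) -> float:
    """Factor by which transistor count grows over `years`."""
    return 2 ** (years / doubling_period_years)

for span in (2, 10, 20):
    print(f"After {span:2d} years: ~{transistor_multiplier(span):,.0f}x the transistors")
```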

As transistors shrink and density rises, new chip designs become possible that further improve efficiency and speed. CPUs can integrate more execution units of different types, each optimized for a different function. More on-chip memory reduces the need to access off-chip memory, improving access speed. CPUs also gain room for architectures that compute in parallel rather than serially. Relatedly, if higher transistor density makes each CPU smaller, a single device can accommodate multiple CPUs and run different computations simultaneously.

(2) The Slowing of Moore’s Law and the Decline of General Chips

As transistors shrink to only a few atoms across, their size is rapidly approaching an absolute lower limit, and the physical problems that arise at such scales make further shrinkage technically ever more difficult. Capital expenditures and talent costs in the semiconductor industry have grown unsustainably as a result, and new process technology nodes are being introduced more slowly than in the past. Moore’s Law is therefore slowing down: the time required for transistor density to double is lengthening.

In the era dominated by general-purpose chips, design costs could be spread across the millions of chips sold. Specialized chips improved performance on specific tasks, but they could not sell in volumes large enough to offset their high design costs, and their computational advantages were quickly erased by the next generation of CPUs. Today, with Moore’s Law slowing, CPUs are no longer improving rapidly and the economies of scale behind general-purpose chips are breaking down. At the same time, the key improvements in semiconductor capability have shifted from being manufacturing-driven to being design- and software-driven, and growing demand for artificial intelligence applications calls for the highly parallel, predictable computations at which specialized chips excel.

Together, these factors are driving chips toward specialization for artificial intelligence, and artificial intelligence chips are capturing market share from CPUs.


2. Main Characteristics of Artificial Intelligence Chips

Artificial intelligence chips are a common class of specialized chips that share several characteristics. First, compared with CPUs, they can perform many more computations in parallel. Second, they implement artificial intelligence algorithms successfully in low-precision computation modes, which reduces the number of transistors needed for the same computation. Third, they accelerate memory access, for example by storing an entire algorithm on a single chip. Fourth, they rely on specialized programming languages to translate artificial intelligence code effectively for execution on the chip. To be clear, an artificial intelligence chip is a particular type of computer chip that performs artificial intelligence computations efficiently and rapidly, at the cost of running other, general computations with lower efficiency and speed.
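To make the low-precision point concrete, here is a minimal NumPy sketch (illustrative only; the matrix sizes, data types, and error metric are assumptions, not details from the report) showing that the same matrix product computed in 16-bit floating point stays close to the 32-bit result while halving the memory per value:

```python
# Minimal sketch of low-precision computation: the same matrix multiplication
# in 32-bit and 16-bit floating point. Sizes and tolerances are illustrative.
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((512, 512)).astype(np.float32)
b = rng.standard_normal((512, 512)).astype(np.float32)

full = a @ b                                                  # 32-bit reference
low = (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float32)

# Half precision halves memory per value; on hardware with native float16
# units it also raises arithmetic throughput, at the cost of some accuracy.
print("bytes per value:", a.dtype.itemsize, "vs", np.dtype(np.float16).itemsize)
print("max relative error:", np.max(np.abs(full - low) / (np.abs(full) + 1e-6)))
```

The small error in this toy case is the trade the second characteristic above refers to: slightly less numerical precision in exchange for fewer transistors and less memory traffic per operation.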

Artificial intelligence chips fall into three types: graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs). GPUs were originally designed for image processing. Since 2012 they have increasingly been used to train artificial intelligence systems, and by 2017 this had become the dominant approach; GPUs are sometimes also used for inference. However, although GPUs offer a higher degree of parallelism than CPUs, they are still designed for general-purpose computing. The more specialized FPGAs and ASICs are more efficient: they are increasingly prominent in inference, and ASICs are also being used more often for training. FPGAs consist of many logic blocks (modules containing a set of transistors) that programmers can reconfigure after manufacturing to suit a specific algorithm, whereas ASICs contain circuitry hardwired for a specific algorithm. Cutting-edge ASICs generally offer higher efficiency than FPGAs, while FPGAs are more customizable and can be re-optimized as algorithms evolve; an ASIC, by contrast, only grows more outdated as algorithms iterate.

Machine learning is a key approach to achieving artificial intelligence and involves two stages: training and inference. Simply put, training is the stage in which a model’s optimal parameters are searched for and solved; once the parameters are found, using the deployed model is called inference. Because training and inference place different demands on chips, different artificial intelligence chips may be used for each. First, the two stages exploit different forms of data parallelism and model parallelism, and training requires additional computational steps on top of those it shares with inference. Second, training essentially always benefits from data parallelism, whereas inference may not, since it sometimes operates on only a single piece of data. Finally, the relative importance of efficiency versus speed for training and inference depends on the application.
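The asymmetry between the two stages can be sketched with a toy model (a minimal illustration, not code from the report; the linear model, batch size, and hyperparameters are all assumptions):

```python
# Toy sketch of training vs. inference with a linear model.
# Training repeatedly processes a whole batch of examples (data-parallel work
# plus extra gradient computation); inference here handles one input at a time.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((1024, 8))                     # training inputs
y = X @ rng.standard_normal(8) + 0.1 * rng.standard_normal(1024)

w = np.zeros(8)
for _ in range(200):                                   # training loop
    grad = 2 * X.T @ (X @ w - y) / len(X)              # gradient over the batch
    w -= 0.1 * grad                                    # parameter update

x_new = rng.standard_normal(8)                         # a single deployed query
print("prediction:", x_new @ w)                        # inference: one dot product
```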

The commercial viability of artificial intelligence chips depends on how broadly they can be applied. GPUs have long been widely commercialized, while FPGAs are commercialized to a lesser degree. ASICs carry high design costs and, being specialized, sell in low volumes, which makes commercialization difficult. However, the anticipated growth of the artificial intelligence chip market may create the economies of scale needed to make even narrowly applicable ASICs profitable.

Artificial intelligence chips can also be grouped by performance level. High-performance chips are typically used in data centers and are larger than other artificial intelligence chips once packaged. Medium-performance chips are those commonly found in personal computers. At the low end, mobile artificial intelligence chips are typically used for inference and are integrated into a system-on-a-chip that also contains a CPU.


3. Why Artificial Intelligence Needs Cutting-Edge Artificial Intelligence Chips

The efficiency and speed of artificial intelligence chips are typically 10 to 1,000 times those of CPUs. An artificial intelligence chip that is 1,000 times as efficient as a CPU provides an improvement equivalent to 26 years of Moore’s Law-driven CPU advances.
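The arithmetic behind that equivalence is simple: if CPU efficiency doubles every d years, a chip G times as efficient is worth d * log2(G) years of CPU progress. The sketch below uses a doubling period of about 2.6 years, which is simply the value implied by the 1,000x / 26-year pairing above rather than a figure stated in this summary:

```python
# Back-of-envelope check of the "1,000x is roughly 26 years" equivalence above.
# The ~2.6-year doubling period is inferred from that pairing, not stated.
import math

def equivalent_years(gain: float, doubling_period_years: float) -> float:
    """Years of CPU progress matched by a chip `gain` times as efficient."""
    return doubling_period_years * math.log2(gain)

print(equivalent_years(1_000, 2.6))   # ~25.9 years
print(equivalent_years(10, 2.6))      # ~8.6 years for a more modest 10x chip
```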

(1) A Cost-Benefit Analysis of Using Cutting-Edge Artificial Intelligence Chips

Leading-edge artificial intelligence systems require not just artificial intelligence chips but the most advanced ones. General-purpose chips are larger, slower, and more power-hungry, and when training artificial intelligence models their rapidly mounting electricity costs quickly become unbearable.

Comparing the costs of cutting-edge artificial intelligence chips (7 nm or 5 nm) with those of older-generation chips (90 nm or 65 nm) yields two main conclusions. First, in terms of production and operating costs, cutting-edge chips save far more money: after two years of use, an older-generation chip’s electricity costs reach three to four times the cost of the chip itself and keep growing each year, whereas a cutting-edge chip’s electricity costs only slightly exceed its purchase price. Second, it is estimated that it takes 8.8 years for the total cost of producing and operating a 5 nm chip to match that of a 7 nm chip. For usage periods shorter than 8.8 years the 7 nm chip is therefore cheaper, and for longer periods the 5 nm chip is; users have an incentive to replace existing 7 nm chips with 5 nm chips only if they expect to use the latter for at least 8.8 years.
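A hedged sketch of that break-even logic follows; the chip prices and annual electricity costs are hypothetical placeholders chosen only so that the crossover lands near the 8.8-year figure, not the report's underlying numbers:

```python
# Hypothetical total-cost-of-ownership comparison between a cheaper-to-buy but
# costlier-to-run 7 nm chip and a pricier but more efficient 5 nm chip.
# All dollar figures are placeholders, not numbers from the report.

def total_cost(chip_cost: float, power_cost_per_year: float, years: float) -> float:
    """Up-front chip cost plus cumulative electricity cost over `years`."""
    return chip_cost + power_cost_per_year * years

def cost_7nm(years: float) -> float:
    return total_cost(chip_cost=6_000, power_cost_per_year=1_500, years=years)

def cost_5nm(years: float) -> float:
    return total_cost(chip_cost=10_000, power_cost_per_year=1_045, years=years)

for t in (3, 8.8, 12):
    cheaper = "7 nm" if cost_7nm(t) < cost_5nm(t) else "5 nm"
    print(f"{t:>4} years: 7 nm ${cost_7nm(t):,.0f} vs 5 nm ${cost_5nm(t):,.0f} -> {cheaper} cheaper")
```

With these placeholder figures the crossover falls at about 8.8 years by construction; before that point the 7 nm chip's lower purchase price dominates, after it the 5 nm chip's lower electricity bill does.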

Companies generally replace server-grade chips after about three years of operation, but buyers of 5 nm chips may expect to use them for longer, so demand for new nodes is slowing in step with Moore’s Law itself. This leads to the prediction that 3 nm chips may not be released for quite some time.

(2) Chip Costs and Speed as Bottlenecks for Compute-Intensive Artificial Intelligence Algorithms

The time and money that companies spend on artificial intelligence-related computations have become bottlenecks for technological progress. Given that cutting-edge artificial intelligence chips are more cost-effective and faster than older chips or cutting-edge CPUs, artificial intelligence companies or laboratories need such chips to continue advancing intelligent technology.

First, DeepMind has developed a series of leading artificial intelligence applications (such as AlphaGo), some of which cost as much as $100 million to train. OpenAI reported total costs of $28 million in 2017, of which $8 million went to cloud computing. Running these computations on older artificial intelligence chips or on cutting-edge CPUs would multiply the computing costs by 30 or more, making such training runs and experiments economically unfeasible. With computing costs growing rapidly and likely to hit a ceiling soon, the most efficient artificial intelligence chips are a necessity.

Second, leading artificial intelligence experiments may require days or even months of training, while deployed, mission-critical artificial intelligence systems often require fast or real-time inference. Using older artificial intelligence chips or cutting-edge CPUs would greatly lengthen both, making the iteration speed needed for research and the inference speed needed in deployment unacceptably slow.

A limitation of the above analysis is that some recent breakthroughs in artificial intelligence do not require significant computational power. Additionally, researchers are developing artificial intelligence algorithms that require minimal training. For these algorithms, cost or speed may not become bottlenecks.


4. Conclusion

Cutting-edge artificial intelligence chips are a crucial engine driving the rapid development of artificial intelligence, and the United States and its allies hold several competitive advantages in the semiconductor industry. American companies lead in artificial intelligence chip design and in the electronic design automation (EDA) software used to design chips. Companies in the U.S., Taiwan, and South Korea control the vast majority of the chip fabrication plants (“fabs”) with the capacity to manufacture cutting-edge artificial intelligence chips. Companies in the U.S., the Netherlands, and Japan together control the market for the semiconductor manufacturing equipment used in those fabs. However, as China advances in the cutting-edge chip industry, these advantages may diminish.
