Large Language Models (LLMs) are just getting started.

The CEOs of OpenAI, Anthropic, and xAI share remarkably similar visions—exponential growth in artificial intelligence will transform humanity, with impacts far exceeding most people’s expectations.
This is not mere speculation; the market value of artificial intelligence is already real:
Developers using GitHub Copilot complete coding tasks roughly 55% faster with AI assistance.
GPT-4 scored in the 88th percentile on the LSAT, well above the median human test-taker at the 50th.
By 2025, OpenAI’s LLM revenue is projected to reach approximately $10 billion, while Anthropic’s is projected at $2 billion to $4 billion.
Four years ago, GPT-2 performed at roughly the level of a preschool child; today, GPT-4 resembles a smart high school student.
By around 2028, LLMs are expected to reach PhD-level capability, and by the 2030s their effective IQ is expected to surpass that of humans.
The economics of artificial intelligence are also improving continuously. The cost of running a given model is falling by 4 to 10 times annually, according to Anthropic and OpenAI, as a result of improvements in computational power and algorithms. Compounded, this means that by 2030, running today’s models will cost only one-thousandth to one-ten-thousandth of current levels.
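The compounding behind that claim can be sketched in a few lines. The decline rates (4x to 10x per year) are the article’s figures; the time horizons are illustrative assumptions, not vendor data:

```python
# Compounded cost decline: if running a given model gets N times cheaper
# each year, what fraction of today's cost remains after Y years?
# Rates (4x-10x/yr) come from the article; the horizons are illustrative.

def cost_fraction(annual_decline: float, years: int) -> float:
    """Fraction of today's cost remaining after `years` of decline."""
    return 1 / (annual_decline ** years)

print(cost_fraction(4, 5))   # 4x/yr for 5 years -> 1/1024, roughly 1/1,000
print(cost_fraction(10, 4))  # 10x/yr for 4 years -> exactly 1/10,000
```

Even the conservative end of the range compounds to a three-orders-of-magnitude cost reduction within the decade.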
Artificial intelligence will be ubiquitous, and human productivity will soar with its assistance. For more details, please visit: https://situational-awareness.ai/from-gpt-4-to-agi/
Currently, more than five companies have the capability and capital to pursue this goal, including giants like Amazon, Google, and Microsoft. Startups like OpenAI and Anthropic carry valuations around $100 billion today, and if they succeed, those valuations could reach $1 trillion. The winners of the LLM race will become the first companies to surpass a $10 trillion market cap.
Their success will put immense pressure on growth and capacity in the semiconductor, packaging, data center, cooling, and power sectors. By 2030, semiconductor revenue will primarily come from AI/High-Performance Computing (AI/HPC).

GPU vs ASIC? The answer is clear: hyperscale enterprises need more options.
Currently, over 90% of AI accelerators in data centers use NVIDIA GPUs, some use AMD GPUs, and the rest use custom ASICs (primarily from Amazon).
NVIDIA is the only vendor providing a complete solution, covering GPUs, NVLink networking, racks, systems, and software. Competing with NVIDIA on its home turf will be very challenging. NVIDIA’s annual revenue is $160 billion.
NVIDIA has three or four customers purchasing over 10% of its output, each spending nearly $20 billion annually.
However, AMD’s GPU roadmap is catching up to NVIDIA’s. Its MI350 will match the Blackwell GPU in the second half of 2025, and the MI400 will line up against Rubin, Blackwell’s anticipated successor. AMD is also closing the gap in software and interconnect/system design, aiming for $10 billion in annual revenue by 2026.
Even if AMD never surpasses NVIDIA, the major hyperscalers are expected to send it business: they want a strong alternative that gives them pricing leverage and lets them scale their data centers faster when NVIDIA’s supply is constrained.
What about ASICs for AI accelerators? Just a few years ago, ASICs were viewed negatively by investors—low margins, low growth. Now they are booming because hyperscale computing firms want more choices.
Amazon, Google, Meta, and OpenAI are all developing their own AI accelerators, and merchant vendors are benefiting as well. Broadcom’s AI revenue has grown roughly tenfold in three years and now accounts for half of its total sales; Marvell’s AI revenue has surged over the same period, and AI is now its largest business segment.
At the Morgan Stanley Technology Conference in early March, OpenAI CEO Sam Altman said that if some of a GPU’s flexibility is sacrificed, an ASIC built for a specific model can be very efficient. Recall that network switching once ran on x86 processors; it now runs on dedicated switch chips, because the packet formats being processed change only slowly.
The market is shifting from training-centric to inference-centric workloads, and an ASIC designed solely for inference can be simpler; the key metrics are cost and power consumption. An inference ASIC for Transformer models, for instance, can be simpler and cheaper than a general-purpose GPU. Alchip’s CEO has said that ASICs can be 40% more cost-effective than GPUs and can be optimized for a specific customer’s software.

Today’s AI accelerators typically pair a 3nm or 2nm compute engine with standalone SRAM and PHY chiplets on “older,” cheaper nodes such as 5nm. Alchip’s CEO puts the NRE cost of an AI accelerator at up to $50 million. Broadcom and Marvell may be developing more complex accelerators, with more chiplets and 3D packaging, at development costs exceeding $100 million. A hyperscaler also needs an architecture team of 100-plus engineers, another 100-plus for networking, and even more for software development, bringing the total cost to between $300 million and $500 million. Can they afford it?
If a hyperscaler spending $20 billion a year gains even a 10% discount by having an alternative to NVIDIA, that is $2 billion in annual savings, more than enough to fund its own ASIC program. And if it can build an ASIC that costs half as much as NVIDIA’s GPUs while consuming less power, that would be a tremendous success; Alchip’s CEO says roughly 40% cheaper than GPUs is achievable.
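The payback arithmetic can be made explicit. Every figure below is the article’s illustrative number (the $20 billion spend, 10% discount, and $300–500 million program cost), not an actual budget:

```python
# Back-of-envelope payback for a hyperscaler's in-house ASIC program,
# using the article's illustrative figures (not real budgets).

annual_gpu_spend = 20e9        # yearly accelerator spend
discount = 0.10                # price leverage from having an alternative
dev_cost_low, dev_cost_high = 300e6, 500e6  # full ASIC program cost range

annual_savings = annual_gpu_spend * discount
print(f"Annual savings: ${annual_savings / 1e9:.0f}B")
print(f"Payback period: {dev_cost_low / annual_savings:.2f}"
      f" to {dev_cost_high / annual_savings:.2f} years")
```

On these assumptions the entire program pays for itself in a fraction of a year, which is why even the pricing-leverage argument alone can justify the investment.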
Hyperscale enterprises may deploy NVIDIA and AMD GPUs for the most complex and rapidly changing workloads and external customers, while using their own ASICs for internal workloads that change more slowly. The final combination of GPUs and ASICs will depend on relative performance, power consumption, and availability. It could be 90% GPUs and 10% ASICs. Or, as McKinsey predicts, it could be 10% GPUs and 90% ASICs. “Small customers” purchasing only $100 million annually will have to use GPUs.