Source: translated from semiengineering.
The CEOs of OpenAI, Anthropic, and xAI share a remarkably similar vision: progress in artificial intelligence is exponential, it will transform humanity, and its impact will exceed most people's expectations.
This is not just speculation; the market for artificial intelligence, and its value, are already real:
1. Human developers using GitHub Copilot can code 55% faster with AI assistance.
2. GPT-4 scores in the 88th percentile on the LSAT, whereas the average human test-taker is, by definition, at the 50th.
3. Personally, I am using ChatGPT for Spanish conversation practice and grammar exercises.
In 2025, OpenAI's revenue from large models is expected to reach approximately $10 billion, while Anthropic's large-model revenue is expected to land between $2 billion and $4 billion.
Four years ago, GPT-2 exhibited the intelligence of a preschool child. GPT-4 resembles a smart high school student.
By around 2028, large language models (LLMs) are expected to deliver doctoral-level intelligence, and by the 2030s LLM IQ is expected to surpass human IQ.
The economics of artificial intelligence also keep improving. The cost of a given model is falling 4x to 10x per year (according to Anthropic and OpenAI), a combined effect of advances in compute hardware and in algorithms. At that rate, by 2030 the cost of running today's models will drop to between one-thousandth and one-ten-thousandth of what it is now.
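As a rough sanity check on that range, here is a minimal sketch of the compounding arithmetic; the 4x and 10x annual decline rates are the figures quoted above, while the five-year horizon (2025 to 2030) is an assumption for illustration.

```python
# Compound an annual cost-decline factor over a five-year horizon.
# The 4x and 10x rates are from the article; the horizon is assumed.
YEARS = 5
for annual_decline in (4, 10):
    overall = annual_decline ** YEARS
    print(f"{annual_decline}x/year over {YEARS} years -> 1/{overall:,} of today's cost")
# 4x/year  -> 1/1,024    (roughly one-thousandth)
# 10x/year -> 1/100,000  (beyond one-ten-thousandth)
```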
More than five companies have the capability and the capital to pursue this, including giants like Amazon, Google, and Microsoft. Startups like OpenAI and Anthropic, currently valued at around $100 billion, could reach $1 trillion valuations if they meet their goals. The winner of the LLM race could become the first company to reach a $10 trillion market cap.
Their success will put immense pressure on growth and capacity in the semiconductor, packaging, data center, cooling, and power sectors. By 2030, semiconductor revenue will primarily come from AI/high-performance computing.
GPU vs ASIC? Yes: Hyperscale companies need more options
Today, over 90% of AI accelerators in data centers are NVIDIA GPUs, with some AMD GPUs, and the rest are custom ASICs (mainly from Amazon).
NVIDIA is the only vendor offering a complete solution covering GPUs, NVLink networking, racks, systems, and software. Competing with NVIDIA in this field, let alone beating it, is very challenging: the company's annual revenue runs as high as $160 billion.
NVIDIA has 3 or 4 customers that purchase over 10% of its output, with each customer buying nearly $20 billion annually.
However, AMD's GPU roadmap is catching up to NVIDIA's. Its MI350, due in the second half of 2025, will match the Blackwell architecture, and its MI400 will line up against NVIDIA's anticipated Rubin architecture (Blackwell's successor). AMD is also closing the gap in software and in interconnect/system design, and is targeting $10 billion in annual revenue by 2026.
Even if AMD is not as strong as NVIDIA, the large hyperscalers are expected to send business its way. They want a credible alternative to NVIDIA, one that gives them some pricing leverage and lets them keep scaling their data centers when NVIDIA's supply is constrained.
What about ASICs as AI accelerators? Just a few years ago, investors viewed ASICs negatively as low-margin, low-growth businesses. Now they are in demand, because hyperscale companies want more options.
Amazon, Google, Meta, and OpenAI are all developing their own AI accelerators, and other companies are positioning themselves as well. Broadcom's AI revenue, for example, has grown roughly 10x in three years and now accounts for about half of its total sales. Marvell's AI revenue has surged similarly over the same period, and AI is now its largest business segment.
At the Morgan Stanley Technology Conference in early March, OpenAI CEO Sam Altman said that if some flexibility is sacrificed, an ASIC built for specific models can be very efficient. Remember that networks once ran on x86 processors; today they all run on switch chips, because the types of packets to be processed change only slowly.
The market is shifting from training-centric to inference-centric, and an ASIC dedicated solely to inference can be much simpler. The keys are cost and power consumption. An inference ASIC limited to, say, Transformer models can be simpler and cheaper; Alchip's CEO says such ASICs can be 40% more cost-effective than GPUs and can be optimized for a specific customer's software.
Today's AI accelerators typically feature a 3nm or 2nm compute engine, possibly paired with standalone 5nm SRAM and PHY chiplets on older, cheaper nodes. Alchip's CEO says the NRE for an AI accelerator can reach $50 million. Broadcom and Marvell may be developing more complex accelerators, with more chiplets and 3D packaging, at development costs exceeding $100 million. A hyperscaler will also staff an architecture team of more than 100 people, another 100-plus for network connectivity, and even more for software. That puts the total cost at $300 million to $500 million.
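A back-of-the-envelope version of that total, as a sketch: the NRE and the headcounts are the article's figures, while the fully loaded cost per engineer-year and the project length are assumptions.

```python
# Back-of-the-envelope check on the $300M-$500M development total.
# NRE and headcounts come from the article; the fully loaded cost per
# engineer-year and the project duration are illustrative assumptions.
nre = 50e6                  # silicon NRE (up to ~$50M; $100M+ with chiplets/3D packaging)
engineers = 300             # ~100 architecture + ~100 networking + 100+ software
cost_per_eng_year = 0.5e6   # assumed fully loaded cost per engineer-year
years = 2.5                 # assumed development timeline
total = nre + engineers * cost_per_eng_year * years
print(f"~${total / 1e6:.0f}M")  # ~$425M, inside the quoted range
```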
Can they afford it? A hyperscaler buying $20 billion of GPUs a year can afford to develop its own ASIC simply to have another option; that leverage alone might be worth a 10% discount from NVIDIA. And if its ASIC succeeds, costing only half as much as an NVIDIA GPU while consuming less power, the win is enormous. (Alchip's CEO puts the realistic figure at about 40% cheaper than a GPU.)
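Under those numbers the payback math is short. A minimal sketch, treating the $20 billion annual spend, the 10% discount, and the half-cost ASIC as steady values, and optimistically letting the ASIC displace the entire GPU purchase:

```python
# Payback sketch for a hyperscaler's custom ASIC, using the article's figures.
# Assumes the spend, discount, and cost ratio hold steady year over year and
# (optimistically) that the ASIC could displace the entire GPU purchase.
annual_gpu_spend = 20e9   # ~$20B/year bought from NVIDIA
dev_cost = 500e6          # high end of the $300M-$500M development total
discount_savings = 0.10 * annual_gpu_spend   # leverage alone: ~$2B/year
half_cost_savings = 0.50 * annual_gpu_spend  # half-cost ASIC: ~$10B/year
print(f"10% discount alone repays development in {dev_cost / discount_savings:.2f} years")
print(f"a half-cost ASIC repays it in {dev_cost / half_cost_savings:.2f} years")
```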
Hyperscale data center operators may deploy NVIDIA and AMD GPUs for the most complex, fastest-changing workloads and for external customers, while running their own ASICs on internal workloads that change more slowly. The eventual mix of GPUs and ASICs will depend on relative performance, power consumption, and availability; it could be 90% GPU and 10% ASIC or, as McKinsey predicts, 10% GPU and 90% ASIC. Smaller customers spending only $1 billion a year will have to stick with GPUs.
Reference link: https://semiengineering.com/gpu-or-asic-for-llm-scale-up/