The AI Chip War: How Long Can NVIDIA's Moat Last?

Let’s start with the first article, focusing on NVIDIA, the highest-valued company in the world.

Starting at the end of 2024, NVIDIA’s latest GB200 chip will cost over $70,000 per unit. This is just the price of the chip itself—if you want to buy a complete GB200 system with eight chips and interconnect devices, the total price approaches $3 million.

An AI server costs more than a Ferrari.

Crazier still, even if you are willing to pay this price, you may not be able to get one. Tech giants like Microsoft, Google, Meta, and Amazon have already queued up to place orders, with Microsoft alone ordering between 700,000 and 1.4 million Blackwell systems. Due to production capacity issues, the GB200’s mass shipment has been delayed from the originally planned Q3 or Q4 of 2024 to early 2025.

This is the true picture of the current AI chip market: NVIDIA is the sole leader, prices are skyrocketing, and supply cannot meet demand. NVIDIA’s data center revenue is expected to reach $110 billion in 2024, more than doubling from $42 billion in 2023. This figure is projected to rise to $173 billion in 2025.

But the question is, how long can this monopoly last?

NVIDIA’s Absolute Dominance

Let’s look at some numbers.

NVIDIA holds over 90% of the data center GPU market share. It is expected to sell 7 million GPUs in 2025, including 5 million from the Blackwell series and 2 million from the Hopper series. What does this mean? Nine out of ten computing resources used for training large models and running AI inference globally are powered by NVIDIA chips.

The performance of the GB200 is indeed astonishing. In benchmark tests for GPT-3 (with 175 billion parameters), the GB200’s performance is seven times that of the previous generation H100, and its training speed is four times faster. The actual performance improvement from upgrading from H200 to GB200 is 18 times. This is not just a minor upgrade; it is a true generational leap.

Customer concentration is also remarkable. Four tech giants (Microsoft, Meta, Google, and Amazon) account for a significant portion of NVIDIA’s data center revenue. Microsoft has ordered between 700,000 and 1.4 million Blackwell systems; Google and Amazon have ordered 400,000 and 360,000 systems, respectively. These orders amount to hundreds of billions of dollars.

NVIDIA’s data center revenue is expected to reach $110 billion in 2024 and $173 billion in 2025. To put this in perspective, this figure was only $42 billion in 2023. In just two years, it has quadrupled.

The stock price has also skyrocketed. At the beginning of 2023, NVIDIA’s stock price was $36, and by October 2025, it had risen to around $181, roughly five times its starting level. Its market capitalization has surpassed $4.4 trillion, firmly securing its position as the highest-valued company globally.

This growth is driven by the global demand for AI computing power. The combined capital expenditure of Microsoft, Google, Meta, and Amazon is expected to rise from $147.2 billion in 2023 to $228.3 billion in 2024, an increase of 55%. Most of this money is spent on purchasing GPUs and building data centers. It is projected that by 2029, global data center spending will grow from $430 billion in 2024 to $1.1 trillion.
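
The growth figures quoted in this section can be checked with a few lines of arithmetic. The sketch below uses only numbers from the text; the helper names are illustrative, and the compound annual rate assumes smooth growth over the five years from 2024 to 2029, which real spending is unlikely to follow exactly.

```python
def growth_multiple(start: float, end: float) -> float:
    """Return how many times larger `end` is than `start`."""
    return end / start


def percent_increase(start: float, end: float) -> float:
    """Return the percentage increase from `start` to `end`."""
    return (end / start - 1.0) * 100.0


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate in percent, assuming smooth growth."""
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0


# Figures quoted in the article, in billions of US dollars.
revenue_multiple = growth_multiple(42.0, 173.0)   # data center revenue, 2023 -> 2025
capex_increase = percent_increase(147.2, 228.3)   # big-four capex, 2023 -> 2024
spending_cagr = cagr(430.0, 1100.0, years=5)      # data center spending, 2024 -> 2029

print(f"Revenue multiple 2023-2025: {revenue_multiple:.2f}x")
print(f"Capex increase 2023-2024: {capex_increase:.1f}%")
print(f"Implied spending CAGR 2024-2029: {spending_cagr:.1f}%")
```

The revenue multiple comes out a little above 4x and the capex increase at about 55%, matching the text. The 2029 spending projection implies growth of roughly 21% a year.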

NVIDIA stands at the center of this wave.

CUDA: The Invisible Moat

But NVIDIA’s true moat is not its hardware; it is CUDA.

CUDA is the parallel computing platform and programming model developed by NVIDIA, and it has been around for nearly 20 years since its launch in 2006. Almost all major AI frameworks, including PyTorch, TensorFlow, and JAX, treat CUDA as their primary GPU backend. CUDA is the first toolchain that most AI engineers and researchers learn.

What does this mean? It means that if you want to switch chips, it is not as simple as just changing hardware; you have to rewrite code, retune, retest, and retrain your team. The cost is prohibitively high.

For example, suppose you are a large model company currently using NVIDIA’s H100, which has been running stably for six months, and your team is very familiar with CUDA. Now AMD says, “My MI300X has better performance and is 20% cheaper; would you like to switch?”

It sounds tempting, right? But the problem is, after switching, your entire training pipeline needs to be adapted to AMD’s ROCm platform, and all the CUDA optimization experience you accumulated previously is lost. Your team has to relearn new tools, and the model needs to be retuned. This process could take months, during which training efficiency will decline, and project timelines will be delayed.

In the end, the 20% price advantage is not enough to cover the migration costs.
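
The trade-off in this example can be written down as a simple break-even calculation. The sketch below is illustrative only: apart from the 20% discount, every number (chip price, team size, duration, cost of delay) is a made-up assumption, and the `MigrationScenario` class and its helpers are hypothetical names, not part of any vendor tool.

```python
from dataclasses import dataclass


@dataclass
class MigrationScenario:
    """Hypothetical inputs for a switch away from an incumbent accelerator."""
    chips: int                      # accelerators to be purchased
    incumbent_price: float          # USD per incumbent chip (assumed)
    discount: float                 # challenger discount, 0.20 means 20% cheaper
    engineers: int                  # engineers occupied by the port (assumed)
    months: float                   # length of the migration (assumed)
    cost_per_engineer_month: float  # fully loaded USD cost (assumed)
    delay_cost_per_month: float     # USD value of lost training progress (assumed)


def hardware_savings(s: MigrationScenario) -> float:
    """Money saved on the purchase thanks to the cheaper chip."""
    return s.chips * s.incumbent_price * s.discount


def migration_cost(s: MigrationScenario) -> float:
    """Engineering time plus the cost of delayed projects."""
    engineering = s.engineers * s.months * s.cost_per_engineer_month
    delay = s.months * s.delay_cost_per_month
    return engineering + delay


def break_even_chips(s: MigrationScenario) -> float:
    """Fleet size at which the hardware savings equal the migration cost."""
    return migration_cost(s) / (s.incumbent_price * s.discount)


scenario = MigrationScenario(
    chips=1_000,
    incumbent_price=30_000.0,
    discount=0.20,
    engineers=20,
    months=6,
    cost_per_engineer_month=25_000.0,
    delay_cost_per_month=1_000_000.0,
)

net = hardware_savings(scenario) - migration_cost(scenario)
print(f"Hardware savings: ${hardware_savings(scenario):,.0f}")
print(f"Migration cost:   ${migration_cost(scenario):,.0f}")
print(f"Net benefit:      ${net:,.0f}")
print(f"Break-even fleet: {break_even_chips(scenario):,.0f} chips")
```

With these made-up inputs, the discount saves $6 million against a $9 million migration bill, so the switch loses money, and the break-even fleet is about 1,500 chips. The same arithmetic suggests the balance can shift for buyers whose fleets are far larger.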

This is the power of CUDA. It is not a technical barrier; it is an ecosystem barrier. NVIDIA has spent nearly 20 years turning CUDA into the “de facto standard” in the AI industry. There are over 4 million CUDA developers globally, with thousands of applications and libraries based on CUDA. This moat is more effective than any patent.

More importantly, NVIDIA continues to strengthen this moat. The CUDA 12 release series, which began in December 2022, has kept adding optimizations for large model training and inference through its 2024 updates. NVIDIA invests over $3 billion annually in the CUDA ecosystem, including developer tools, technical support, and education and training.

This ecosystem advantage is the biggest headache for AMD, Intel, and Huawei. Hardware can catch up, but ecosystems take time to build, and time is precisely what they lack.

Challenges from Competitors

However, this does not mean that no one is challenging NVIDIA.

AMD is the most promising contender. Its Instinct MI300 series, especially the MI325X, boasts impressive specifications: 256GB HBM3E memory, 6TB/s bandwidth, and AI performance up to 1.3 times that of NVIDIA’s H200, with HPC performance up to 2.4 times. AMD plans to launch the MI350 series based on the CDNA 4 architecture in 2025, which AMD says will deliver up to 35 times the AI inference performance of the MI300 series.

AMD has also secured some major clients. Microsoft Azure plans to deploy over 200,000 MI300X chips, and Meta has also placed orders. This was previously unimaginable—major clients have always used NVIDIA.

But AMD’s problems are also evident: its software ecosystem is too weak. Although the ROCm platform is continuously improving, it still lags far behind CUDA. Many developers complain that ROCm’s documentation is not detailed enough, the toolchain is not stable, and it is difficult to find solutions when problems arise.

Intel is in a worse situation. Its Gaudi 3 AI accelerator has been slow to sell since its release in April 2024, leading to inventory buildup. To boost sales, Intel even designed a hybrid AI rack solution combining Gaudi 3 and NVIDIA’s B200, in which the B200 handles the prefill stage of inference and Gaudi 3 takes on the decode stage.

This solution is creative but also reflects desperation. It essentially admits that it cannot compete with NVIDIA alone and must find a “partner” to run alongside.

Intel’s problem is not with the hardware. The Gaudi 3 is based on TSMC’s 5nm process, equipped with 128GB HBM2E memory and 3.7TB/s bandwidth, and its performance is actually quite good. However, it falls far behind NVIDIA in software ecosystem and customer support. Customers are hesitant to use it, and developers are unwilling to learn it, creating a vicious cycle.

In terms of market share, AMD and Intel combined hold only about 10%. To shake NVIDIA’s 90% share, they have a long way to go.

Chinese Chips: Surviving in the Cracks

The situation for Chinese AI chips is even more challenging.

First, let’s talk about Huawei. The Ascend 910C is Huawei’s most advanced AI chip, expected to enter mass production in 2025. However, according to multiple supply chain sources, the mass production of the chip faces challenges in yield.

Yield is a core indicator in semiconductor manufacturing—out of every 100 chips produced, how many can pass quality control and be used normally. The industry generally believes that to achieve large-scale commercial use of AI chips, the yield needs to exceed 70%; otherwise, production costs will be prohibitively high.
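
To see why yield drives cost so strongly, consider a minimal sketch of the cost per usable chip: the wafer cost is spread only over the dies that pass quality control. The wafer cost and die count below are hypothetical placeholders, not figures for any real process or for Huawei’s chips. Only the 70% threshold comes from the text.

```python
def cost_per_good_die(wafer_cost: float, dies_per_wafer: int, yield_rate: float) -> float:
    """Spread the cost of one wafer over the dies that pass quality control."""
    if not 0.0 < yield_rate <= 1.0:
        raise ValueError("yield_rate must be in (0, 1]")
    good_dies = dies_per_wafer * yield_rate
    return wafer_cost / good_dies


WAFER_COST = 10_000.0   # USD per wafer, assumed for illustration
DIES_PER_WAFER = 60     # candidate dies per wafer, assumed for illustration

for y in (0.9, 0.7, 0.5, 0.3):
    cost = cost_per_good_die(WAFER_COST, DIES_PER_WAFER, y)
    print(f"yield {y:.0%}: ${cost:,.0f} per good die")

# The ratio between two yields does not depend on the assumed wafer cost.
penalty = cost_per_good_die(WAFER_COST, DIES_PER_WAFER, 0.3) / cost_per_good_die(
    WAFER_COST, DIES_PER_WAFER, 0.7
)
print(f"Cost penalty at 30% yield versus 70% yield: {penalty:.2f}x")
```

Because cost per good die scales with the inverse of yield, a chip built at 30% yield costs about 2.3 times as much as the same chip built at 70% yield, whatever the wafer price.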

The previous generation, the 910B, had similar problems. Supply chain insiders revealed that the yield of the 910B has long hovered at a low level, forcing Huawei to adjust its production pace. ByteDance placed a large order with Huawei in 2024, but actual deliveries fell far short of expectations.

By mid-2025, Huawei has set ambitious production targets for both the 910C and 910B, but whether it can deliver on time remains uncertain. Many customers have reported long lead times, and the supply tightness is unlikely to ease in the short term.

The root cause of the low yield is the lack of advanced lithography equipment. SMIC produces the 910C using the “N+2” process, which barely works, but without ASML’s EUV lithography machines, the yield cannot improve. This is a fundamental issue caused by technology supply chain disruptions, which cannot be resolved in the short term.

Other domestic AI chip manufacturers are also struggling but are making efforts to break through.

In the first half of 2025, Cambricon’s revenue reached 2.881 billion yuan, a year-on-year increase of 4347.82%, with a net profit of 1.038 billion yuan, finally turning a profit. Hygon’s revenue reached 5.464 billion yuan, a year-on-year increase of 45.21%, with a net profit of 1.201 billion yuan.

The numbers look good, but it is important to note that this was achieved under domestic market policy support and the demand for localization. In the global market, their share is still very small. According to Bernstein Research, in the Chinese AI accelerator market in 2024, NVIDIA holds 66%, Huawei HiSilicon about 23%, AMD about 5%, and domestic manufacturers like Cambricon and MetaX combined hold only about 5-6%.

In terms of technology, domestic manufacturers are also trying to compete through differentiation. Huawei HiSilicon and Cambricon are taking the custom ASIC route, optimizing for specific AI inference scenarios; Hygon, MetaX, Biren Technology, and Moore Threads are pursuing the general-purpose GPU route, directly competing with NVIDIA. There are also companies like Houmo.AI that are betting on compute-in-memory technology, attempting to leapfrog competitors by solving the “memory wall” and “power wall” issues.

However, regardless of the route taken, the core issues remain the same: limitations in advanced processes, weak software ecosystems, and insufficient customer trust. These three mountains are difficult to overcome in the short term.

Custom Chips: Another Track

But for NVIDIA, the real threat may not come from these direct challengers, but from the custom ASIC chips developed by tech giants.

Google’s TPU has now reached its sixth generation, Amazon has the Trainium and Inferentia series, Microsoft has Maia, and Meta is also developing its own AI chips. These chips are specifically optimized for their respective AI workloads, outperforming general-purpose GPUs in terms of performance and energy efficiency in specific scenarios. Google has revealed that the cost of TPU is only one-fifth that of NVIDIA GPUs.

More critically, these giants have massive purchasing volumes, allowing them to spread the R&D costs of custom chips. Google and Meta purchase hundreds of thousands of GPUs each year, and if self-developed chips can save 20-30% in costs, that translates to billions of dollars in savings.
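
The scale argument can be made concrete with a rough calculation. In the sketch below, only the 20-30% savings range comes from the text; the per-GPU price, the annual cost of an in-house chip program, and the two buyer volumes are assumptions chosen for illustration.

```python
def net_annual_savings(
    gpus_per_year: int,
    gpu_price: float,
    saving_fraction: float,
    asic_program_cost: float,
) -> float:
    """Gross savings from replacing GPUs with in-house ASICs, minus the
    yearly cost of running the chip program. All inputs are assumptions."""
    gross = gpus_per_year * gpu_price * saving_fraction
    return gross - asic_program_cost


GPU_PRICE = 30_000.0            # USD per GPU, assumed
PROGRAM_COST = 1_000_000_000.0  # USD per year for design and tape-out, assumed

buyers = {"hyperscaler": 300_000, "mid-size lab": 10_000}  # GPUs per year, assumed
for name, volume in buyers.items():
    for fraction in (0.20, 0.30):
        net = net_annual_savings(volume, GPU_PRICE, fraction, PROGRAM_COST)
        print(f"{name:>12} at {fraction:.0%} savings: net ${net / 1e9:+.2f}B per year")
```

Under these assumptions, a buyer of 300,000 GPUs a year clears the program cost and nets somewhere between $0.8 billion and $1.7 billion, while a buyer of 10,000 GPUs loses money at either savings rate. This is why custom silicon is mostly a game for the largest purchasers.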

Behind these giants is Broadcom—the invisible giant in the AI ASIC market, holding about 70% market share. Broadcom provides customized ASIC design services for clients like Google, Meta, ByteDance, and OpenAI. In October 2025, OpenAI and Broadcom announced a collaboration worth over $10 billion, one of the largest single orders in AI chip history.

The custom ASIC market is growing rapidly. Morgan Stanley predicts that the AI ASIC market size will grow from $12 billion in 2024 to $30 billion by 2027. This poses a real threat to NVIDIA—if major clients shift to self-developed chips, NVIDIA’s growth ceiling will be reached quickly.

(For a detailed analysis of Broadcom and custom chips, see the upcoming series of articles “Broadcom’s Counterattack: How Custom Chips Challenge NVIDIA.”)

How Long Can the Moat Last?

Returning to the initial question: How long can NVIDIA’s moat last?

The answer must be viewed from short-term, medium-term, and long-term perspectives.

In the short term (1-2 years), NVIDIA’s position is difficult to shake. The CUDA ecosystem is too entrenched; the habits of 4 million developers cannot be changed overnight. The cost of customer migration is too high, and the performance advantages are too obvious. In 2025, NVIDIA will continue to thrive, with data center revenue likely exceeding $170 billion.

The depth of this moat exceeds many people’s expectations. It is not just about creating a chip with better performance; customers will not switch just because of that. Software ecosystems, toolchains, technical support, and customer trust—these invisible barriers are harder to overcome than the hardware itself.

In the medium term (3-5 years), uncertainties will increase. AMD is catching up, albeit slowly, but it is indeed making progress. Chinese chips will slowly gain market share under the dual push of policy and market demand. Most importantly, the self-developed ASICs of tech giants are encroaching on NVIDIA’s territory, especially in the inference market.

Inference is a key battlefield. Currently, the demand for AI computing power is roughly split 70% training and 30% inference, but by 2026, the demand for inference computing power is expected to exceed training, potentially reaching over 70%. Inference chips do not require the same high performance as training chips, but they have stricter requirements for cost and energy efficiency. This is the natural advantage of ASICs and the area where NVIDIA is most likely to lose ground.

In the long term (5-10 years), the AI chip market will diversify. General-purpose GPUs, dedicated ASICs, and compute-in-memory chips will coexist. NVIDIA will continue to dominate the training market, but it will face increasingly fierce competition in the inference market. Its market share may drop from the current 90% to 60-70%, still a leader but no longer “the only one.”

The entire data center AI chip market is rapidly expanding. AMD predicts that this market will grow from about $150 billion in 2024 to $400 billion by 2027. The pie is large enough to accommodate multiple players. However, it is important to understand that challenging NVIDIA is not about a single breakthrough but a systematic project.

Hardware, software, ecosystem, and customer relationships are all essential. AMD has been chasing for over a decade, Intel has struggled with several generations of products without making headway, and Huawei is hindered by technology supply chain issues. The players that are truly capable of changing the landscape are those with sufficient scale, clear needs, and long-term patience—Google, Microsoft, Meta, Amazon, and Broadcom behind them.

NVIDIA’s moat is still deep, but it is not insurmountable. The war has just begun.
