In 2025, the AI wave continues to sweep the globe. Computing power has become the oil of the new era, and the competition for it is rewriting the entire semiconductor landscape. NVIDIA, with its GPUs, has almost monopolized the AI training field, holding over 90% of the market, and with a market value exceeding $4.5 trillion it has become the new leader of the semiconductor industry. However, NVIDIA’s position is not unassailable; companies like AMD, Broadcom, and Intel are looking for an opening to take share from NVIDIA, creating a new competitive landscape in the AI chip industry. NVIDIA has built a strong moat with its software and ecosystem, nearly monopolizing the upstream system of AI training; meanwhile, other chip giants and cloud vendors are quietly seeking new breakthroughs. ASIC and Arm seem to be their targets.
Intel
In recent years, Intel has not been faring well.
For many years, this “blue giant” has struggled to keep up with competitors like TSMC in chip manufacturing, and its product line lacks competitiveness in the AI market. NVIDIA’s AI chips are selling like hotcakes and AMD has its own AI chip lineup, while Intel’s next major AI product, Jaguar Shores, won’t debut until 2027, exposing how far it lags in AI.
In a difficult situation, Intel has chosen to take a differentiated path.
According to reports, Intel recently established a Central Engineering Group (CEG), consolidating the company’s engineering talent into one department led by former Cadence Design Systems executive Srini Iyengar. Iyengar, who joined from Cadence in July 2025, has extensive experience in building custom chip businesses, with a focus on IP, design tools, design ecosystem partnerships, and vertical markets for custom chips. His experience and market connections are expected to help Intel capitalize on the “ASIC boom” more quickly.
Intel CEO Lip-Bu Tan stated during the third-quarter earnings call that the CEG will lead the expansion of new ASIC and design services, providing purpose-built chips for a wide range of external customers. “This will not only expand the coverage of our core x86 IP but also leverage our design advantages to offer a range of solutions from general-purpose to fixed-function computing.” This statement reveals Intel’s strategic ambition: to transform from a pure chip manufacturer into a one-stop service provider offering “design + manufacturing + packaging”.
Intel’s biggest competitive advantage in the ASIC field lies in its complete industrial chain. As a long-established IDM company, Intel possesses chip expertise, x86 IP, and an in-house foundry, so customers seeking custom AI chips can get design, manufacturing, and packaging from a single supplier. No other ASIC design company in the market can offer this; even Broadcom and Marvell find it hard to match. More importantly, with the CEG, Intel has centralized its engineering horizontally, significantly reducing the overhead of connecting design services with manufacturing and packaging.
Reports indicate that Intel’s ASIC business could turn its foundry into a successful service provider, making it an attractive choice for large tech companies. There are many opportunities in the AI supply chain, such as revenue from mass production manufacturing profits and even ASIC design fees. If executed properly, the custom chip business could become Intel’s next mainstay, as it would grant Intel the status of a system foundry responsible for every link in the supply chain.
However, the challenges Intel faces are equally enormous. NVIDIA recently announced a $5 billion investment to acquire about 4% of Intel’s shares, and the two companies will jointly develop “multi-generation custom data center and PC products.” This collaboration brings opportunities for Intel but also introduces a complex competitive relationship. The data center chips will be x86 chips customized by Intel to NVIDIA’s specifications, and NVIDIA will “integrate these CPUs into its AI infrastructure platform and bring them to market.” In the consumer market, Intel plans to create x86 SoCs that integrate Intel CPUs with NVIDIA RTX GPU chiplets, which means Intel may use NVIDIA-designed graphics in future products instead of its own Arc GPUs.
This raises a series of unresolved questions. Intel has been developing its own graphics products for decades, and its recently launched Arc brand dedicated graphics cards and integrated GPUs pose a direct challenge to some of NVIDIA’s lower-end products. Intel told the media that the company “will continue to provide GPU products,” but this may mean that Intel will focus on low-end, low-power GPUs while leaving high-end products to NVIDIA. In terms of software, Intel has been promoting its own oneAPI graphics computing stack as an alternative to NVIDIA’s CUDA and AMD’s ROCm, but the future of this platform is also fraught with uncertainty.
A more critical issue is manufacturing; the probability that NVIDIA will use Intel’s 18A process or other processes on Intel’s roadmap to produce some chips is actually quite low. Intel has been struggling to find major customers, but Jensen Huang praised TSMC when answering related questions, stating, “TSMC’s capabilities, from process technology, execution pace, capacity, and infrastructure scale to business operation agility… all this magic comes together to create a world-class foundry capable of supporting such diverse customer needs. TSMC’s magic is truly indescribable.” This suggests that NVIDIA is unlikely to turn to Intel’s foundry on a large scale in the short term.
For Intel, shifting to ASIC design services is a necessary choice to find new growth curves in a difficult situation. Having missed the opportunity in the AI hype, Intel hopes to find its place in the AI chip market by providing complete design and manufacturing services. However, this is no easy task, especially in a fiercely competitive AI market and with ASIC design companies like Broadcom continuously evolving. Whether Intel can seize this opportunity will determine whether this once-dominant chip giant can rise again in the AI era.
Qualcomm
In contrast to Intel’s somewhat helpless situation, Qualcomm’s choice is quite aggressive.
This company, which has so far focused on wireless connectivity and mobile device semiconductors, is making a big push into the large data center market, directly challenging NVIDIA and AMD’s positions in AI inference. Recently, Qualcomm announced the release of new AI accelerator chips AI200 and AI250, and after the news broke, Qualcomm’s stock soared 11%, indicating high market recognition of this transformation.
According to reports, the AI200, which Qualcomm plans to launch in 2026, and the AI250, scheduled for 2027, can be integrated into systems filled with liquid-cooled server racks, marking Qualcomm’s entry into the data center field as a new competitor in the fastest-growing market in the tech sector. According to McKinsey, capital expenditures for data centers are expected to approach $6.7 trillion by 2030, with most of that going towards AI chip-based systems.
Reports indicate that Qualcomm’s data center chips are based on the AI components in Qualcomm’s smartphone chips, known as the Hexagon Neural Processing Unit (NPU). The company has been steadily improving the Hexagon NPU in recent years, so the latest versions are equipped with scalar, vector, and tensor accelerators (in a 12+8+1 configuration) and support data formats such as INT2, INT4, INT8, INT16, FP8, and FP16, as well as micro-tile inferencing to reduce memory traffic, 64-bit memory addressing, virtualization, and Gen AI model encryption for additional security. For Qualcomm, extending Hexagon to data center workloads is a natural choice.
Qualcomm’s General Manager of Data Center and Edge Computing, Durga Malladi, stated during a conference call with reporters: “We first want to prove ourselves in other areas, and once we establish strength there, it will be easy for us to move up to the data center level.” This statement reveals Qualcomm’s strategic logic—extending from mobile AI capabilities to the data center market.
It is understood that each accelerator card in Qualcomm’s AI200 rack-level solution carries 768GB of LPDDR memory, a considerable capacity for an inference accelerator and more than NVIDIA’s and AMD’s products offer. The system will use PCIe interconnect for scale-up and Ethernet for scale-out. It will adopt direct liquid cooling, with a power consumption of up to 160kW per rack, which is also unprecedented for inference solutions. Additionally, the system will support confidential computing for enterprise deployments. This solution is set to launch in 2026.
The AI250, set to launch in 2027, will retain this architecture but add a near-memory computing architecture, increasing effective memory bandwidth by more than ten times. Furthermore, the system will support disaggregated inference, allowing compute and memory resources to be dynamically shared across different cards. Qualcomm positions it as a more efficient, high-bandwidth solution optimized for large Transformer models while retaining the same cooling, thermal, security, and scalability features as the AI200.
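A rough rule of thumb helps explain why memory bandwidth matters so much for inference: when a large model generates text one token at a time, each token requires reading roughly all of the model’s weights from memory, so single-stream speed is capped at about bandwidth divided by model size in bytes. The Python sketch below illustrates that rule only. The function name, the baseline bandwidth, and the model size are hypothetical placeholders chosen for illustration, not Qualcomm specifications.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             params_billion: float,
                             bits_per_weight: int) -> float:
    """Rough upper bound on single-stream decode speed when weight reads dominate."""
    # 1e9 parameters at (bits / 8) bytes each is that many gigabytes of weights.
    model_gb = params_billion * bits_per_weight / 8
    return bandwidth_gb_s / model_gb


# Hypothetical figures for illustration only (not vendor specifications).
BASELINE_BANDWIDTH_GB_S = 500.0
MODEL_PARAMS_BILLION = 70
BITS_PER_WEIGHT = 8

baseline = decode_tokens_per_second(
    BASELINE_BANDWIDTH_GB_S, MODEL_PARAMS_BILLION, BITS_PER_WEIGHT)
boosted = decode_tokens_per_second(
    10 * BASELINE_BANDWIDTH_GB_S, MODEL_PARAMS_BILLION, BITS_PER_WEIGHT)

print(f"baseline: {baseline:.1f} tokens/s, with 10x bandwidth: {boosted:.1f} tokens/s")
```

Under this simplified model the ceiling scales linearly with bandwidth, which is why a tenfold gain in effective bandwidth is pitched at large Transformer inference. Real systems also depend on batching, caching, and compute limits that this estimate ignores.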
Qualcomm has made it clear that its chips focus on inference or running AI models, rather than training. This is a wise differentiation strategy, avoiding the highly competitive training market dominated by NVIDIA. Labs like OpenAI create new AI capabilities by processing terabytes of data, which requires powerful training chips, while Qualcomm chooses to focus on running and deploying pre-trained models, a similarly large but relatively less competitive market.
Qualcomm states that its rack systems will ultimately reduce operating costs for customers such as cloud service providers, and that one rack’s power consumption of 160kW is comparable to the high power consumption of some NVIDIA GPU racks, but can provide a better performance-to-power ratio in inference scenarios. Qualcomm also emphasizes its advantages over other accelerators in terms of power consumption, ownership costs, and new methods of memory processing.
Malladi emphasized that Qualcomm will also sell its AI chips and other components separately, especially targeting those large-scale data center customers who prefer to design their own racks. He stated that other AI chip companies, such as NVIDIA or AMD, may even become customers for some of Qualcomm’s data center components. “We are trying to ensure that our customers can choose to buy everything or say, ‘I want to mix and match.'” This flexible business model opens up more market space for Qualcomm.
Qualcomm’s market validation has already begun. In May 2025, Qualcomm announced a partnership with Humain in Saudi Arabia to provide AI inference chips for data centers in the region. Humain will become a Qualcomm customer, committing to deploy systems that can use up to 200 megawatts of power.
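To give a sense of scale, the 200-megawatt commitment can be set against the 160kW per-rack figure quoted earlier. The sketch below is a back-of-the-envelope estimate, not a description of the actual deployment: it assumes every rack draws the full 160kW, and the optional `pue` parameter (power usage effectiveness, the ratio of total facility power to IT power) and its 1.25 value are hypothetical.

```python
RACK_POWER_KW = 160    # per-rack power figure quoted for the AI200 system
DEPLOYMENT_MW = 200    # power ceiling of the Humain commitment


def max_racks(deployment_mw: float, rack_kw: float, pue: float = 1.0) -> int:
    """Upper bound on rack count if every rack draws its full rated power.

    pue > 1.0 reserves part of the budget for cooling and other overhead.
    """
    it_power_kw = deployment_mw * 1000 / pue
    return int(it_power_kw // rack_kw)


racks_ideal = max_racks(DEPLOYMENT_MW, RACK_POWER_KW)                    # no overhead
racks_with_overhead = max_racks(DEPLOYMENT_MW, RACK_POWER_KW, pue=1.25)  # hypothetical PUE

print(racks_ideal, racks_with_overhead)
```

On these assumptions the commitment corresponds to roughly a thousand fully loaded racks, give or take the facility overhead.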
In addition to building hardware platforms, Qualcomm is also constructing an end-to-end software platform optimized for large-scale inference. This platform will support major machine learning and generative AI toolsets, including PyTorch, ONNX, vLLM, LangChain, and CrewAI, while enabling seamless model deployment. The software stack will support disaggregated serving, confidential computing, and one-click loading of pre-trained models to simplify deployment.
Malladi stated: “Our rich software stack and open ecosystem support make it easier than ever for developers and enterprises to integrate, manage, and scale pre-trained AI models on our optimized AI inference solutions. Qualcomm AI200 and AI250 are seamlessly compatible with leading AI frameworks and support one-click model deployment, aiming for seamless applications and rapid innovation.”
Several factors lie behind Qualcomm’s shift to the data center AI inference market. First, the industry has long been dominated by NVIDIA, whose GPUs still account for over 90% of the market, but companies like OpenAI have been seeking alternatives. Google, Amazon, and Microsoft are also developing their own AI accelerators for their cloud services, creating opportunities for new entrants. Second, the inference market is growing rapidly: as more and more AI models are deployed in production environments, demand for inference will far exceed demand for training. Third, the Hexagon NPU technology Qualcomm has accumulated in the mobile sector provides a technical foundation for its entry into the data center, representing a natural extension from edge to cloud.
Qualcomm’s release of new AI chips essentially blurs the traditional market boundaries, allowing mobile chip manufacturers to enter the data center, while data center chip manufacturers are also extending to edge devices, forming a new competitive landscape where companies are intertwined.
MediaTek
Coincidentally, MediaTek, another mobile chip manufacturer, is also entering the AI space. This traditional mobile chip company is becoming an important player in cloud ASIC design services, directly competing with ASIC market leaders like Broadcom, and has already secured orders from tech giants like Google and Meta.
As early as last year, MediaTek announced a partnership with NVIDIA, and at this year’s NVIDIA GTC conference, MediaTek introduced its Premium ASIC design services, showing that its collaboration with NVIDIA has expanded into the IP domain, offering a more flexible business model that can provide various customized chips/HBM4E, along with a rich Cell Library and advanced process and packaging experience, delivering complete customized chip solutions.
MediaTek’s core competitiveness lies in its SerDes technology, which the company describes as a key advantage for ASICs, spanning chip interconnects, high-speed I/O, advanced packaging, and memory integration. Its 112Gb/s SerDes uses a DSP (digital signal processor) based PAM-4 receiver built on a 4nm FinFET process and achieves over 52dB of loss compensation, meaning it can recover signals from heavily attenuating channels with stronger resistance to interference. The technology is suitable for Ethernet and long-distance optical transmission, and MediaTek has also launched a 224G SerDes specifically for data center use, which has already completed silicon validation.
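The SerDes figures are easier to interpret with two standard conversions. PAM-4 uses four amplitude levels, so each symbol carries two bits and the symbol rate is half the bit rate. A loss in decibels converts to an amplitude ratio of 10^(dB/20). The sketch below applies both to the nominal numbers quoted above; the function names are illustrative, and real links add coding overhead that is not modeled here.

```python
import math


def symbol_rate_gbaud(bit_rate_gbps: float, levels: int) -> float:
    """Symbol rate for a pulse-amplitude scheme with the given number of levels."""
    bits_per_symbol = math.log2(levels)  # PAM-4: log2(4) = 2 bits per symbol
    return bit_rate_gbps / bits_per_symbol


def amplitude_ratio(loss_db: float) -> float:
    """Factor by which signal amplitude shrinks for a given channel loss in dB."""
    return 10 ** (loss_db / 20)


baud_112g = symbol_rate_gbaud(112, levels=4)
baud_224g = symbol_rate_gbaud(224, levels=4)
attenuation = amplitude_ratio(52)

print(f"112G PAM-4: {baud_112g:.0f} GBaud, 224G PAM-4: {baud_224g:.0f} GBaud")
print(f"52 dB of loss shrinks amplitude by a factor of about {attenuation:.0f}")
```

In other words, a receiver that compensates for 52dB of loss is recovering a signal whose amplitude has fallen to a fraction of a percent of what was transmitted.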
Recently, MediaTek also officially announced a collaboration with NVIDIA to design the GB10 Grace Blackwell Superchip, which powers the newly launched NVIDIA DGX Spark. DGX Spark is a personal AI supercomputer designed to help developers prototype, fine-tune, and run inference on large AI models at their desks.
It is understood that the GB10 Grace Blackwell Superchip combines a latest-generation Blackwell GPU with a 20-core Arm-based Grace CPU, drawing on MediaTek’s expertise in designing energy-efficient, high-performance CPUs, memory subsystems, and high-speed interfaces. This configuration provides 128GB of unified memory and delivers up to 1 PFLOP of AI performance to accelerate model tuning and real-time inference, enabling developers to run AI models with up to 200 billion parameters locally. Additionally, the system features built-in ConnectX-7 networking, allowing two DGX Spark systems to be connected for inference on models with up to 405 billion parameters. DGX Spark is efficient enough to run from a standard power outlet, and its compact design allows it to sit easily on a desktop.
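The parameter limits follow from simple memory arithmetic. The sketch below estimates the memory needed for model weights alone and checks it against the 128GB of unified memory per system. It assumes 4-bit weights, which is an assumption made here for illustration, and it ignores activations, the KV cache, and operating system overhead, so the real headroom is smaller than the numbers suggest.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for model weights alone, in GB (1e9 params * bytes per param)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_footprint_gb(params_billion, bits_per_weight) <= memory_gb


UNIFIED_MEMORY_GB = 128  # per DGX Spark system, as stated in the article

single_200b = weight_footprint_gb(200, 4)  # one system, 4-bit weights (assumed)
dual_405b = weight_footprint_gb(405, 4)    # two linked systems, 4-bit weights (assumed)

print(single_200b, fits(200, 4, UNIFIED_MEMORY_GB))
print(dual_405b, fits(405, 4, 2 * UNIFIED_MEMORY_GB))
print(fits(200, 16, UNIFIED_MEMORY_GB))    # 16-bit weights would not fit
```

With 4-bit weights, a 200-billion-parameter model needs about 100GB and a 405-billion-parameter model about 202.5GB, which is consistent with the one-system and two-system limits quoted above.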
In addition to its collaboration with NVIDIA, MediaTek is also following in the footsteps of Broadcom and Marvell to compete in the cloud service provider market. According to research firms, some CSPs are evaluating the customized design chips from NVIDIA and MediaTek’s IP portfolio. Although Google’s TPU (Tensor Processing Unit) progress has been slightly delayed, the seventh-generation TPU is expected to enter mass production in the third quarter of next year, and its 3nm design is still expected to contribute over $2 billion to MediaTek. The supply chain also reveals that Google’s eighth-generation TPU will begin to adopt TSMC’s 2nm process, maintaining a leading position in advanced processes.
Another significant breakthrough for MediaTek comes from Meta. MediaTek and Broadcom continue to compete for Meta’s new application-specific integrated circuit (ASIC) project, with industry insiders emphasizing that the two companies’ performance is quite comparable. However, recent reports indicate that MediaTek is about to secure a large order from Meta for a 2nm ASIC codenamed “Arke,” which focuses on post-training and inference and may reach mass production in the first half of 2027.
According to IC design companies, MediaTek’s victory in this product competition will mark its second significant order from cloud service provider (CSP) customers. Industry insiders familiar with the ASIC field point out that Arke was not originally part of Meta’s initial plan. After the Iris chip was scheduled for mass production by the end of 2025, Meta planned to launch another ASIC using the N2P process, named Olympus. However, considering actual demand and cost-effectiveness, Meta introduced a chip specifically for inference, Arke, midway through the product release plan. As a result, Olympus will be repositioned as an ASIC designed for training to compete with NVIDIA’s future GPUs, with its release date pushed back to 2028.
Meta’s previous products were mainly developed by ASIC market leader Broadcom. However, MediaTek has already established a cooperative relationship with Meta. For example, the smart glasses chip independently developed by Meta was co-developed with MediaTek, laying a solid foundation in the ASIC field. Therefore, MediaTek’s potential favor from Meta for the new Arke product is not entirely unexpected.
Industry insiders state that after stabilizing its relationship with Google, MediaTek needs to expand its cooperation scope to establish a greater influence in the cloud ASIC market. Recently, the market has observed changes in the ASIC design strategies and plans of CSP giants. Although the usage of cloud AI remains enormous and supply tight, CSPs have adjusted their strategies to enhance cost-effectiveness. Previously, technical compliance and integration capabilities were prioritized, often overlooking costs. Now, with clearer insights into the actual dynamics of the cloud AI market and chip design details, CSPs are also committed to developing more practical and economical products. In this broader context, MediaTek’s cost advantages are gradually becoming apparent.
MediaTek’s shift to ASIC is related to its unique market positioning. As a Taiwanese chip design company, MediaTek faces fierce competition in the mobile chip market, with profit margins under pressure. ASIC design services provide MediaTek with higher profit margins and more stable customer relationships. At the same time, MediaTek’s technological accumulation in advanced processes, high-speed interfaces, and memory integration enables it to offer differentiated solutions to cloud service providers. More importantly, MediaTek’s collaboration with NVIDIA has granted it access to the high-end AI market, which would be difficult to achieve independently.
AMD
Compared to other manufacturers, AMD’s actions in the ASIC field are relatively low-key, but its development of Arm-based products shows the company’s strategic thinking about the future market. According to leaked information from a well-regarded industry magazine, AMD is developing an Arm-based APU codenamed “Sound Wave,” which is expected to be released later next year.
This article, titled “AMD is Developing an Arm-Based APU, Codenamed Sound Wave,” was leaked and even included some customs declaration forms showing the size of the package. For some time, there have been rumors that AMD is developing an Arm-based device, but this latest leaked article reveals its approximate specifications, including a relatively small 32mm x 27mm BGA package containing six CPU cores (two P-cores + four E-cores) and an RDNA architecture GPU, making it seem more realistic. Currently, circuit boards for evaluating electrical characteristics are being shipped.
From the compact packaging, the device seems to target mobile applications and will fully utilize the power-saving features of the Arm architecture. AMD, which shares the x86 architecture CPU market with Intel, faces fierce competition in the PC/server market, but in recent years, AMD has been collaborating with TSMC to bring high-performance CPUs based on the Zen architecture to market, steadily capturing market share from Intel.
CEO Lisa Su has long pursued a product strategy of consolidating high-end market positions through the x86 architecture to ensure higher profit margins, but now it seems time to integrate the increasingly expanding Arm architecture into its CPUs. For AMD, which supplies high-performance CPUs/GPUs to the data center market, it seems to have recognized the future growth area of AI workloads on edge devices.
AMD had previously developed Arm-based CPUs, but that development ultimately resulted in a one-off server CPU product under the Opteron brand named “A1100.” AMD entered the server market with the Opteron brand based on the K8 architecture in 2003. After that, they further upgraded the pipeline structure and attempted to consolidate their position with the high-frequency-focused Bulldozer core architecture. However, the actual performance of the products did not improve, and this attempt ultimately failed. As a result, AMD long lacked sufficient products to compete with Intel in the market.
AMD’s return to the server market was delayed until 2017, when it launched Zen architecture products. The A1100 Arm architecture server CPU was a power-efficient server processor developed by AMD after repeated trials during difficult times. At that time, the Arm architecture had not yet been accepted in the server market, and the market was not very attractive, but AMD launched the K12 project as a follow-up architecture.
The K12 project aimed to launch a platform whose decoder was compatible with both x86 and Arm instruction sets and pin-compatible with both x86 and Arm. At that time, AMD referred to it as “dual-architecture computing” and even released a technical overview. However, the K12 project was ultimately canceled before its release, as management decided to prioritize the development of the Zen architecture to regain dominance in the x86 market. Jim Keller, the current CEO of AI processor startup Tenstorrent, was responsible for AMD’s architecture development engineering work at that time, and in an interview reflecting on that time, he stated, “That was a serious management mistake.”
NVIDIA’s recent announcement of acquiring shares in Intel and collaborating in the x86 market is certainly a boon for both companies, but at the same time, AMD, rooted in the x86 market, has also developed a new sense of crisis.
In fact, the environment AMD finds itself in has changed dramatically over the past decade. On one hand, the mainstream of technological innovation has shifted from CPUs to GPUs, and on the other hand, with the development of chiplet architecture, the concept of pin compatibility has become outdated. Most importantly, AMD now has the financial resources to develop two different architectures simultaneously.
For AMD, betting on Arm seems to be one of the most reasonable choices in the face of its GPUs struggling to compete directly with NVIDIA in the short term.
Giants Turning
Why are the giants changing direction?
In fact, as AI development enters deeper waters, the generality that GPUs once prided themselves on has become a constraint. With the exponential increase in the number of parameters and deployment scale of AI models, ASICs and Arm have been given the opportunity to take the stage.
From a fundamental perspective, the reason giants are betting on Arm and ASIC is that the demand for computing power in the AI era has shifted from “general computing” to “specialized computing”.
The reason GPUs dominated in the early days was that they provided sufficient parallel computing power for AI training, flexibly responding to the training needs of different models; however, as AI models enter the deployment and inference phase, energy consumption, latency, and cost have become new key constraints, and the massive architecture of general-purpose GPUs has instead brought redundancy. ASICs achieve extreme energy efficiency through “customized computing paths,” utilizing every transistor for the most critical computational tasks without sacrificing performance.
At the same time, the Arm architecture has become a natural extension of this trend. Its low power consumption and high scalability make it favored in AI inference, edge computing, and smart terminals. Whether it is Amazon and Microsoft or Google and Meta, they are proving that the hegemony of x86 is being weakened, while Arm’s flexible licensing model and open ecosystem are becoming the new foundation for AI infrastructure.
For traditional giants, the strategic shift towards Arm and ASIC is not merely a matter of “chasing trends,” but a structural transformation aimed at breaking through bottlenecks and striving for a larger market:
Intel hopes to use ASIC custom services as a breakthrough to bridge the gap in AI chip foundry and design, leveraging its IDM model to build a system-level competitive advantage in design + manufacturing + packaging;
Qualcomm aims to extend from edge AI to cloud inference, leveraging its Hexagon NPU accumulation to reshape the energy efficiency structure of data centers through low-power ASIC systems;
MediaTek is leveraging its high-speed SerDes and memory integration advantages to enter the AI ASIC supply chain of CSPs, winning orders from Google and Meta with high performance + cost-effectiveness;
AMD is exploring new types of APUs based on Arm architecture, attempting to establish differentiated advantages in PC and low-power AI scenarios to avoid being completely locked out by NVIDIA and the x86 ecosystem.
Finally, the deeper reason is that the value center of the AI chip industry is decentralizing. In the past, chip companies sold products, but now they sell capabilities—computing power, IP, design services, and ecosystem interfaces. Customized ASICs and licensable Arm architectures happen to form the underlying carriers of this decentralization, allowing different companies to redefine competitive rules in specific scenarios.
Therefore, as the golden age of GPUs enters a bottleneck, the competition for AI computing power is quietly diverging: one path leads to “more general, more expensive” high-end GPU computing; the other leads to “more specialized, more efficient” ASIC + Arm systems.
Looking ahead, AI infrastructure will increasingly lean towards specialized chips rather than general-purpose chips. In this case, those who can secure more orders from cloud giants are likely to dominate the semiconductor industry in the next decade.