According to Electronic Enthusiasts (reported by Zhou Kaiyang), data centers are the market that CPU, GPU, and accelerator vendors watch most closely, so it is no surprise that the new RISC-V architecture is moving in as well. Previous articles covered some of RISC-V’s progress in data centers. RISC-V has yet to show much momentum in general-purpose server CPUs, but products such as AI accelerators may well be its best route into the data center market.
Google’s RISC-V Strategy
Google, one of the server operators that designs its own chips, appears to be planning to adopt RISC-V. At the AI Hardware Summit in the U.S. this September, SiFive’s chief architect and a Google TPU architect presented a solution the two companies developed together. Google’s TPU is an accelerator designed specifically for machine learning; it runs workloads from frameworks such as TensorFlow, PyTorch, and JAX in data centers.
The TPU’s main computational unit is the Matrix Multiply Unit (MXU), a systolic array of 128×128 multiply-accumulators. The minimum TPU v4 configuration includes four TPU chips, each with eight MXUs, twice as many as in TPU v3, and each MXU can perform 16K multiply-accumulate operations per cycle in BF16.
However, Google found that although the TPU has ample machine learning compute, customers often cannot use such a large AI accelerator for other complex computational workloads. Google’s approach is therefore to use SiFive’s X280 processor core as a coprocessor for the TPU, handling maintenance tasks and running code for kernels that the accelerator cannot handle.
The X280 is designed to accelerate AI/ML computation, but it mainly targets edge devices such as AR/VR headsets and digital cameras rather than large AI accelerators in data centers. Working with vendors such as Google, however, SiFive has introduced a technology called the Vector Coprocessor Interface eXtension (VCIX), which lets a large AI accelerator communicate directly and at high speed with the X280’s vector register file of 32 registers, each 512 bits wide.
This register-level access offers higher bandwidth and lower latency than PCIe, simplifies the software stack, and saves hardware resources. The X280 and the TPU cores work together: the former runs a complete Linux system and hypervisor, while the latter handles the intensive machine learning computation.
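The MXU and vector register figures quoted above can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope calculation based on the numbers in this article; the constant and function names are illustrative, and the totals per chip and per minimum configuration are derived here, not figures published by Google or SiFive.

```python
# Back-of-the-envelope check of the TPU v4 and X280 figures quoted in the article.
# All names are illustrative; only the input numbers come from the article.

MXU_ROWS = 128          # multiply-accumulators per row of one MXU
MXU_COLS = 128          # multiply-accumulators per column of one MXU
MXUS_PER_CHIP = 8       # TPU v4: eight MXUs per chip
CHIPS_MIN_CONFIG = 4    # minimum TPU v4 configuration: four chips

VREG_COUNT = 32         # X280 vector registers
VREG_BITS = 512         # width of each vector register in bits


def macs_per_cycle_per_mxu(rows=MXU_ROWS, cols=MXU_COLS):
    """One multiply-accumulate per array cell per cycle."""
    return rows * cols


def macs_per_cycle_per_chip():
    """Derived total for one chip (assumes every MXU is fully busy)."""
    return macs_per_cycle_per_mxu() * MXUS_PER_CHIP


def macs_per_cycle_min_config():
    """Derived total for the four-chip minimum configuration."""
    return macs_per_cycle_per_chip() * CHIPS_MIN_CONFIG


def vector_register_file_bytes(count=VREG_COUNT, bits=VREG_BITS):
    """Total size of the vector register file in bytes."""
    return count * bits // 8


if __name__ == "__main__":
    print("MACs/cycle per MXU:       ", macs_per_cycle_per_mxu())
    print("MACs/cycle per chip:      ", macs_per_cycle_per_chip())
    print("MACs/cycle, 4-chip config:", macs_per_cycle_min_config())
    print("Vector register file (B): ", vector_register_file_bytes())
```

The per-MXU result, 16384, matches the “16K” figure in the text. The register file works out to 2048 bytes, which helps explain why direct register-level access suits small, latency-sensitive transfers between the accelerator and the X280.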
Intel’s Horse Creek Emerges
As early as last year, Intel announced that it would combine SiFive’s P550 high-performance RISC-V core with its own 7nm process (now known as Intel 4) to build a RISC-V SoC codenamed Horse Creek. By 2022 the P650 had become SiFive’s highest-performance core, but the P550, a 64-bit out-of-order core with a 13-stage, triple-issue pipeline, still offers impressive performance; SiFive positions it against Arm’s Cortex-A75 while occupying less than half the area.
After that announcement, Intel made a series of RISC-V moves, such as joining RISC-V International, offering RISC-V chip manufacturing through IFS, and launching RISC-V FPGA development platforms, yet Horse Creek itself remained out of sight.
At this year’s Intel Innovation conference, many Intel partners set up booths on-site, and attendees finally got a look at Horse Creek. Built on Intel 4, Horse Creek integrates four 2.2GHz SiFive P550 cores, DDR5, and PCIe 5.0 into a single die measuring 4mm x 4mm. Even Intel CEO Pat Gelsinger personally visited the booth to check out the Horse Creek development platform.
According to the data shown at the booth, Horse Creek has a three-level cache hierarchy, including private L2 caches and a shared L3 cache. The DDR5 subsystem combines Intel’s DDR PHY and DFI interface with Cadence’s DDR memory controller and supports DDR5 memory at up to 5600 MT/s. The PCIe 5.0 subsystem integrates Intel’s PCIe PHY and Synopsys’s PCIe root hub controller. In addition to the analog IP provided by Intel, such as PLLs, a memory compiler, and standard cells, Horse Creek also uses other IP, such as Siemens’ DFT and Synopsys’s NoC fabric.
The data above shows that Horse Creek is an SoC that brings together IP from SiFive, Intel, and EDA vendors. An SoC this capable, with such rich interface support, is well suited to data center use. However, the first end product based on Horse Creek is expected to be SiFive’s next-generation HiFive development board. The previous-generation HiFive Unmatched board has sold out, and because of pandemic-related supply chain issues, SiFive has abandoned plans to restock it and is focusing instead on the next generation of HiFive development boards based on Horse Creek.
Another Thousand-Core RISC-V Chip
At Dell’s HPC Community conference, U.S. startup Inspire Semiconductor unveiled its RISC-V accelerator for data centers, along with the pointed claim that existing high-performance computing solutions are “not good enough”.
In the company’s view, today’s mainstream data center CPUs are too slow with or without accelerators, and once accelerators are added, they handle 90% of the high-performance computing work. It also argued that GPU and FPGA solutions are too complex to program, lock users into a software stack, and require specialized skills to deliver satisfactory results, while ASICs and AI accelerators carry too much risk in cost and time.
To address this, Inspire Semiconductor introduced its RISC-V accelerator, Thunderbird. Thunderbird integrates 2560 64-bit CPU cores on a single chip, and a single PCIe accelerator card carries more than 5000 cores. The company claims that an innovative high-speed interconnect lets it use that many cores efficiently and scale to arrays of up to 256 chips.
Thunderbird also claims a power advantage over other accelerators and GPUs: a single chip consumes around 175W, and the company quoted an energy efficiency figure of 20W/Tflops, which implies a peak of approximately 8.75Tflops per chip (175W divided by 20W/Tflops).
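The 8.75Tflops figure above follows from dividing chip power by the quoted efficiency. The short sketch below reproduces that arithmetic and, purely as an illustration, extends it to per-core and 256-chip totals. Those extra numbers are derived here under the assumption of perfectly linear scaling; they are not figures provided by Inspire Semiconductor, and the names used are hypothetical.

```python
# Reproduces the peak-throughput arithmetic for Thunderbird from the quoted figures.
# Per-core and array totals are derived assumptions (ideal linear scaling),
# not vendor-published numbers.

CHIP_POWER_W = 175.0        # approximate power per chip, from the article
WATTS_PER_TFLOPS = 20.0     # quoted energy efficiency figure
CORES_PER_CHIP = 2560       # 64-bit CPU cores per chip
MAX_CHIPS_IN_ARRAY = 256    # largest array size mentioned


def peak_tflops(power_w=CHIP_POWER_W, watts_per_tflops=WATTS_PER_TFLOPS):
    """Peak throughput implied by power and efficiency: W / (W per Tflops)."""
    return power_w / watts_per_tflops


def gflops_per_core(tflops, cores=CORES_PER_CHIP):
    """Average peak throughput per core, assuming an even split across cores."""
    return tflops * 1000.0 / cores


def array_totals(chips=MAX_CHIPS_IN_ARRAY):
    """Core count and peak Tflops for an array, assuming ideal linear scaling."""
    return CORES_PER_CHIP * chips, peak_tflops() * chips


if __name__ == "__main__":
    chip_peak = peak_tflops()
    cores, array_peak = array_totals()
    print(f"Peak per chip:  {chip_peak:.2f} Tflops")
    print(f"Peak per core:  {gflops_per_core(chip_peak):.2f} Gflops")
    print(f"256-chip array: {cores} cores, {array_peak:.0f} Tflops (ideal)")
```

Real arrays rarely scale linearly, so the array total is best read as an upper bound implied by the quoted figures.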
Inspire Semiconductor also showed feedback from customers or partners such as Google, Lenovo, and IBM at the conference, but much of it was polite talk, and it remains unclear whether Thunderbird has been integrated into any of these companies’ data center solutions.
Inspire Semiconductor also promised a developer-friendly software ecosystem, but it did not present a software solution of its own. Instead, it said that Thunderbird will draw on the rich existing RISC-V software ecosystem, such as oneAPI, so there is no need to build a one-off software stack as competing chips require. That makes it a better fit for developers who prefer a standard CPU programming model: there is no need to learn CUDA or OpenCL as with GPUs, and standard approaches such as compiler pragmas and MPI can be used instead.
AI programming is covered in the same way: Inspire Semiconductor noted that popular AI frameworks such as TensorFlow, PyTorch, and Glow already support RISC-V, and Linux is among the supported operating systems. For now, then, Inspire Semiconductor appears to offer only a hardware solution, and whether the product succeeds in the market will depend largely on Intel Codeplay’s oneAPI software ecosystem.
Disclaimer: This article originally appeared in Electronic Enthusiasts; please credit the source when reprinting.