432-Core RISC-V Chip Successfully Taped Out for Space Computing

The European Space Agency (ESA) is exploring several ways to improve space computing capabilities, and one of the processors it supports is close to release.

Researchers from the agency announced at last month's Design, Automation and Test in Europe (DATE) conference that the Occamy processor, developed by engineers from ETH Zurich and the University of Bologna, is nearing completion. Development of the chip began on April 20, 2021. It taped out successfully in July 2022 and is now undergoing final packaging. The ESA is also a member of the open-source processor group.

Some known features of the chip

The chip is reported to belong to the Parallel Ultra-Low Power (PULP) platform project. It has two computing units (CPUs), each built as a chiplet with 216 32-bit RISC-V cores and an undisclosed number of 64-bit floating-point units (FPUs), and the package carries two 16GB HBM2E memory chips from Micron. The cores are interconnected through an interposer, and the two CPUs together are estimated to reach a peak performance of 0.768 TFLOP/s for FP64, 1.536 TFLOP/s for FP32, 3.072 TFLOP/s for FP16, and 6.144 TFLOP/s for FP8.

In this chip, the developers pair a small, highly efficient, in-order 32-bit RISC-V integer core named Snitch with a large multi-precision FPU enhanced with single instruction, multiple data (SIMD) capabilities. The FPU supports the following formats, given as (exponent bits, mantissa bits): FP64 (11,52), FP32 (8,23), FP16 (5,10), FP16alt (8,7), FP8 (5,2), and FP8alt (4,3). In addition to the standard RISC-V fused multiply-add (FMA) instructions, the two 8-bit and two 16-bit FP formats gain new expanding dot-product and three-addend sum instructions (exsdotp, exvsum, and vsum).

To achieve ultra-efficient computation on data-parallel FP workloads, the developers use two custom architecture extensions: data-prefetching register file entries and a repeat buffer.
The corresponding RISC-V ISA extensions, Stream Semantic Registers (SSR) and FP Repeat Instructions (FREP), enable the Snitch core to achieve over 90% FPU utilization on compute-bound kernels.

[Figure: Partial die view of Occamy]

Each Occamy chiplet contains 216 Snitch cores, organized into groups of four compute clusters (six groups per chiplet, which is consistent with 6 × 4 × 9 = 216 cores). In each cluster, eight compute cores and one high-bandwidth (512-bit), DMA-enhanced core that orchestrates data flow share a tightly coupled memory. AXI-based wide, multi-level interconnects and dedicated DMA engines help manage the massive on-chip bandwidth. A Linux-capable CVA6 RISC-V core manages all compute clusters and system peripherals. Each chiplet has a private 16GB high-bandwidth memory (HBM2e) and communicates with the neighboring chiplet over a 19.5 GB/s wide, source-synchronous, technology-independent die-to-die DDR link.

Occamy is a low-power chip designed for AI and high-performance computing workloads. Its lightweight 32-bit CPU cores act more like controllers, responsible for routing tasks to the AI compute units. Today's AI workloads rely heavily on accelerators such as GPUs and AI cores for training and inference, and the researchers hope this open-source chip can also serve AI workloads in space.

A single Occamy chip is reported to run at 1GHz with a runtime power consumption of 10 watts, so the final package, with two chips plus HBM memory, would draw more than double that.
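
The peak-performance figures quoted earlier in the article can be cross-checked against the core counts and clock rate given above. The sketch below is a back-of-the-envelope calculation, not a published ESA or PULP performance model. It assumes that only the eight compute cores in each nine-core cluster do floating-point work, that each of them retires one FMA per cycle, and that narrower formats are packed as SIMD lanes into a 64-bit FPU datapath. All constant and function names are illustrative.

```python
# Back-of-the-envelope peak throughput for the dual-chiplet Occamy package.
# Values marked "article" come from the text; the rest are stated assumptions.

CHIPLETS = 2                    # article: two chiplets per package
CORES_PER_CHIPLET = 216         # article: Snitch cores per chiplet
COMPUTE_CORES_PER_CLUSTER = 8   # article: eight compute cores per cluster
DMA_CORES_PER_CLUSTER = 1       # article: one DMA-enhanced core per cluster
CLOCK_HZ = 1_000_000_000        # article: 1 GHz
FLOP_PER_FMA = 2                # a fused multiply-add counts as two operations
DATAPATH_BITS = 64              # assumption: SIMD lanes share a 64-bit datapath

# (exponent bits, mantissa bits) per format, as listed in the article.
FORMATS = {
    "FP64": (11, 52),
    "FP32": (8, 23),
    "FP16": (5, 10),
    "FP16alt": (8, 7),
    "FP8": (5, 2),
    "FP8alt": (4, 3),
}


def total_bits(exp_bits, man_bits):
    """Storage width of one value: sign bit + exponent + mantissa."""
    return 1 + exp_bits + man_bits


def simd_lanes(fmt):
    """How many values of this format fit in the assumed 64-bit datapath."""
    return DATAPATH_BITS // total_bits(*FORMATS[fmt])


def compute_cores():
    """FP-capable cores in the package, assuming DMA cores do no FP work."""
    cores_per_cluster = COMPUTE_CORES_PER_CLUSTER + DMA_CORES_PER_CLUSTER
    clusters_per_chiplet = CORES_PER_CHIPLET // cores_per_cluster
    return CHIPLETS * clusters_per_chiplet * COMPUTE_CORES_PER_CLUSTER


def peak_tflops(fmt):
    """Peak TFLOP/s, assuming one SIMD FMA per compute core per cycle."""
    flop_per_second = compute_cores() * simd_lanes(fmt) * FLOP_PER_FMA * CLOCK_HZ
    return flop_per_second / 1e12


if __name__ == "__main__":
    for fmt in FORMATS:
        print(f"{fmt:8s} {simd_lanes(fmt)} lane(s)  {peak_tflops(fmt):.3f} TFLOP/s")
```

Under these assumptions the script gives 384 compute cores and reproduces the quoted 0.768, 1.536, 3.072, and 6.144 TFLOP/s figures for FP64, FP32, FP16, and FP8. That agreement suggests the one DMA core in each cluster is not counted toward peak FP throughput.
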
The ESA and its development partners have not disclosed a specific power figure for the complete processor, but the chip is said to use passive cooling, which suggests a low-power design.

Serendipitous Chiplet Design

This 432-core chip is an interesting blend of new and old technologies. One advantage of the currently popular chiplet approach is the ability to mix and match new and old technologies within one package, such as analog or digital processors, and to add functional modules to the package later to accelerate particular workloads as needed.

Each Occamy chip contains 216 RISC-V cores and FPUs for matrix computations, with approximately 1 billion transistors on a 72mm² die. That transistor count is roughly comparable to Intel's quad-core Sandy Bridge chip from 2011, whose die was about three times larger.

Occamy is manufactured on GlobalFoundries' 12LPP (12nm low power) process, with the chiplets placed on a passive 65nm interposer.

ESA states that the Occamy project began as an unplanned offshoot of the Manticore high-performance architecture concept presented at the 2020 Hot Chips conference. The current research prototype aims to demonstrate and explore the scalability, performance, and efficiency of RISC-V-based architectures in 2.5D integrated chiplet systems. It showcases GlobalFoundries' technology and IP ecosystem, along with the IP ecosystems of Rambus (HBM2e controller IP and integration support) and Micron (HBM2e DRAM supply and integration support). In addition, Synopsys supported the project with EDA tool licensing, and Avery provided the HBM2e DRAM verification model.

In comparison, Intel's Alder Lake die measures 163 mm².
In terms of performance, NVIDIA's A30 GPU, which carries 24GB of HBM2 memory, delivers 5.2 FP64 TFLOPS (10.3 FP64 Tensor TFLOPS) and 330 INT8 TOPS (660 with sparsity).

Part of the European Autonomous Chip Program

According to public information, Occamy is being developed as part of the EuPilot project and is one of many chips being considered for space computing. The project aims to deliver the first HPC system built entirely on European open-source and open-standard software and hardware, by creating a set of accelerators designed, implemented, manufactured, and deployed in Europe. The accelerators will be manufactured in Europe on new GlobalFoundries advanced process technologies to demonstrate European technological independence.

The EuPilot project is developing homegrown processors to reduce reliance on proprietary x86 and Arm chips, and is also developing independently controllable chips for supercomputers, artificial intelligence, the Internet of Things, and autonomous vehicles.

ESA is interested in these chips because they would let devices in space analyze data on the device itself. There is no guarantee that ESA will deploy this chip in space; it is one of many processors being explored for space computing. In the US, NASA has adopted RISC-V chips from Microchip and SiFive to upgrade its space computers.

Occamy can reportedly be emulated on FPGAs, with implementations tested on two AMD Xilinx Virtex UltraScale+ HBM FPGAs and on Virtex UltraScale+ VCU1525 FPGAs. The researchers designing Occamy hope the design can be adopted and reused at low cost, although that may depend on software support.

Author: Liu Yuwei, Deputy Chief Analyst at Electronic Engineering Magazine
