The recently announced 2018-2019 central state organs procurement project for information products (hardware) and air conditioning products shows that CPUs from Loongson, Shenwei, and Phytium have been added to the central government procurement list. Until now, these CPUs had been used in local pilot projects but had never appeared on that list. Their inclusion signals the importance the state attaches to Loongson, Phytium, and Shenwei, and recognizes the results achieved in the earlier pilots.
The procurement list includes more than just 3 CPUs
Earlier media reports said that Loongson 3B1500, Shenwei 1621, and Phytium 1500A-16 had been included in the central government procurement list. These CPUs are several years apart, and it seems odd to list Loongson 3B1500, an unsuccessful product from 6 years ago, alongside the new Shenwei 1621.
Loongson 3B1500 is based on the 3B1000 design. It gained performance mainly from a process upgrade, plus some partial optimizations. Sample chips arrived in August 2012, with a clock frequency of 1.25 GHz to 1.5 GHz. Its development ran into many problems, however, including management issues during the transition from research group to company, manufacturing problems, deadlocks caused by multi-core memory access, and cache coherence issues in the processor cores.
After overcoming these challenges, Loongson concluded that its research approach had gone astray: it had pursued core count and peak floating-point performance as the sole metric, at the expense of general-purpose performance. In single-threaded SPEC2006 testing, Shenwei 1621 scores 12+, while 3B1500 reaches only 3-4, a gap of more than 3 times. In multi-threaded workloads the gap is even larger, since Shenwei 1621 has twice as many cores as 3B1500.
These media reports therefore made me skeptical. After consulting industry insiders, I learned that the CPUs eligible for government procurement are not limited to Loongson 3B1500, Shenwei 1621, and Phytium 1500A-16. The declarations were filed by system manufacturers, who for convenience specified “greater than or equal to Loongson 3B1500, Shenwei 1621, Phytium 1500A-16”. As long as the relevant parameters are met, later products such as Loongson 3B2000, Loongson 3B3000, Phytium 2000, and Phytium 2000plus, as well as chips still in development such as Shenwei 3232 and Loongson 3B4000, can in theory be included in the central government procurement list.
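The size of this gap follows directly from the scores quoted above. The following is only a minimal sketch of the arithmetic, assuming the reported figures (12 for Shenwei 1621, 3 to 4 for Loongson 3B1500) and, for the multi-threaded estimate, ideal linear scaling with core count, which real workloads rarely achieve. The function and variable names are illustrative, not from any benchmark tool.

```python
def speedup_range(fast_score, slow_low, slow_high):
    """Return the (min, max) ratio of fast_score over a slower score range."""
    return fast_score / slow_high, fast_score / slow_low

# Single-threaded SPEC2006 figures quoted in the article.
single_min, single_max = speedup_range(12.0, 3.0, 4.0)

# Assumption: twice the cores and perfectly linear scaling (an idealization).
CORE_RATIO = 2
multi_min, multi_max = single_min * CORE_RATIO, single_max * CORE_RATIO

print(f"single-thread gap: {single_min:.1f}x to {single_max:.1f}x")
print(f"multi-thread gap (idealized): {multi_min:.1f}x to {multi_max:.1f}x")
```

Because the Shenwei figure is reported as “12+”, these ratios are better read as lower bounds than as exact values.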
How do the CPUs of Loongson, Phytium, and Shenwei perform?
As mentioned earlier, Loongson 3B1500 is only average in general-purpose performance, but it was relatively successful in academia, with papers accepted at the top international conferences Hot Chips and ISSCC. The US IT Times reported on it specifically, and MIT also commented that the floating-point performance of Loongson 3B exceeded that of Intel processors of the same period.
Having learned the lessons of 3B1500, Loongson committed firmly to improving single-core performance. The 3A/B2000 doubled the CPU’s general-purpose performance through core upgrades, even though the manufacturing process moved back from 32nm to 40nm. The 3A/B3000 raised general-purpose performance by a further 50% by moving from 40nm to 28nm and optimizing the CPU cores.
Currently, the single-core score of Loongson 3A/B3000 in SPEC2006 testing is around 11 points.
It is worth noting that on a comparable manufacturing process (3B1500 is 32nm, 3B3000 is 28nm), design improvements alone raised the CPU’s SPEC2006 score from 3-4 points to 11 points. By contrast, chips built on imported technology rely heavily on buying better CPU cores from abroad (ARM Cortex-A9, A15, A72, and A73, purchased generation after generation) or on adopting better foreign processes (TSMC 28nm/16nm/10nm/7nm).
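As a rough consistency check, the generation-over-generation gains can be compounded and compared with the reported score. This is a sketch under stated assumptions: it reads the 3A/B2000 gain as a factor of 2.0 and the 3A/B3000 gain as a factor of 1.5, and starts from the 3-4 point range quoted for 3B1500. The helper name `compound_score` is hypothetical.

```python
def compound_score(base, gains):
    """Apply successive multiplicative gains to a baseline score."""
    score = base
    for gain in gains:
        score *= gain
    return score

# Assumed multipliers: 3A/B2000 doubles performance, 3A/B3000 adds 50%.
GAINS = [2.0, 1.5]

low = compound_score(3.0, GAINS)   # from the low end of the 3B1500 range
high = compound_score(4.0, GAINS)  # from the high end of the 3B1500 range

print(f"projected 3A/B3000 score: {low:.1f} to {high:.1f}")
```

The reported score of about 11 points falls inside the projected range, so the quoted per-generation gains and the final figure are at least mutually consistent.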
Phytium 1500A-16 is a 16-core server CPU designed by Phytium. According to Phytium’s official introduction, the CPU cores are independently developed. However, many industry insiders have differing opinions on this.
The R&D cycle of a CPU generally takes 3 years, plus another year from sample to finished product, and that assumes the previous generation’s source code is available as a basis and that each generation changes no more than about 25% of the code. Phytium, however, launched products and signed sales agreements very shortly after obtaining its ARM license, leading many in the industry to believe that its products are either integrated from purchased IP or modified from ARM public version cores.
After all, writing tens of millions of lines of code is no small task, and delivering strong performance, stability, and fully independent development all within a short time is inherently contradictory. The core performance of Phytium 1500A is roughly equivalent to that of ARM Cortex-A57, and given that Qualcomm and Samsung have also built cores by modifying ARM public version cores, this further fuels the speculation.
Even so, as Phytium 1500A and Phytium 2000 have developed, Phytium has made many improvements of its own, even if the designs do draw on the ARM public version architecture, and it holds a significant advantage in power consumption over the ARM public version. Huawei’s Hi1612, which uses A57 cores, still falls short of Phytium 2000 in performance per watt, despite a 2-generation lead in manufacturing process.
In terms of multi-core performance, Phytium 1500A-16 is comparable to the 16-core A57 chips, while Phytium 2000 is on par with Intel E5 series chips.
Shenwei 1621 is currently Shenwei’s best server CPU, with a SPEC2006 score slightly better than that of Phytium 1500A-16. Ranked by performance from highest to lowest, the chips are Phytium 2000, Shenwei 1621, Phytium 1500A-16, Loongson 3B3000, Loongson 3B2000, and Loongson 3B1500.
(Phytium CPU complete products)
Loongson’s server CPUs lag behind mainly because the companies have pursued different development directions. Loongson focuses on embedded systems and PCs, sustaining itself through embedded products while it raises single-core performance. It treats single-core performance as the priority because adding more cores is futile if each core is too weak.
As a result, although Loongson’s single-core performance in server CPUs is not inferior to that of Phytium and Shenwei, its overall performance is outmatched because of the difference in core count. This is unlikely to change until the release of Loongson 3C5000. Phytium and Shenwei, by contrast, concentrate on supercomputing chips and high-performance servers, which naturally gives them a temporary advantage over Loongson in server CPUs.
Loongson, Shenwei, and Phytium can already meet the basic office needs of party and government agencies
Loongson, Shenwei, and Phytium have already achieved good results in some party and government pilot programs. In the pilot in Yunfu City, Guangdong, for example, more than 2,000 Loongson terminals were deployed. User feedback after the pilot identified two problems with substantive impact: a rare Kyocera multifunction printer could not scan properly on domestic terminals, and the key for the electronic seal was incompatible. Apart from these, staff reported that daily office work on domestic terminals posed no problems, though they had to adapt some of their operating habits.
(Loongson application scenarios)
After trying the new Loongson 3A2000, Zou Sheng, then inspector and deputy Party group secretary of the Guangdong Provincial Economic and Information Commission, told Jiang Shan, the South China general manager of Loongson Zhongke Technology Co., Ltd., that the 3A2000 had made great progress. The first-phase pilot had used the 3A1000, which was not mature: after starting a login on a 3A1000 computer, he could go and fetch a cup of water before checking whether it had finished, and the first time he opened the browser it took a few “tick-tocks” before the page appeared. It was a struggle, but he insisted on using the machine for the sake of domestic development. The 3A2000, he said, is completely different. It still trails x86, but he finds it at least acceptable for office work, and with continued optimization and upgrades it can meet the needs of office business.
In April 2017, three Loongson 3A3000 terminals were delivered to the 12345 citizen hotline service platform in Yunfu City. During the subsequent EternalBlue incident, all of the platform’s x86 machines were infected with ransomware and could not work. The three Loongson terminals were unaffected, which allowed the platform to get through the incident and kept the 12345 service running without interruption.
For domestic CPUs to replace Intel CPUs, adoption must be driven by applications. Party and government agencies should use them first, because only practical use can continuously expose problems and get them resolved. Through repeated trial, error, and wider deployment, the performance and user experience of domestic CPUs and operating systems can be improved. The inclusion of Loongson, Shenwei, and Phytium CPUs in the central government procurement list is a favorable policy for the development of domestic CPUs.