Can the Kirin 960, Launched with the Mate 9, Outperform Other Processors?

Unbeknownst to many, three of the four major Android SoC platforms for 2017, namely the Helio X30, Kirin 960, and Snapdragon 835, have already been announced, while Samsung’s new Exynos is still on its way. The Kirin 960, as usual, debuted in a Huawei Mate, this time the Mate 9, making it the first of the four platforms to reach mass production. In recent years, Huawei has avoided head-on competition with Apple, Qualcomm, and Samsung by moving the launch of its flagship SoC to the end of the year, which lets it seize the initiative months before Qualcomm and Samsung introduce SoCs on new architectures and processes. The strong sales of the Huawei Mate 8 and Mate 9 over the past two years are the best evidence that this strategy works. Because Qualcomm has not yet disclosed detailed specifications for the recently announced Snapdragon 835, and Samsung’s new Exynos has not arrived, this article uses the Snapdragon 821 and Exynos 8890 for the side-by-side specification comparison and summarizes the news on the Snapdragon 835 and the new Exynos platform at the end.

It must be said that MediaTek (MTK) still knows how to draw up a roadmap, and its slide decks are impressive. No phone using the Helio P20 has entered mass production yet, but the specification sheet for the Helio X30 is already beautifully laid out. With a peak clock of 2.8GHz, it crushes its competitors on paper, provided that TSMC’s 10nm process delivers. Network support, RAM, and storage options have all been upgraded over the Helio P20, as befits a flagship SoC.

The Kirin 960 adopts both the Cortex-A73 and the Mali-G71, which is undoubtedly good news for Huawei fans. The Kirin 950 supports LPDDR4 RAM but not UFS 2.0 storage, a shortcoming the Kirin 960 has fixed. In addition, an LTE Cat.12 modem, a technology available since 2015, has finally been integrated into the Kirin 960. Remember when Huawei executives said that 2K screens and LTE Cat.12 modems had no practical significance for the Chinese market? In 2016 they finally bowed to public opinion, first putting a 2K screen on the Honor V8 and then integrating the LTE Cat.12 modem into the Kirin 960.

The Snapdragon 821 improves further on the performance of the Snapdragon 820, but that upgrade, like the steps from Snapdragon 652 to 653 and from Snapdragon 625 to 626, is far less striking than the leap from Snapdragon 810 to 820. The most noticeable change is the further increase in CPU and GPU clock speeds, with the small cores in particular reaching a new high of 2GHz. Next, let’s focus on the Helio X30 and Kirin 960.

Helio X30

Like the Helio X20/X25, the Helio X30 keeps a 10-core design, but it replaces the two Cortex-A72 large cores with Cortex-A73 cores and swaps four of the Cortex-A53 small cores for Cortex-A35 cores. It also moves from the relatively leaky 20nm process to 10nm. Next, let’s look at the two new ARM cores, the Cortex-A73 and the Cortex-A35.

The Cortex-A73 still uses the ARMv8 instruction set and 64-bit processor architecture. With this architecture, a single cluster can accommodate four Cortex-A73 cores, and if more cores are needed, multiple clusters must be connected using the CCI-550. The Cortex-A73 can form a heterogeneous architecture with Cortex-A53 and Cortex-A35. According to official statements, as an upgrade to the Cortex-A72, the Cortex-A73 can reach a maximum frequency of 2.8GHz, with theoretical peak performance and sustained performance improved by up to 30%, meaning it can extend the time the processor remains in a high-performance state.
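The cluster limit described above amounts to a simple ceiling division. The following is a minimal sketch, assuming only the four-cores-per-cluster figure quoted in the article; the function name and the eight-core example are hypothetical and do not describe any announced chip.

```python
import math

# Per-cluster limit quoted in the article for the Cortex-A73.
MAX_CORES_PER_CLUSTER = 4


def clusters_needed(core_count, max_per_cluster=MAX_CORES_PER_CLUSTER):
    """Return the smallest number of clusters that can hold core_count cores."""
    if core_count < 0:
        raise ValueError("core_count must be non-negative")
    return math.ceil(core_count / max_per_cluster)


# The Helio X30 uses two Cortex-A73 cores; the eight-core case is hypothetical.
for cores in (2, 4, 8):
    print(cores, "Cortex-A73 cores ->", clusters_needed(cores), "cluster(s)")
```

Any design that needs more than one cluster is the case where, according to the article, an interconnect such as the CCI-550 has to join the clusters.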

Besides the Helio X30, the Kirin 960 also adopts the Cortex-A73. According to ARM, Samsung and Marvell have also licensed the Cortex-A73. In recent years, ARM’s big.LITTLE scheme has given smartphone chipmakers a way to build 8-core and even 10-core processors. However, phones using stock ARM cores, flagship models in particular, have suffered from heat and high power consumption, mainly because of the large cores.

Whether the pairing was Cortex-A15 with Cortex-A7 or Cortex-A57 with Cortex-A53, the large cores have behaved dismally. Remember the Exynos 5410 (A15+A7): despite its 4+4 core design and limits on how long the four large cores could stay active, it still could not escape power and heat problems. Many owners of the Samsung S4 and Meizu MX3 can attest to this; the 28nm process simply could not handle the Cortex-A15.

The later Snapdragon 810 (A57+A53) was no exception. Every Snapdragon 810 phone I have used, including the high-end Xiaomi Note, the Moto X Extreme, and the elite edition of the nubia Z9 Max, can be described as “burning hot”. Even the Snapdragon 808, which dropped two of the large cores, still ran into trouble in the Xiaomi 4S, Lenovo Lemon X3, and Smartisan T2, showing that the Cortex-A57 could not be tamed at 20nm. Faced with the unsatisfactory Cortex-A15, ARM offered downstream manufacturers the Cortex-A12 and Cortex-A17 as alternatives with somewhat better efficiency. The MT6595, which pairs the Cortex-A17 with the Cortex-A7, is a classic example, and its best-known phone is the Meizu MX4, but the PowerVR G6200 GPU severely dragged down the SoC’s overall performance. Even the MT6795T (Helio X10 Turbo) in the Meizu MX5 kept the same subpar GPU.

To make up for the Cortex-A57’s shortcomings, ARM introduced the excellent Cortex-A72, which significantly improved efficiency. The Cortex-A57 still struggled with heat and power consumption at 20nm, whereas the Cortex-A72 managed to balance performance, battery life, and heat even on the 28nm node; the Snapdragon 652 and Snapdragon 650 are typical examples. The failure of the Helio X25/X20 was due not to the Cortex-A72 but to the high leakage of TSMC’s 20nm process. The Cortex-A72 is even reported to have better single-core efficiency than the Kryo CPU in Qualcomm’s Snapdragon 820. Unfortunately, for well-known reasons, the combination of 14nm and Cortex-A72 has not yet appeared on a Qualcomm platform, and the Exynos 8890, also on 14nm, uses Samsung’s self-developed Mongoose design for its large cores, so it too skipped the Cortex-A72. Now both the Helio X30 and the Kirin 960 have moved from the Cortex-A72 to the Cortex-A73, so the efficiency crown looks set to be passed on. Qualcomm’s and Samsung’s flagship SoCs next year are rumored to keep using their own architectures, so on their platforms the Cortex-A73 is likely to appear only in non-flagship parts such as the Snapdragon 600 series.

The Cortex-A35 is the 64-bit version of the Cortex-A7; according to ARM’s official slides, the earlier Cortex-A53 was merely an upgrade of the Cortex-A9. ARM is very confident in the efficiency of the Cortex-A35, and its architecture offers some clues as to why. Like the Cortex-A7, the Cortex-A35 is a dual-issue superscalar design with an 8-stage pipeline, and the upgrade improves efficiency mainly through enhancements to individual blocks. Downstream manufacturers can not only choose the number of cores but also decide whether to include the NEON (high-performance multimedia engine), Crypto (encryption), and ACP (Accelerator Coherency Port) modules, and can even drop the L2 cache if necessary. This flexibility serves not only smartphones but also wearables, embedded platforms, and the IoT field, so the Cortex-A32, which was originally developed for that field, will face competitive pressure as well.

The Cortex-A35 has redesigned the instruction prefetch unit, boasting stronger branch prediction capabilities. It adopts many of the cache structures from the Cortex-A53, with the first-level cache serving as both instruction and data cache, and incorporates multithreading data prediction and write-back capabilities. Improvements have also been made to the NEON/FP pipeline storage performance, along with the introduction of double-precision multiplication calculations.

In terms of power consumption, the Cortex-A35 introduces state retention features for the CPU and NEON in power management, along with independent hardware control for the CPU to enter and exit the retention state. The Cortex-A35 aims to keep power consumption below 125mW. Reducing chip area can also lower power consumption; by appropriately trimming the Cortex-A35 core, the chip area can be controlled to below 0.4mm² at the 28nm process.

Through these architectural changes, the Cortex-A35 consumes 10% less power than the Cortex-A7 while delivering 6-40% more performance. Its area is only 75% that of the Cortex-A53, and its power consumption only 68%. This is why the Helio X30 combines the Cortex-A73, Cortex-A53, and Cortex-A35 in a big.LITTLE configuration. With the 10nm process and the new Cortex-A73 and Cortex-A35 cores, we hopefully will not see a repeat of the Helio X20/X25 situation, where one core struggles while nine look on. The Helio X30 also finally adds support for LPDDR4 RAM and UFS 2.1 storage, and its GPU returns to Imagination’s PowerVR family. One can only hope that the Series 7 part is a significant improvement over the PowerVR G6200 of earlier years, because consumers have long memories.
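The Cortex-A35 versus Cortex-A7 figures quoted above can be combined into a rough performance-per-watt estimate. This is only back-of-the-envelope arithmetic: it assumes that the 10% power saving and the 6-40% performance gain hold at the same time on the same workload, which the quoted figures do not guarantee, and the function name is purely illustrative.

```python
def perf_per_watt_gain(perf_gain, power_reduction):
    """Relative performance per watt versus a baseline core (baseline = 1.0)."""
    return (1.0 + perf_gain) / (1.0 - power_reduction)


# Figures quoted in the article for the Cortex-A35 versus the Cortex-A7.
POWER_REDUCTION = 0.10
low = perf_per_watt_gain(0.06, POWER_REDUCTION)
high = perf_per_watt_gain(0.40, POWER_REDUCTION)

print(f"Cortex-A35 vs Cortex-A7 perf/W: {low:.2f}x to {high:.2f}x")
```

Under that assumption the estimate works out to roughly 1.18x to 1.56x, which is the kind of gap that makes a low-power third cluster attractive.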

Kirin 960

In contrast, the Kirin 960 sticks with ARM’s stock designs and, unlike the Helio X30, does not switch GPU vendors. The Mali-G71 is more efficient than the Mali-T880, which is why we expect the Kirin 960 to do well in graphics processing and demanding games. For a long time, Imagination has supplied top-tier PowerVR GPU cores for iPhones and iPads, leading consumers to assume that ARM’s stock GPUs are inferior to the PowerVR series. However, as the Mali-T880 MP12 in the Samsung S7 shows, ARM’s GPUs can be quite competitive as long as the manufacturer optimizes the stock design appropriately. The Mali-G71 is based on the new Bifrost architecture, which replaces the long-serving Midgard architecture. The prefix letter ‘T’ used in previous generations has been changed to ‘G’ in this one, a sign that ARM wants consumers to clearly distinguish between the two architectures.

The Mali-G71 is designed for the demands that VR (virtual reality), AR (augmented reality), and 3D gaming will place on high-end smartphones, and it supports advanced APIs such as Vulkan and OpenCL 2.0. The architecture employs Claused Shaders technology, with redesigned execution units that let temporary results bypass the register file. This not only eases pressure on the register file and lowers power consumption but also further shrinks the core area. Bifrost also introduces quad-based vectorization, in which four threads execute together while sharing control logic, greatly improving execution-unit utilization. If the Samsung S8 sticks with an ARM GPU, Samsung should be delighted: it already uses a 12-core Mali-T880 configuration (ARM states an upper limit of 16 cores), whereas the Mali-G71 scales up to 32 cores and will undoubtedly offer more graphics processing power than the Mali-T880.
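The idea of four threads sharing control logic can be illustrated with a toy model. The sketch below is conceptual only and is not how Bifrost hardware is built: the three-instruction program, the function names, and the "decode" counter are all hypothetical stand-ins for the control work that a quad shares.

```python
import operator

# A tiny hypothetical instruction stream: (operation, constant operand).
PROGRAM = [(operator.mul, 2), (operator.add, 3), (operator.sub, 1)]


def run_per_thread(inputs):
    """Each thread fetches and decodes every instruction on its own."""
    decodes = 0
    results = []
    for value in inputs:
        for op, operand in PROGRAM:
            decodes += 1
            value = op(value, operand)
        results.append(value)
    return results, decodes


def run_as_quads(inputs, quad_size=4):
    """Threads are grouped in fours; one decode drives every lane of a quad."""
    decodes = 0
    results = []
    for start in range(0, len(inputs), quad_size):
        lanes = list(inputs[start:start + quad_size])
        for op, operand in PROGRAM:
            decodes += 1  # shared control logic: decoded once per quad
            lanes = [op(lane, operand) for lane in lanes]
        results.extend(lanes)
    return results, decodes


inputs = list(range(8))
scalar_results, scalar_decodes = run_per_thread(inputs)
quad_results, quad_decodes = run_as_quads(inputs)
print(scalar_decodes, quad_decodes)
```

Both paths compute the same results, but in this model the quad path does a quarter of the control work, which is the intuition behind the utilization claim.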

As for the efficiency that consumers care about, the Mali-G71 is reported to be 20% more efficient than the Mali-T880 and able to sustain its high-performance mode for longer, which means more stable performance. The Kirin 960 uses the CCI-550 interconnect released last year to link the Cortex-A73 cores and the Mali-G71 GPU. The biggest change from the CCI-400 to the CCI-550 is a snoop filter that replaces the CCI-400’s broadcast design. The interconnect can communicate with all cores and caches, which reduces communication latency, improves scalability, lowers power consumption, and raises performance.
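The difference between broadcast snooping and a snoop filter can be illustrated with a toy model. The sketch below is a conceptual simplification, not a description of how the CCI-400 or CCI-550 is implemented: the class names, the four-master setup, and the access trace are all hypothetical, and the filter is modelled as a simple directory of which masters have touched each cache line.

```python
from collections import defaultdict


class BroadcastInterconnect:
    """Toy model: every coherent access snoops all other masters."""

    def __init__(self, num_masters):
        self.num_masters = num_masters
        self.snoops_sent = 0

    def access(self, master, address):
        self.snoops_sent += self.num_masters - 1


class SnoopFilterInterconnect:
    """Toy model: a directory tracks which masters may hold each line."""

    def __init__(self, num_masters):
        self.num_masters = num_masters
        self.snoops_sent = 0
        self.sharers = defaultdict(set)  # address -> masters holding the line

    def access(self, master, address):
        # Snoop only the masters that the filter believes hold the line.
        self.snoops_sent += len(self.sharers[address] - {master})
        self.sharers[address].add(master)


def run(interconnect, trace):
    """Replay a list of (master, address) accesses and return the snoop count."""
    for master, address in trace:
        interconnect.access(master, address)
    return interconnect.snoops_sent


# Hypothetical trace: four masters mostly touch private lines, one line is shared.
trace = [(m, 0x1000 + 64 * m) for m in range(4)] * 3 + [(m, 0x8000) for m in range(4)]
broadcast_snoops = run(BroadcastInterconnect(4), trace)
filtered_snoops = run(SnoopFilterInterconnect(4), trace)
print(broadcast_snoops, filtered_snoops)
```

In this model the private accesses generate no snoop traffic at all once a filter is present, which is where the latency and power savings come from; the real savings depend on how much data the cores and GPU actually share.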

Preview of Snapdragon 835 and Exynos 8895 (tentative name)

The Snapdragon 835 was finally announced recently. Although detailed specifications were not disclosed, media materials indicate that it is built on Samsung’s 10nm FinFET process, which compared with the 14nm process shrinks chip area by 30%, raises computing speed by 27%, and lowers power consumption by 40%. It supports Qualcomm’s latest Quick Charge 4.0 fast-charging technology, which improves charging speed by 20% and charging efficiency by 30%, delivering up to a 50% charge in just 15 minutes, and adds better power management that reduces the battery wear caused by charging. The Snapdragon 835 is also rumored to adopt the Kryo 200 architecture in an 8-core design, with an integrated Adreno 540 GPU, support for up to 8GB of LPDDR4X RAM, and the Snapdragon X16 LTE modem.

Samsung Electronics announced earlier that it has begun mass production of logic chips on its 10nm FinFET process, finally overtaking Intel to become the first company in the industry to deploy 10nm technology at scale. Intel’s 14nm process has served three generations of architectures, Broadwell, Skylake, and Kaby Lake, and the Tick-Tock cadence has clearly slowed. Samsung seems to have the field to itself.

TSMC previously stated that chips built on its 10nm process will be 50% smaller than those on its current 16nm process, with 50% higher performance and 40% lower power consumption. Samsung’s 10nm process shrinks chip area by 30% compared with its 14nm process, with 27% higher performance and 40% lower power consumption. Remarkably, shortly after announcing 10nm mass production, TSMC revealed that its 7nm mobile chip is already in testing and has been certified by Synopsys, with trial production expected in the second quarter of next year, and that a 3nm process is in preliminary research. It must be said that TSMC’s talent for roadmaps is on par with MediaTek’s; both need to find a way to get the 10nm process and the Helio X30 into mass production on time, given that new Helio P20 phones have already been delayed several times.

As for Samsung’s Exynos platform, the Exynos 8895 (tentative name) is expected to debut alongside the Snapdragon 835. Rumors point to a significant increase in clock speed, with the big.LITTLE combination probably still using Mongoose large cores and the GPU possibly moving to the Mali-G71, as in the Kirin 960. It remains to be seen whether Samsung will again push the stock design to 12 cores, as it did with the Mali-T880. According to earlier leaks from @i冰宇宙 on Weibo, an early engineering sample of the Exynos 8895 scored 2301 single-thread and 7019 multi-thread, with a maximum clock speed expected to reach 3GHz and full-load CPU power consumption below 5W. Although single-thread performance still trails Apple’s A10, the multi-thread score has finally surpassed it. In addition, to support the dual-camera design rumored for Samsung’s new flagship next year, image processing performance is expected to improve by 70%-80%.

Conclusion: Judging from the scores in various benchmark databases, the Kirin 960 in the Huawei Mate 9 has finally matched the overall performance of the Snapdragon 820 and Snapdragon 821, but it still falls short of the Apple A10. Compared with the Kirin 950/955, this is significant progress, and phones based on MTK’s Helio P20 and Helio X30 are still a long way from mass production. Huawei’s years of effort on its own SoCs are evident. For an SoC that can truly challenge the Apple A10, we will have to wait for the Snapdragon 835 and the new Exynos platform, which are expected to debut around MWC 2017. At the current pace, Qualcomm’s and Samsung’s top SoCs should have little trouble surpassing Apple in multi-threaded scores; the real challenge remains single-thread performance. Keep pushing!
