How many cores does your phone have? It is one of the first questions people ask when comparing two smartphones. The number of CPU cores is an important indicator of a phone’s performance, but it is not the most accurate one. At the end of last year, when it released the Snapdragon 820, Qualcomm acknowledged that octa-core chips were largely a marketing gimmick; judging a chip solely by its core count is almost meaningless. Once the initial shock wears off, the question remains: if core count does not directly correlate with performance, what does determine how fast a phone is?
The Child Who Spoke the Truth About the Emperor’s New Clothes
At the end of 2015, in the middle of the industry’s race for ever more mobile processor cores, Qualcomm launched the Snapdragon 820 with only four: two high-performance Kryo cores and two power-efficient Kryo cores. The launch sparked considerable controversy, and Qualcomm’s admission that octa-core chips were merely a marketing gimmick shook the consumer market. Because data transfer rates have a significant impact on user experience, they have been put forward as a better yardstick for mobile performance. Of course, we hope Qualcomm can provide more data to convince consumers.
CPU is Important but Not Everything
The CPU (Central Processing Unit) is the best-known component, and its importance is evident from its name. Many people equate it with the entire chip. In reality, the more accurate term for a mobile chip is System on Chip (SoC); the CPU is just one part of it, responsible for control and computation, and it directly determines how smoothly the phone runs.
When assessing CPU performance, most consumers are misled by aggressive marketing and by salespeople with only a superficial understanding, and look only at the number of cores and the clock speed. These account for only part of the final performance. The most decisive factor is the architecture!
Currently, almost all mainstream SoCs, whatever their vendor, are built on ARM’s reduced instruction set (RISC) architecture. The difference is that the more capable manufacturers, such as Apple and Qualcomm, license only the ARMv7/v8 instruction set and design their own CPU architecture. Apple has designed its own CPU architecture since the A6, and the A7, with its Cyclone architecture, became the first 64-bit mobile processor, leading the industry by a year. Qualcomm, meanwhile, established its dominance in the 32-bit era with the Krait architecture based on the ARMv7 instruction set, and the latest Snapdragon 820 is built on ARMv8 with the new Kryo architecture.
The other option is to license ARM’s own reference designs directly. The familiar Cortex-A7/A8/A9/A53/A57/A72 are all architectures designed by ARM: the A7, A8, and A9 implement the 32-bit ARMv7 instruction set, while the A53, A57, and A72 implement the 64-bit ARMv8 instruction set. MediaTek and HiSilicon use these reference designs.
To illustrate, the first option is like ARM laying the foundation and leaving manufacturers free to build the house however they choose. The advantage is flexibility, and performance and power consumption can often be controlled better than with a reference design, but it demands more money, time, and technical expertise. The second option is like ARM not only laying the foundation but also supplying the blueprints; manufacturers only need to build to the plans, which significantly reduces the time and cost of developing the whole SoC.
The architecture sets the performance baseline. For example, the processors commonly found in budget phones may have eight cores and a clock speed above 2GHz, but because they use the low-power A53 architecture, they can be slower than a dual-core chip built on the high-performance A72 architecture, especially in single-threaded work. This helps explain why Apple continues to ship dual-core processors while keeping a significant performance lead. The same holds for computer processors: Intel’s famous Tick-Tock strategy alternates between improving the architecture and improving the process technology each year, and so delivers annual performance gains. After architecture, then, the next most important factor is process technology.
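The trade-off between many slow cores and a few fast ones can be sketched with Amdahl’s law. This is a rough model only: the per-core scores below (1.0 for an A53-class core, 2.0 for an A72-class core) are illustrative assumptions, not measured figures, and the function name is hypothetical.

```python
def effective_throughput(per_core_score: float, cores: int, parallel_fraction: float) -> float:
    """Relative throughput of a multi-core CPU under Amdahl's law.

    per_core_score:    assumed single-thread score of one core (hypothetical units)
    cores:             number of identical cores
    parallel_fraction: share of the workload that can run in parallel (0..1)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial = 1.0 - parallel_fraction
    # Amdahl's law: the serial part is not sped up by adding cores.
    speedup = 1.0 / (serial + parallel_fraction / cores)
    return per_core_score * speedup


# Assumed, illustrative configurations; these are not benchmark results.
OCTA_LITTLE = {"per_core_score": 1.0, "cores": 8}  # e.g. eight A53-class cores
DUAL_BIG = {"per_core_score": 2.0, "cores": 2}     # e.g. two A72-class cores

if __name__ == "__main__":
    for p in (0.0, 0.5, 0.95):
        octa = effective_throughput(parallel_fraction=p, **OCTA_LITTLE)
        dual = effective_throughput(parallel_fraction=p, **DUAL_BIG)
        print(f"parallel={p:.2f}  octa-core={octa:.2f}  dual-core={dual:.2f}")
```

Under these assumed scores, the dual-core configuration comes out ahead for single-threaded and moderately parallel work, and the octa-core one only pulls ahead when almost all of the workload can run in parallel.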
Process technology, quoted in nanometers (nm), nominally refers to the feature size of the circuits within a chip. Currently, the mainstream processes are 28nm and 20nm, with the most advanced being 16/14nm. A more advanced process can reduce a processor’s energy consumption and heat, shrink the chip, and improve performance. With this in mind, it is easy to see why the 20nm Snapdragon 810 suffers from heat and power consumption problems, while Samsung’s 14nm Exynos 7420 performs excellently.
To elaborate further, semiconductor process technology can also be divided into the 2D (planar) MOSFET (Metal-Oxide-Semiconductor Field-Effect Transistor) structure and the 3D FinFET (Fin Field-Effect Transistor) structure. FinFET mainly changes the gate that controls the current, which significantly improves control over the circuit and reduces leakage current.
Even a nominally identical process is revised over time as the technology matures. Last year’s controversy over the Apple A9’s two foundries, for instance, showed Samsung’s theoretically more advanced 14nm process performing worse than TSMC’s 16nm. The reason is that Samsung jumped directly from 28nm to 14nm LPE, while TSMC moved step by step from 28nm to 20nm to 16nm, which gave it a more mature technology with better yield and leakage control. Samsung has clearly learned from this, however, and recently launched its second-generation 14nm LPP process, which improves performance by 15% and reduces power consumption by 15%. TSMC has likewise introduced an improved version of its 16nm process, FinFET+, so seemingly identical processes can differ depending on the specific version.
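As a back-of-the-envelope illustration of what such a process revision can mean, the sketch below turns the two 15% figures quoted above into a performance-per-watt ratio. Treating both gains as applying at the same time is an assumption; such figures are often quoted as alternatives (faster at the same power, or lower power at the same speed), so the combined number is an optimistic reading. The function name is hypothetical.

```python
def perf_per_watt_gain(perf_gain: float, power_reduction: float) -> float:
    """Ratio of new to old performance-per-watt.

    perf_gain:       fractional performance increase (0.15 means +15%)
    power_reduction: fractional power decrease (0.15 means -15%)
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    return (1.0 + perf_gain) / (1.0 - power_reduction)


if __name__ == "__main__":
    both = perf_per_watt_gain(0.15, 0.15)       # both gains at once (assumed)
    speed_only = perf_per_watt_gain(0.15, 0.0)  # faster at the same power
    power_only = perf_per_watt_gain(0.0, 0.15)  # same speed at lower power
    print(f"both: {both:.3f}x  speed only: {speed_only:.3f}x  power only: {power_only:.3f}x")
```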
By now, everyone should have a general idea of how to judge a CPU. In summary, architecture and process matter more than core count and clock speed: a high-performance architecture far outperforms a low-power one, and a more advanced process yields a stronger processor.
GPU Determines Gaming Performance
A good CPU is just the first step in creating a high-end chip. While it ensures smooth and stable operation, in an era where mobile entertainment is increasingly important—especially among young people who prefer to use smartphones instead of handheld consoles for gaming—the GPU is undoubtedly a crucial component.
The GPU, or Graphics Processing Unit, determines graphics rendering capability and therefore whether games run smoothly. Currently, the main mobile GPU vendors are ARM, Qualcomm, Imagination Technologies, and NVIDIA. Qualcomm’s Adreno and NVIDIA’s Maxwell GPUs are used only in their own Snapdragon and Tegra chips, so only the ARM Mali and Imagination PowerVR series are available to other manufacturers.
Although there are frequent rumors about Samsung and Apple developing their own GPUs, to date every other chip manufacturer has had to license ARM’s or Imagination’s designs and integrate them into its chips. For instance, Huawei’s HiSilicon and Samsung’s Exynos use the ARM Mali series, while Imagination’s PowerVR is commonly found in Apple’s A series and in MediaTek chips.
Because manufacturers adopt different architectures, such as unified and separate rendering (shader) designs, mobile GPUs cannot be judged on architecture or core count alone. The industry typically assesses performance by triangle output rate and pixel fill rate. Of course, with the arrival of various benchmarking tools, comparisons have become more intuitive. If you find it hard to remember the rankings, understanding each manufacturer’s naming conventions will give you a rough idea.
Generally speaking, mobile GPU names consist of letters and numbers, such as Adreno 530, Mali-T880, and PowerVR GT7600. The first digit typically indicates the generation, with a larger number meaning a newer generation; the second digit indicates positioning, with a larger number meaning stronger performance. The three GPUs above are the latest high-end models already in commercial use. ARM’s Mali is a special case, however: in addition to the model, the number of cores (given by the trailing MPx, where x is the core count) must also be considered. For example, although both the Kirin 950 and the Exynos 8890 use the latest Mali-T880, the Kirin 950 has only 4 cores while the Exynos 8890 has 12, resulting in a significant performance difference.
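The naming convention described above can be sketched as a small parser. This is a minimal illustration, assuming names shaped like the three examples in the text plus an optional "MPx" suffix; the `GpuName` type and `parse_gpu_name` function are hypothetical, and real product names do not always follow the pattern.

```python
import re
from typing import NamedTuple, Optional


class GpuName(NamedTuple):
    family: str           # e.g. "Adreno", "Mali", "PowerVR"
    generation: int       # first digit of the model number
    tier: int             # second digit of the model number
    cores: Optional[int]  # from an "MPx" suffix, if present


# Family letters, an optional separator and series letters (such as "T" or
# "GT"), the model digits, then an optional "MP<n>" core-count suffix.
_PATTERN = re.compile(
    r"^(?P<family>[A-Za-z]+)[\s-]*[A-Za-z]*(?P<model>\d{2,})(?:\s*MP(?P<cores>\d+))?$"
)


def parse_gpu_name(name: str) -> GpuName:
    """Split a GPU name into family, generation digit, tier digit, and cores."""
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognised GPU name: {name!r}")
    model = match.group("model")
    cores = match.group("cores")
    return GpuName(
        family=match.group("family"),
        generation=int(model[0]),
        tier=int(model[1]),
        cores=int(cores) if cores is not None else None,
    )


if __name__ == "__main__":
    for name in ("Adreno 530", "Mali-T880 MP4", "Mali-T880 MP12", "PowerVR GT7600"):
        print(name, "->", parse_gpu_name(name))
```

Comparing the generation and tier digits only makes sense within one vendor’s line-up; across vendors, benchmarks remain the more reliable guide.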
Phone Calls Depend on Baseband Performance
The core function of a mobile phone is still communication. Here even the most powerful CPU and GPU are of no help; it ultimately comes down to the baseband.
The baseband directly determines which network standards the phone supports. When we make calls, browse the internet, or send messages, the upper-layer processing system issues commands to the baseband, which processes and executes them. One could say that the baseband is the most technically demanding part of the entire chip. Manufacturers such as Texas Instruments, STMicroelectronics, and NVIDIA have been squeezed out of the market because their SoCs lacked an integrated baseband or relied on outdated technology.
The most intuitive way to judge a baseband is by the number of standards and frequency bands it supports. A phone that claims compatibility with all networks generally uses a high-end baseband, supporting every standard from 4G LTE through 3G WCDMA/CDMA2000/TD-SCDMA to 2G GSM/EDGE. Users only need to buy the phone and insert their SIM card, without worrying about compatibility or swapping cards.
In addition to standard support, another important feature of basebands in the 4G era is the LTE UE category, which indicates the transmission rates that the UE (user equipment, that is, the phone) can support and is usually written as Cat x. The larger the x, the higher the upload and download speeds. Unfortunately, the Kirin 950 uses the older Balong 720 baseband, which supports only Cat 6, rather than the latest Balong 750, which supports Cat 12/13.
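To give a feel for what the category number means in practice, the sketch below maps a few LTE categories to their commonly quoted nominal peak downlink rates and computes a best-case download time. The rates are rounded headline figures, real-world throughput is far lower, and the table and function names are hypothetical.

```python
# Nominal peak downlink rates in Mbit/s for a few LTE UE categories.
# Rounded headline values; actual speeds depend on the network and signal.
PEAK_DOWNLINK_MBPS = {
    4: 150,
    6: 300,
    9: 450,
    12: 600,
}


def best_case_download_seconds(size_megabytes: float, category: int) -> float:
    """Time to fetch a file at the category's nominal peak downlink rate."""
    try:
        rate_mbps = PEAK_DOWNLINK_MBPS[category]
    except KeyError:
        raise ValueError(f"no rate recorded for Cat {category}") from None
    # 1 megabyte = 8 megabits.
    return size_megabytes * 8 / rate_mbps


if __name__ == "__main__":
    for cat in (6, 12):
        seconds = best_case_download_seconds(750, cat)
        print(f"Cat {cat}: 750 MB in {seconds:.1f} s at peak rate")
```

At these nominal rates, a Cat 12 downlink halves the best-case transfer time of a Cat 6 one.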
Multimedia Performance Depends on SoC Overall Performance
Multimedia performance itself breaks down into several parts. First is the video codec, which determines how many formats the phone can encode and decode. For example, ARM’s Mali-V550 and the Snapdragon 820 support encoding and decoding 4K video in the H.265 (HEVC) format. Some SoCs, such as MediaTek’s, also integrate audio decoders.
Next is the display controller, which affects the maximum resolution and frame rate that the phone can support. Recently, ARM released the Mali-DP650, which can easily handle 4K resolution at 60FPS, suitable for both mobile devices and 4K TVs.
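To see why 4K at 60FPS is a meaningful figure for a display controller, the pixel data it must read out every second can be estimated as follows. This is a rough sketch that assumes an uncompressed 32-bit (4-byte) frame buffer and a single layer, and ignores blanking intervals; the function name is hypothetical.

```python
def scanout_bytes_per_second(width: int, height: int, fps: int, bytes_per_pixel: int = 4) -> int:
    """Raw frame-buffer bytes a display controller reads out per second.

    Assumes one uncompressed layer; bytes_per_pixel=4 corresponds to 32-bit colour.
    """
    return width * height * fps * bytes_per_pixel


if __name__ == "__main__":
    uhd = scanout_bytes_per_second(3840, 2160, 60)
    fhd = scanout_bytes_per_second(1920, 1080, 60)
    print(f"4K @ 60 fps:    {uhd / 1e9:.2f} GB/s")
    print(f"1080p @ 60 fps: {fhd / 1e9:.2f} GB/s")
```

Under these assumptions, 4K needs four times the bandwidth of 1080p at the same frame rate, roughly 2 GB/s for a single layer.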
Finally, the image signal processor (ISP) is responsible for processing the data returned by the image sensor, influencing various basic and advanced imaging functions such as white balance, focusing, exposure, noise reduction, face recognition, and motion compensation. The most well-known solution in this field is Fujitsu’s Milbeaut, favored by Samsung and Smartisan. However, adding an ISP separately can increase power consumption, heat, and design complexity, so manufacturers prefer to integrate ISP technology into SoCs, as seen in Qualcomm, HiSilicon, and Apple’s chips.
Fun Reading
Why Mobile Processors Don’t Have Odd Number of Cores
It’s not just mobile processors; computer processors typically have an even number of cores as well. Odd-core processors do appear occasionally, but they are usually made by masking one core of a 4-core chip at the hardware level to form a 3-core processor. The masked core is kept on standby to reduce power consumption, so the chip is still essentially a 4-core processor.
From a technical perspective, chips are generally rectangular, and during manufacturing they are cut along horizontal and vertical lines. A rectangle is the shape that orthogonal lines naturally produce. As circuits scale up, a rectangular layout also makes it easier to find the regularity needed to control and modify circuits carrying a great many electrical signals. When designing a processor, engineers typically draw one core and then mirror it several times to create the others, including fully symmetrical control and clock circuits. As a result, the timing of every core in a multi-core processor is identical.
Moreover, since a processor is square-shaped, the cores are generally square as well. Cutting a square into equal halves or quarters is easy to arrange, but cutting it into an odd number of equal parts is much harder. This is one potential reason for even core counts. Native odd-core processors do exist, of course, and they too reflect a balance between power consumption and performance. In addition, chip manufacturers often package two cores together as a module, which is another reason chips tend to have an even number of cores.
Beyond these, cost, market demand, performance, and other factors may also play a part.