With little knowledge of hardware, let’s learn from Master Yang’s article “Do You Understand ARM CPU Architecture?!”.

Introduction

Recently I was asked to install MySQL 8.0 on an ARM machine. I had heard of ARM CPUs but had never deployed on one in practice, and I had only a vague idea of what the ARM CPU architecture actually is. So today we focus on learning about it, and this article is the result.

As we all know, with the spread of the open-source Linux operating system, many enterprise Linux systems run on servers with x86 CPUs; this is common knowledge. Ask about the ARM CPU architecture, however, and many people cannot explain it clearly. Today, let’s discuss the ARM CPU architecture.

My understanding of x86 and ARM CPU architectures is as follows:

Generally, when people refer to Linux, they mean x86 Linux. ARM is a CPU architecture different from x86, with a different instruction set, so it requires a different software build environment. Compiled software is generally not interchangeable between the two and typically has to be recompiled or ported.
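Before installing anything, it helps to confirm which architecture a machine actually has. The following is a minimal sketch, not a complete detection routine: `platform.machine()` is a standard-library call, while the `ARCH_FAMILIES` table and the `arch_family` helper are names made up here for illustration and cover only a few common machine strings.

```python
import platform

# Hypothetical lookup table (an assumption for this sketch): a few common
# platform.machine() strings mapped to their instruction set family.
ARCH_FAMILIES = {
    "x86_64": "x86",
    "amd64": "x86",
    "i386": "x86",
    "i686": "x86",
    "aarch64": "arm",
    "arm64": "arm",
    "armv7l": "arm",
}


def arch_family(machine):
    """Map a machine string to 'x86', 'arm', or 'unknown'."""
    return ARCH_FAMILIES.get(machine.lower(), "unknown")


if __name__ == "__main__":
    machine = platform.machine()
    print(f"{machine} -> {arch_family(machine)}")
```

A binary package built for one family will not run on the other, so the result tells you which build of a package to download.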

x86 is the classic CISC instruction set: complex and feature-rich. In the traditional view, its complex instructions are executed serially, which means lower execution efficiency per instruction, but its cost-performance is outstanding, making it the mainstream processor instruction set for consumer devices. Intel’s and AMD’s consumer processors are both based on x86, the representative CISC instruction set.

To understand it thoroughly, we must first go back to what a CPU is.

The CPU (Central Processing Unit) mainly consists of three parts: the arithmetic logic unit, the control unit, and registers.

The arithmetic logic unit performs computations, the control unit issues the control signals needed to execute each instruction, and the registers hold temporary data and the intermediate results of computations so that they can be accessed at high speed.

The CPU has four main functions: processing instructions, executing operations, controlling timing, and processing data.

The instruction set is the set of operations that the CPU hardware implements; it is built into the CPU and defines what software can ask the CPU to do. Extensions to the instruction set allow the CPU to run certain workloads more efficiently.

Intel mainly has x86, EM64T (Intel’s name for its x86-64 implementation), MMX, SSE, SSE2, SSE3, SSSE3 (Supplemental SSE3), SSE4.1, SSE4.2, AVX, AVX2, AVX-512, VMX (virtualization extensions), and other instruction sets.

AMD mainly has x86, x86-64 (AMD64), 3DNow!, and SSE4a instruction sets.
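On Linux, the extensions a given CPU supports are listed in `/proc/cpuinfo`, on a `flags` line for x86 and a `Features` line for ARM. The sketch below parses that text; the helper name `parse_cpu_features` and the two sample strings are invented for illustration, and real files contain many more fields.

```python
def parse_cpu_features(cpuinfo_text):
    """Collect feature names from 'flags' (x86) or 'Features' (ARM) lines."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            features.update(value.split())
    return features


# Shortened, made-up samples in the style of /proc/cpuinfo.
SAMPLE_X86 = "model name\t: Example CPU\nflags\t\t: fpu sse sse2 avx avx2\n"
SAMPLE_ARM = "processor\t: 0\nFeatures\t: fp asimd aes crc32\n"

if __name__ == "__main__":
    print(sorted(parse_cpu_features(SAMPLE_X86)))
    print(sorted(parse_cpu_features(SAMPLE_ARM)))
```

On a real machine the same function can be fed the contents of `/proc/cpuinfo`, for example to check for `avx2` before enabling an optimized build.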

The capability of a CPU’s instruction set is an important indicator of CPU performance, and instruction set extensions are one of the most effective tools for improving microprocessor efficiency.

At present, instruction sets, and the CPUs that implement them, fall into two mainstream types: Complex Instruction Set Computing (CISC) and Reduced Instruction Set Computing (RISC).

CISC is represented by Intel and AMD’s x86 CPUs; RISC is represented by ARM and IBM Power.

To understand the ARM architecture, we first need to understand what CISC and RISC are.

This is similar to learning about databases by starting from the three design philosophies of share-everything, share-disk, and share-nothing: to understand CISC and RISC, one must first understand the design philosophy behind each.

1. CISC Design Philosophy

Early CPUs were all based on the CISC architecture, whose design philosophy was to accomplish each computational task with as few machine instructions as possible. To make programming easier and programs faster, hardware engineers kept adding instructions that perform complex functions, along with a variety of flexible addressing modes. Some instructions even map directly onto high-level language statements. As a result, the hardware design became increasingly complex and costly.

To carry out complex operations, the microprocessor offers programmers a wide range of registers and machine instructions. The more powerful instructions are implemented as microprograms stored in Read-Only Memory (ROM): after decoding each instruction, the CPU executes a series of primitive operations that together complete the required function. This design is known as the Complex Instruction Set Computer (CISC) architecture. CISC computers generally have at least 300 instructions, and some have more than 500.

CISC architecture increases the complexity of CPU structure and the requirements for CPU technology, but it is very beneficial for compiler development.

2. RISC Design Philosophy

CPUs using a complex instruction set are well suited to supporting high-level languages, which benefits computer performance. However, it was found that the CISC instruction system is hard to implement because of its complexity and may even reduce system performance. Years of effort spent on ever more complex instruction systems produced processors with many instructions that are rarely used in practice. At the same time, a complex instruction system inevitably brings structural complexity, which increases design time and cost and makes design errors more likely. In actual programs, roughly 80% of the instructions executed come from only about 20% of the processor’s instruction set. In fact, the most frequently used instructions are the simplest ones, such as load, store, and add.

Following this line of thought, the idea of reducing the instruction set emerged: the instruction system should include only a small number of frequently used instructions, plus the few instructions needed to support operating systems and high-level languages.
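The 80/20 observation can be checked on any instruction trace by counting how often each opcode is executed. The following is a small sketch under stated assumptions: the trace is made-up data chosen only to show the calculation, and `opcodes_covering` is a hypothetical helper, not a standard tool.

```python
from collections import Counter


def opcodes_covering(trace, share=0.8):
    """Return the most frequent opcodes that together reach `share` of the trace."""
    counts = Counter(trace)
    total = len(trace)
    covered = 0
    chosen = []
    for opcode, n in counts.most_common():
        if covered >= share * total:
            break
        chosen.append(opcode)
        covered += n
    return chosen


# Made-up trace of 100 executed instructions (illustrative numbers only).
TRACE = (
    ["load"] * 40 + ["store"] * 25 + ["add"] * 20 + ["branch"] * 10
    + ["mul"] * 3 + ["div"] * 1 + ["bcd_adjust"] * 1
)

if __name__ == "__main__":
    hot = opcodes_covering(TRACE)
    print(hot, f"{len(hot)} of {len(set(TRACE))} distinct opcodes")
```

In this invented trace, three of the seven distinct opcodes account for more than 80% of the executed instructions, which is the pattern the RISC designers set out to exploit.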

The computers developed following this idea are called Reduced Instruction Set Computers (RISC).

In simple terms:

CISC completes complex instructions by operating on memory, registers, and the arithmetic logic unit. In implementation, each complex instruction is translated into a microprogram, which is written into the CPU’s microcode memory (the control store) when the chip is manufactured. A microprogram consists of several micro-instructions (also called microcode), so executing a complex instruction really means executing a microprogram.
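The idea of an instruction being a microprogram can be sketched with a toy interpreter. Everything here is hypothetical: the `ADDM` instruction (add one memory cell into another), the four micro-operation names, and the `MICROCODE` table are invented for illustration and do not correspond to any real CPU.

```python
# Hypothetical microcode table: one complex instruction expands into a
# fixed sequence of primitive micro-operations.
MICROCODE = {
    "ADDM": ["read_dst", "read_src", "alu_add", "write_dst"],
}


def execute(instr, dst, src, memory):
    """Run one complex instruction by stepping through its microprogram."""
    a = b = result = 0
    trace = []
    for micro_op in MICROCODE[instr]:
        if micro_op == "read_dst":
            a = memory[dst]          # fetch first operand from memory
        elif micro_op == "read_src":
            b = memory[src]          # fetch second operand from memory
        elif micro_op == "alu_add":
            result = a + b           # compute in the ALU
        elif micro_op == "write_dst":
            memory[dst] = result     # store the result back to memory
        trace.append(micro_op)
    return trace


if __name__ == "__main__":
    mem = {0x10: 2, 0x14: 3}
    steps = execute("ADDM", 0x10, 0x14, mem)
    print(steps, mem[0x10])
```

The programmer writes a single instruction, while the hardware performs four internal steps on its behalf.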

RISC was designed in response to the complexity of CISC CPUs: it keeps the instructions that can be completed within a single CPU cycle, which reduces CPU complexity and leaves the complexity to the compiler. The RISC architecture requires software to specify each step of an operation.
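What "software specifies each step" means can be illustrated with a toy load/store machine. This is a sketch only: the three-instruction set (`LOAD`, `ADD`, `STORE`), the register names, and the `run` function are assumptions made for this example, not a model of the real ARM instruction set.

```python
def run(program, memory):
    """Interpret a tiny load/store program; only LOAD and STORE touch memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":        # LOAD rd, addr
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "ADD":       # ADD rd, rs1, rs2 (registers only)
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "STORE":     # STORE rs, addr
            rs, addr = args
            memory[addr] = regs[rs]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Adding one memory cell into another takes four explicit instructions,
# which a compiler would normally generate.
PROGRAM = [
    ("LOAD", "r1", 0x10),
    ("LOAD", "r2", 0x14),
    ("ADD", "r3", "r1", "r2"),
    ("STORE", "r3", 0x10),
]

if __name__ == "__main__":
    mem = {0x10: 2, 0x14: 3}
    run(PROGRAM, mem)
    print(mem[0x10])
```

Each instruction does one simple thing, so the hardware stays simple, but the program is longer and the work of sequencing the steps moves to the compiler.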

The RISC architecture can reduce CPU complexity and allow the production of more powerful CPUs at the same technology level, but it has higher requirements for compiler design.

At this point, one can naturally draw an intuitive conclusion:

In CISC, the microprogram behind an instruction generally runs as an atomic operation and cannot be interrupted midway, so an interrupt has to wait until the current instruction finishes;

RISC instructions complete in a single CPU cycle, so the next point at which an interrupt can be taken is never far away, and theoretically RISC can respond to interrupts more quickly.

With the design philosophies understood, let’s compare CISC and RISC from the hardware and software perspectives.

1. From the hardware perspective:

CISC uses a variable-length instruction set: the instruction stream must first be split into individual instructions, so more work is needed before a single instruction can execute.

RISC, on the other hand, executes a fixed-length, simplified instruction set, allowing faster execution and more stable performance. In terms of parallel processing RISC therefore has a clear advantage over CISC: instruction execution can be divided into several stages, each handled by a different hardware unit, so that multiple instructions are in flight at the same time. Because the instruction set is simplified, the chip is also simpler and cheaper to manufacture.
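The decoding difference can be sketched with two toy encodings. Both are assumptions made for illustration: the fixed format uses 4 bytes per instruction, and the variable format stores each instruction's total length in its first byte, which is far simpler than real x86 encoding.

```python
def fixed_boundaries(code, width=4):
    """Fixed-length code: every start offset is known up front."""
    return list(range(0, len(code), width))


def variable_boundaries(code):
    """Variable-length code: each instruction must be inspected to find the next.

    Toy rule (hypothetical): the first byte of an instruction is its length.
    """
    offsets = []
    pos = 0
    while pos < len(code):
        length = code[pos]
        if length == 0:
            raise ValueError("invalid zero-length instruction")
        offsets.append(pos)
        pos += length
    return offsets


if __name__ == "__main__":
    print(fixed_boundaries(bytes(12)))
    print(variable_boundaries(bytes([1, 3, 0, 0, 2, 0])))
```

With fixed-length instructions the offsets can be computed independently, so several instructions can be fetched and decoded side by side; with variable-length instructions each boundary depends on the one before it.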

2. From the software perspective:

Because it developed early, CISC has a mature software ecosystem, with many software vendors supporting CISC-based PCs and servers, for example DOS and Microsoft applications.

RISC emerged later and its ecosystem is weaker: its applications are not as rich and diverse as those for CISC. Given the long-term investment of manpower and resources in existing CISC applications that already meet market demand, it is understandable that vendors are reluctant to invest heavily in applications that run on RISC.

Now, at this point, everyone can expand their thinking slightly and predict the focus of domestic CPU development:

Domestic CPU manufacturing is limited by lithography technology. To reduce CPU complexity, the focus may fall on the RISC architecture; however, the simpler hardware of RISC comes at the cost of more complex compilers and application software, which drives up development costs, so upstream application suppliers are reluctant to follow. Considering the current international tensions, though, the R&D of RISC CPUs has great significance for both national defense and civilian life.

For enterprise servers, the old saying holds: RISC is cheap to buy but expensive to use (code compatibility work and lengthy development cycles), while CISC is expensive to buy but cheap to use (the microprograms are already built in and optimized).

Now, let’s take a systematic look at the advantages and disadvantages of CISC and RISC:

CISC architecture instruction features:

1. Uses microcode. Instructions are executed as microprograms held in microcode memory, which is much faster than main memory. A newly designed processor can execute the same instruction set with only a few additional transistors, and programs for new instructions can be written quickly.

2. Large instruction set. This reduces the number of lines of code a program needs and eases the burden on programmers. Instructions that correspond to high-level language constructs include two-operand formats and register-to-register, register-to-memory, and memory-to-register instructions.

Advantages and disadvantages of CISC architecture:

1. Advantages: Effectively shortens the design time for new microcode instructions, allowing designers to achieve upward compatibility in CISC architecture machines. New systems can use a superset of instructions that includes those from earlier systems, thus allowing the same software used on earlier computers to be used. Additionally, the formats of microprogram instructions match high-level languages, so compilers do not necessarily have to be rewritten.

2. Disadvantages: With each generation, the instruction set and the chip become more complex to design than their predecessors. Different instructions take different numbers of clock cycles to complete, and the slower instructions drag down the overall execution efficiency of the machine.

RISC architecture instruction features:

1. Simplified instruction set: Contains simple, basic instructions that can be combined to form complex instructions.

2. Same-length instructions: Every instruction has the same length, so it can be fetched in a single operation.

3. Single machine cycle instructions: Most instructions complete in one machine cycle, which allows the processor to overlap the execution of a series of instructions in a pipeline.

Advantages and disadvantages of RISC architecture:

1. Advantages: With the same chip technology and clock rate, a RISC system is said to run 2 to 4 times as fast as a CISC system. Because the instruction set of a RISC processor is simplified, its memory management unit, floating-point unit, and so on can all be placed on the same chip. RISC processors are simpler to design than their CISC counterparts, take less time to develop, and can adopt more advanced technologies, leading to faster next-generation processors.

2. Disadvantages: Operations that need several instructions depend heavily on the compiler, so developers must choose an appropriate compiler carefully, and the resulting code can become very large. In addition, RISC processors require faster memory, which is typically integrated within the processor as the L1 cache.
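The trade-off between "more instructions" and "fewer cycles per instruction" can be made concrete with the standard relation: execution time = instruction count × cycles per instruction ÷ clock rate. The sketch below uses entirely made-up numbers, labeled as such in the code; they are not measurements of any real processor.

```python
def execution_time(instruction_count, cpi, clock_hz):
    """Seconds needed: instructions x cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_hz


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Hypothetical workloads at the same 1 GHz clock (illustrative values only):
# the CISC-style program is shorter but averages 4 cycles per instruction,
# the RISC-style program is 50% longer but averages 1.2 cycles.
CISC_TIME = execution_time(1_000_000, 4.0, 1e9)
RISC_TIME = execution_time(1_500_000, 1.2, 1e9)

if __name__ == "__main__":
    print(f"CISC-style: {CISC_TIME * 1e3:.2f} ms")
    print(f"RISC-style: {RISC_TIME * 1e3:.2f} ms")
    print(f"speedup: {speedup(CISC_TIME, RISC_TIME):.2f}x")
```

With these assumed figures the longer RISC-style program still finishes in less than half the time, because the drop in cycles per instruction outweighs the growth in code size. Different assumptions would give a different ratio.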

In summary, to further compare the differences between CISC and RISC, one can analyze the following points:

1. Instruction formation: Because its instructions are complex, CISC uses a microcoded control unit, while about 90% of RISC instructions are implemented directly in hardware, with only about 10% built in software from combinations of them. The execution time of RISC is therefore shorter, but RISC needs relatively more ROM space to hold the program, while RAM usage depends on the application.

2. Addressing modes: CISC requires more addressing modes, while RISC has only a few addressing modes. Thus, when calculating the effective address in memory, CISC occupies more bus cycles.

3. Instruction execution: CISC instructions vary in length and in the number of cycles they need, while RISC is the opposite, with fixed-length instructions and uniform cycle counts. This suits pipelined designs, which aim to complete one instruction per cycle on average.

Clearly, RISC is simpler than CISC in design. Because CISC instructions go through many execution steps, idle circuit units spend more time waiting, which is not conducive to parallel processing. In terms of performance, therefore, RISC has an advantage over CISC, but the simplification of RISC instructions leads to larger application code that requires more storage space, and the variety of instructions limits the promotion of RISC.
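Why uniform instructions lead to roughly one instruction per cycle can be seen from the cycle count of an ideal pipeline. The sketch below assumes no stalls or hazards, which real processors do not achieve; the function names are made up for this example.

```python
def unpipelined_cycles(n_instructions, n_stages):
    """Each instruction passes through every stage before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    """Ideal pipeline: the first instruction takes n_stages cycles,
    then one more instruction completes on every following cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def average_cpi(n_instructions, n_stages):
    """Average cycles per instruction in the ideal pipeline."""
    return pipelined_cycles(n_instructions, n_stages) / n_instructions


if __name__ == "__main__":
    n, k = 100, 5
    print(unpipelined_cycles(n, k), pipelined_cycles(n, k), average_cpi(n, k))
```

As the number of instructions grows, the average approaches one cycle per instruction. This only works when every instruction fits the same stage structure, which is what fixed-length, single-cycle instructions provide.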

As mentioned at the beginning of the article, CISC is represented by Intel and AMD’s x86 CPUs, while RISC is represented by ARM and IBM Power.

Next, let’s look at some specific application scenarios.

ARM CPUs are based on RISC. They are characterized by fixed instruction lengths, high execution efficiency, and low cost, and are targeted at embedded platforms. The architecture simplifies the hardware logic and reduces the number of transistors, which lowers power consumption; pipeline control is not complex either, which reduces the transistor count further. ARM mainly targets lightweight programs with clearly defined purposes, and is therefore used mostly in mobile devices.

In contrast, the x86 CPUs from Intel and AMD are based on CISC. Their hardware logic is complex, with multiple parallel pipelines, hyper-threading, virtualization, and so on, which means high complexity and a large number of transistors. They are positioned mainly for compute-intensive scenarios such as multimedia editing and scientific computing.

In simple terms:

ARM is for low power consumption, while x86 is for high performance.
