Source | Programmer cxuan
Author | cxuan
Introduction
The C language is an abstract and procedural language that is widely used in low-level development. C plays an irreplaceable role in computer architecture and can be said to be the foundation of programming. In other words, regardless of which language you learn, C should be placed at the top of your learning list. The following image better illustrates the importance of C language.

As we can see, C is a low-level language, a system-level language. Operating systems are written in C, such as Windows, Linux, and UNIX. If other languages are the glamorous exterior, then C is the soul, always so plain and unadorned.
Features of C Language
So, since C language is so important, what is worth learning about it? We should not learn it just because it is important; we care more about what we can learn from it and what we can gain.
Design of C Language
The C language was designed in 1972 by Dennis Ritchie at Bell Labs, building on Ken Thompson's earlier B language, while the two were developing the UNIX operating system. C is a popular language that combines computer science theory with engineering practice and lets users program and design in a modular way.
Computer science (CS) is the discipline that systematically studies the theoretical foundations of information and computation and the practical techniques for implementing and applying them in computer systems.
Efficiency of C Language
C is an efficient language designed to take full advantage of the hardware: C programs are compact, run quickly, and give the programmer direct control over how memory is used.
Portability of C Language
C is a portable language: a C program written on one system can usually be recompiled and run on another with few or no changes, which greatly reduces the work of porting.
Characteristics of C Language
- C is a concise language; because C is designed closer to the low level, it does not require many features that high-level languages like Java and C# have, and the requirements for writing programs are not very strict.
- C has structured control statements; it is a structured language that provides control statements with structured characteristics, such as for loops, if…else statements, and switch statements.
- C has rich data types, including the traditional character, integer, floating-point, and array types, as well as types that many other programming languages do not expose, such as pointers.
- C can directly read and write memory addresses, thus achieving the main functions of assembly language and directly manipulating hardware.
- C is fast, and the generated target code has high execution efficiency.
Now let’s illustrate C language with a simple example.
Beginner C Language Program
Let’s look at a very simple C program. I am using a Mac, so I am developing with Xcode. I think the tool doesn’t matter; everyone should use what they are comfortable with.
The first C program:
#include <stdio.h>

int main(int argc, const char * argv[]) {
    printf("Hello, World!\n");
    printf("My Name is cxuan \n");
    return 0;
}
You may not know what this code means, but don’t worry; let’s run it and see the result.

This program outputs Hello, World! and My Name is cxuan. The last line is the execution result of the program, indicating whether there are errors. Now let’s explain the meaning of each line of code.
First, the first line #include <stdio.h> includes another file. This line tells the compiler to include the contents of stdio.h in the current program. stdio.h is a standard part of the C compiler software package that provides keyboard input and display output.
What is the C standard software package? C is a general-purpose, procedural, imperative programming language developed by Dennis M. Ritchie in 1972. The C standard library is the set of built-in functions, constants, and header files that ships with the language, such as <stdio.h>, <stdlib.h>, <math.h>, etc. Its documentation serves as a reference manual for C programmers.
We will introduce stdio.h later; for now, just know what it is.
Below stdio.h is the main function.
A C program can contain one or more functions; functions are fundamental to C, just as methods are the basic building blocks of Java. main() is a function name, and int indicates that main returns an integer. In the common form int main(void), void indicates that main() takes no parameters; the listing above instead uses the form with argc and argv, which receives command-line arguments. We will explain these in detail later; just remember that int and void are part of the standard ANSI C definition of main() (if using a compiler prior to ANSI C, please ignore void).
Then there is /* A simple C language program */, which is a comment. Comments are written between /* and */, and they improve the readability of the program.
Note: Comments are only for helping programmers understand the meaning of the code; the compiler will ignore comments.
Next is {, which indicates the start of the function body, while the last right brace } indicates the end of the function body. The code written between { } is called the code block.
int number indicates that a variable named number will be used, and number is of int integer type.
number = 11 indicates that the value 11 is assigned to the variable number.
printf("Hello, world!\n"); is a function call; this statement uses the printf() function to display Hello, world! on the screen. printf() is one of the standard library functions in C and writes the program's output to the display. The \n code is a newline, which moves the cursor to the start of the next line.
The next printf() line works the same way, so we won’t elaborate. The last printf() line is more interesting: it contains %d, a format specifier that tells printf() to print an integer argument in decimal at that position.
The last line of the code block is return 0, which can be seen as the end of the main function, and the last line is the code block }, which indicates the end of the program.
Alright, we have now completed our first C language program. Have you gained a deeper understanding of C? Probably not… this is just the beginning; keep learning.
Now, we can summarize the several components of a C language program, as shown in the following image:

Execution Flow of C Language
C is considered a high-level language because it is written in a form that humans can read and understand. However, in order for the hello.c program to run on the system, each C statement must be translated by other programs into a sequence of low-level machine language instructions. These instructions are packaged into an executable object program and stored as a binary disk file. Object programs are also known as executable object files.
In UNIX systems, the conversion from source file to object file is performed by the compiler driver.
gcc -o hello hello.c
The gcc compiler driver reads the source file hello.c and translates it into an executable file hello. This translation process can be illustrated as follows:

This is the complete execution process of the hello world program, which involves several core components: preprocessor, compiler, assembler, linker. Let’s break them down one by one.
- Preprocessing phase: The preprocessor modifies the original C program according to the directives that begin with the # character. The #include <stdio.h> directive tells the preprocessor to read the contents of the system header file stdio.h and insert them into the program as text. The result is another C program, hello.i, which typically carries the .i suffix.
- Compilation phase: The compiler translates the text file hello.i into the text file hello.s, which contains an assembly language program.
- Assembly phase: The assembler (as) translates hello.s into machine instructions and packages them into a relocatable object program stored in the file hello.o. It contains 17 bytes of instruction encoding for the main function; if we open hello.o in a text editor, we will see a bunch of garbled characters.
- Linking phase: Our hello program calls the printf function, which is part of the C standard library provided with the C compiler. The printf function lives in a separate precompiled object file called printf.o, which must be merged with our hello.o. The linker (ld) handles this merging. The result is the hello file, an executable object file (or simply executable file) that is ready to be loaded into memory and executed by the system.
You Need to Understand What the Compilation System Does
For a simple hello program like the one above, we can rely on the compilation system to provide correct and effective machine code. However, for the programmers we mentioned above, there are several features of the compiler you need to know:
- Optimizing program performance: Modern compilers are efficient tools for generating good code. For programmers, you do not need to understand what the compiler does internally to write high-quality code. However, to write efficient C programs, we need to understand some basic machine code and how the compiler translates different C statements into machine code.
- Understanding link-time errors: In our experience, many complex errors are caused by the linking phase, especially when building large software projects.
- Avoiding security holes: In recent years, buffer overflow vulnerabilities have been the culprits behind many network and Internet service issues, so it is necessary to avoid such problems.
System Hardware Composition
To understand what happens when the hello program runs, we first need to have an understanding of the system hardware. Below is a model of Intel system products, which we will explain.

Buses: The entire system runs on a collection of electrical pipelines called buses, which carry bytes of information back and forth between components. Buses are typically designed to transfer fixed-length blocks of bytes, known as words. The number of bytes in a word (the word length) is a fundamental system parameter and varies across systems. Most words today are 4 bytes (32 bits) or 8 bytes (64 bits).

- I/O Devices: Input/Output devices are the system's connection to the external world. The image above shows four I/O devices: a keyboard and mouse for user input, a monitor for user output, and a disk drive for long-term storage of data and programs. Initially, executable programs are stored on the disk. Each I/O device is connected to the I/O bus by either a controller or an adapter. The main difference between controllers and adapters lies in their packaging. A controller is a chip on the I/O device itself or on the system's main printed circuit board (commonly called the motherboard). An adapter is a card inserted into a slot on the motherboard. Regardless of the packaging, the purpose of both is to transfer information between the I/O bus and the I/O device.
- Main Memory: Main memory is a temporary storage device, not a permanent one; disks are permanent storage devices. Main memory holds both the program and the data the processor is working on. Physically, main memory consists of a collection of DRAM (dynamic random access memory) chips. Logically, memory is a linear array of bytes, each with a unique address starting from 0. Generally, the machine instructions that make up a program consist of different numbers of bytes, and the size of the data item corresponding to a C variable depends on its type. For example, on Linux x86-64 machines, short requires 2 bytes, int and float require 4 bytes, and long and double require 8 bytes.
- Processor: The CPU (central processing unit), or simply the processor, is the engine that interprets (and executes) the instructions stored in main memory. At the core of the processor is a word-sized storage device (a register) called the program counter (PC). At any moment, the PC points to a machine language instruction in main memory (that is, it contains the address of that instruction). From the moment the system is powered on until it is powered off, the processor repeatedly executes the instruction pointed to by the program counter and updates the program counter to point to the next instruction. The processor operates according to the instruction model defined by its instruction set architecture. In this model, instructions are executed in strict order, and executing an instruction involves a series of steps: the processor reads the instruction from the memory location pointed to by the program counter, interprets the bits in the instruction, performs the simple operation the instruction specifies, and then updates the program counter to point to the next instruction. The next instruction may or may not be adjacent in memory (for example, a jmp instruction breaks the sequential order).
Here are the basic operations a CPU might carry out at the request of an instruction:
- Load: Copy a byte or word from main memory into a register, overwriting the previous contents of the register.
- Store: Copy a byte or word from a register to a specific location in main memory, overwriting the previous contents of that location.
- Operate: Copy the contents of two registers to the ALU (Arithmetic Logic Unit), perform an arithmetic operation on the two words, and store the result in a register, overwriting the previous contents of that register.
- Jump: Extract a word from the instruction itself and copy it into the program counter (PC), overwriting the previous value of the PC.
The Arithmetic Logic Unit (ALU) is a combination of digital electronic circuits that perform arithmetic and bitwise operations on binary numbers.
Analyzing the Execution Process of the Hello Program
Earlier, we briefly introduced the composition and operation of computer hardware. Now we will formally introduce what happens when running the example program, describing it from a macro perspective without delving into all the technical details.
Initially, the shell program executes its instructions and waits for the user to type a command. When we type the characters ./hello on the keyboard, the shell program reads the characters one by one into registers and then places them into memory, as shown in the following image:

When we press the Enter key, the shell program knows we have finished entering the command. The shell then executes a series of instructions to load the executable hello file, copying the code and data in the object file from disk into main memory.
Using DMA (Direct Memory Access) technology, data can be copied directly from the disk to memory, as follows:

Once the code and data from the hello object file are loaded into main memory, the processor begins executing the machine language instructions in the main routine of hello. These instructions copy the bytes of the string hello, world!\n from main memory to the register file, and from there to the display device, where they are shown on the screen, as shown below:

Cache is Key
We have just walked through the execution of the hello program, and one thing stands out: the system spends a lot of time moving information from one place to another. The machine instructions of the hello program are initially stored on disk. When the program is loaded, they are copied to main memory. When the CPU runs the program, the instructions are copied from memory into the CPU. Similarly, the string data hello, world!\n starts on the disk; it is copied to memory and then to the display device for output. From a programmer’s perspective, this copying is mostly overhead that slows down the real work of the program. Therefore, a major goal of system design is to make these copy operations run as fast as possible.
Due to physical laws, larger storage devices are slower than smaller ones. As the speed gap between the processor and main memory has widened, system designers have added smaller and faster storage devices, known as cache memory, which serve as temporary staging areas for information the processor is likely to need soon. As shown in the following image:

In the image, we have marked the location of the cache. The L1 cache can hold tens of thousands of bytes and is almost as fast to access as the register file. The larger L2 cache is connected to the CPU via a special bus; although the L2 cache is 5 times slower than the L1 cache, it is still 5-10 times faster than main memory. L1 and L2 are implemented using a hardware technology called static random access memory (SRAM). Newer, more powerful systems even have a third level of cache: L1, L2, and L3. By exploiting locality, the tendency of programs to access data and code in localized regions, the system can offer a large amount of storage that is also fast to access.
Again: Details of the Beginner Program
Now, let’s explore the details of the beginner program, gradually understanding the features of C language.
#include<stdio.h>
As we mentioned earlier, #include<stdio.h> is handled before the program is compiled; lines like this are known as preprocessing directives.
Preprocessing directives are processed before compilation and generally start with the # symbol.
Every C compiler package provides the stdio.h file. It declares the input and output functions available to the program, such as printf(). The file name stands for standard input/output header. A collection of declarations included at the top of a C program in this way is referred to as a header file.
The first standard of C was published by ANSI. Although this document was later adopted by the International Organization for Standardization (ISO) and the revised version published by ISO was also adopted by ANSI, the name ANSI C (rather than ISO C) is still widely used. Some software developers use ISO C, while others use Standard C.
C Standard Library
In addition to <stdio.h>, the C standard library also includes the following header files:

<assert.h>
Provides a macro called assert that checks assumptions made by the program and prints a diagnostic message when an assumption is false.
<ctype.h>
Provides functions for testing and mapping characters. These functions take an int argument whose value must be EOF or representable as an unsigned char.
EOF stands for End Of File. It indicates that there is no more data to read from a data source, usually a file or stream. In C, EOF is a negative int value returned by input functions, not a character stored at the end of the text.
<errno.h>
Defines errno, an integer value that system calls and some library functions set to indicate what error occurred.
<float.h>
Contains a set of platform-dependent constants related to floating-point values.
<limits.h>
Describes the properties of the integer types. The macros defined in this header give the limits on the values of types such as char, int, and long.
<locale.h>
Defines locale-specific settings, such as date formats and currency symbols.
<math.h>
Defines various mathematical functions and a macro. The original functions in this library take double arguments and return a double result.
<setjmp.h>
Defines the macro setjmp(), the function longjmp(), and the type jmp_buf, which together bypass the normal function call and return rules.
<signal.h>
Defines the type sig_atomic_t, two functions (signal and raise), and some macros for handling the signals reported during program execution.
<stdarg.h>
Defines the type va_list and three macros used to obtain the arguments of a function when the number of arguments is not known in advance (that is, a variable number of arguments).
<stddef.h>
Defines various types and macros. Most of these definitions also appear in other header files.
<stdlib.h>
Defines four types, some macros, and various general utility functions.
<string.h>
Defines one type, one macro, and various functions for manipulating character arrays.
<time.h>
Defines four types, two macros, and various functions for manipulating dates and times.
main() Function
The main() function sounds like a mischievous child deliberately naming a method main to tell others that he is the center of the world, but in fact, the main() method is indeed the center of the world.
A C language program must start execution from the main() function; aside from the main() function, you can name other functions freely. Typically, the () after main indicates some incoming information; in our example, no information is passed because the input in parentheses is void.
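In addition to the above format, two other forms of main are commonly seen: void main() {} and int main(int argc, char* argv[]) {}.
void main() declares main with no return value and an unspecified parameter list. It is not one of the forms defined by the C standard, although some compilers accept it.
In int main(int argc, char* argv[]) {}, argc is a non-negative value giving the number of arguments passed to the program from the environment in which it runs. argv points to the first element of an array of argc + 1 pointers; the last of these is null, and the preceding ones (if any) point to strings holding the arguments passed from the host environment. If argv[0] is not a null pointer (or, equivalently, if argc > 0), it points to a string holding the program name; if the program name is not available in the host environment, that string is empty.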
Comments
In the program, comments are written between /* and */. Comments have no effect on the program itself but are very useful for programmers; they help us understand the program and allow others to understand the code you write. In our development work, we are very averse to people who do not write comments, which shows how important comments are.

The benefit of C language comments is that they can be placed anywhere, even on the same line as code. Longer comments can span multiple lines; we use /* */ for multi-line comments, while // (standard since C99) marks a comment that runs only to the end of the line. Here are several forms of comments:
// This is a single-line comment
/* Multi-line comment expressed in one line */
/*
Multi-line comment expressed in multiple lines
Multi-line comment expressed in multiple lines
Multi-line comment expressed in multiple lines
Multi-line comment expressed in multiple lines
*/
Function Body
After the header file and the main method, the function body follows (comments generally do not count), which is where you write a lot of code.
Variable Declaration
In our beginner code, we declared a variable named number of type int. This line of code is called a declaration, which is one of the most important features of C language. This declaration accomplishes two things: it defines a variable named number and specifies the type of number.
int is a keyword in C language, indicating a basic data type in C language. Keywords are used for language definitions and cannot be used as variable names.
The number in the example is an identifier, which is the name of a variable, function, or other entity.
Variable Assignment
In the beginner example program, we declared a variable number and assigned it the value of 11. Assignment is one of the basic operations in C language. This line of code means assigning the value 11 to the variable number. When executing int number, the compiler reserves space for the variable number in computer memory, and when executing this assignment expression statement, it stores the value in the previously reserved location. The number can be assigned different values, which is why it is called a variable.

printf Function
In the beginner example program, there are three lines of printf(). This is a standard function in C language. The content in parentheses is the actual parameters passed from the main function to the printf function. Parameters are divided into two types: actual arguments and formal parameters. The content in the parentheses of the printf function we mentioned earlier are actual arguments.
return Statement
In the beginner example program, the return statement is the last statement. The int in int main(void) indicates that the main() function should return an integer. C functions with return values should have a return statement, and it is also recommended to keep the return keyword in programs without return values; this is a good habit or a unified coding style.
Semicolon
In C language, each statement must end with a ;, which marks the end of the statement. If you forget or leave out the semicolon, the compiler will report an error.
Keywords
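Below are keywords in C language, categorized by their functions. The original C89 standard defines 32 keywords, and C99 added five more (inline, restrict, _Bool, _Complex, and _Imaginary). The 32 keywords grouped below include three of the C99 additions and leave out struct, union, and enum.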
Data Type Keywords
The data type keywords mainly include 12, which are:
- char: Declares character type variables or functions
- double: Declares double precision variables or functions
- float: Declares floating-point variables or functions
- int: Declares integer variables or functions
- long: Declares long integer variables or functions
- short: Declares short integer variables or functions
- signed: Declares signed type variables or functions
- _Bool: Declares boolean type
- _Complex: Declares complex numbers
- _Imaginary: Declares imaginary numbers
- unsigned: Declares unsigned type variables or functions
- void: Declares functions with no return value or no parameters, and declares untyped pointers
Control Statement Keywords
There are also 12 control statement keywords, which are:
Loop Statements
- for: for loop, most commonly used
- do: begins the body of a do...while loop, which runs before the condition is tested
- while: the loop condition of the loop statement
- break: exits the current loop
- continue: ends the current iteration and starts the next round of the loop
Conditional Statements
- if: the judgment condition of the conditional statement
- else: the negation branch of the conditional statement, used with if
- goto: unconditional jump statement
Switch Statements
- switch: used for switch statements
- case: a branch of the switch statement
- default: the branch taken when no case matches
Return Statements
- return: returns from a function (with or without a value)
Storage Type Keywords
- auto: declares automatic variables, generally not used
- extern: declares variables that are defined in other files (can also be seen as referenced variables)
- register: declares register variables
- static: declares static variables
Other Keywords
- const: declares read-only variables
- sizeof: calculates the size of a data type or object
- typedef: used to give data types aliases
- volatile: indicates that a variable may be changed implicitly during program execution
Conclusion
In this article, we first introduced the features of C language, why C language is so popular, and its importance. Then we started with a beginner C language program, discussing the basic components of C language, how C language runs on hardware, the compilation process, and execution process, and finally deepened the explanation of the components of the beginner example program.