A Deep Dive into C Language: From Basic Types to Memory Layout

The Data World of C Language: A Deep Dive from Basic Types to Memory Layout

I am Feri. In embedded development, the choice of data types directly affects memory usage and runtime efficiency. The power of C comes from its precise control over data. This article looks beneath the surface of data to the machine-level logic behind its types.

1. Data Types: The “Language Rules” for Computers to Understand the World

The C language maps real-world information into binary space through a strict type system. Mastering data types is akin to mastering the grammar rules for conversing with computers.

1.1 Basic Data Types: The Building Blocks of Data Structures

🔥 Integer Family: Precise Representation of Values

| Type | Keyword | Byte Size (typical 32-bit platform) | Value Range (Signed) | Typical Use Cases |
| --- | --- | --- | --- | --- |
| Short Integer | `short` | 2 | -32768 ~ 32767 | Counters, small data storage |
| Integer | `int` | 4 | -2147483648 ~ 2147483647 | General integer operations |
| Long Integer | `long` | 4 (8 on 64-bit Linux and macOS, still 4 on 64-bit Windows) | At least the range of `int`; equal to `long long` when it is 8 bytes | Large integer calculations (e.g., file sizes) |
| Long Long Integer | `long long` | 8 | -9223372036854775808 ~ 9223372036854775807 | Very large integer values |

โš ๏ธ Note: The C standard only specifies that<span>short <= int <= long <= long long</span>, the specific byte size is determined by the compiler (can be obtained using<span>sizeof()</span>).

🧮 Floating Point: Approximate Representation of Real Numbers

  • Single Precision `float` (4 bytes): 6-7 significant digits, suitable for coordinate calculations in graphics rendering
  • Double Precision `double` (8 bytes): 15-16 significant digits, preferred for scientific calculations (e.g., evaluating physical formulas)
  • Long Double Precision `long double` (8, 12, or 16 bytes depending on the platform): extended precision for numerically sensitive calculations; some compilers make it identical to `double`, so the extra precision is not guaranteed

float pi = 3.1415926f;  // A float keeps only about 6-7 significant digits; later digits are not stored exactly

double precisePi = 3.141592653589793;  // A double retains 15-16 significant digits
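To make the precision difference visible, the sketch below measures how far a value moves when it is stored in a `float`. The helper names `float_storage_error` and `print_pi_precision` are illustrative, not part of any library.

```c
#include <stdio.h>

/* Absolute error introduced by storing `value` in a float. */
double float_storage_error(double value) {
    float narrowed = (float)value;            /* rounds to the nearest float */
    double diff = (double)narrowed - value;
    return diff < 0 ? -diff : diff;
}

/* Print the same constant at both precisions for comparison. */
void print_pi_precision(void) {
    float  pi_f = 3.1415926f;
    double pi_d = 3.141592653589793;
    printf("float : %.15f\n", pi_f);
    printf("double: %.15f\n", pi_d);
}
```

Values such as 0.5 that are exact in binary survive the conversion unchanged, while most decimal fractions pick up a small error.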

📖 Character Type: Binary Encoding of Text

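  • `char` is essentially a 1-byte integer. It stores ASCII codes (0-127) or the individual bytes of extended encodings (e.g., the bytes of a GBK character, which can appear as negative values when `char` is signed)
  • Whether plain `char` is signed is implementation-defined (signed on most x86 compilers, unsigned by default on many ARM toolchains); use `signed char` (-128~127) or `unsigned char` (0-255, used for byte operations) when the sign matters

char ch = 'A';        // Stores ASCII code 65 (decimal)

unsigned char byte = 0xFF;  // Represents 255; unsigned char is the usual choice for raw bytes such as network data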
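Because the signedness of plain `char` varies between compilers, code that handles raw bytes usually converts through `unsigned char`. A small sketch, with the hypothetical helper names `plain_char_is_signed` and `byte_value`:

```c
#include <limits.h>

/* Return 1 if plain char is signed on this compiler, 0 otherwise. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}

/* Interpret one raw byte as a value in 0..255, whatever the signedness of char. */
int byte_value(char c) {
    return (unsigned char)c;
}
```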

1.2 Constructed Data Types: Assembly Solutions for Complex Data

🧱 Arrays: Continuous Memory Blocks of the Same Type

int scores[5] = {85, 90, 95};  // Defines an integer array of length 5; the two elements without initializers are set to 0

char name[] = "Feri";  // Length is computed automatically as 5 (including the terminating \0)

  • Elements occupy contiguous memory and are accessed via `[index]` (indices start from 0)
  • In most expressions the array name converts to the address of the first element, so it can be used through a pointer: `int *p = scores; p[0] == scores[0]`
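The relationship between arrays and pointers can be sketched as follows. The function names are illustrative; the point is that a function receives only a pointer, so the element count has to be passed separately.

```c
#include <stddef.h>

/* Sum `count` ints through a pointer, which is how a function receives an array. */
int sum_ints(const int *values, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];   /* same as *(values + i) */
    }
    return total;
}

/* Partial initialization: elements without an initializer are set to 0. */
int sum_partially_initialized(void) {
    int scores[5] = {85, 90, 95};
    size_t count = sizeof(scores) / sizeof(scores[0]);   /* 5 elements */
    return sum_ints(scores, count);   /* scores converts to &scores[0] */
}
```

The `sizeof(scores) / sizeof(scores[0])` idiom only works where the array itself is in scope; inside `sum_ints` the parameter is a plain pointer.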

🧩 Structures: Custom Data Containers

struct Student {  // Define a student structure
    char name[20];
    int age;
    float score;
};

struct Student tom = {"Tom", 18, 89.5f};  // Initialize structure variable

  • Combines data of different types; members are laid out in declaration order, and the compiler may insert padding bytes between them for alignment
  • Access members via `.` (`tom.score`); access through a pointer uses `->` (`struct Student *p = &tom; p->age`)
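To see how a structure is actually laid out in memory, `offsetof` and `sizeof` can be used. This is a sketch that reuses the `Student` structure from the article; the helper names are assumptions, and the offsets printed depend on the compiler and target.

```c
#include <stddef.h>
#include <stdio.h>

struct Student {
    char name[20];
    int age;
    float score;
};

/* Print where each member sits inside the structure. */
void print_student_layout(void) {
    printf("name  at offset %zu\n", offsetof(struct Student, name));
    printf("age   at offset %zu\n", offsetof(struct Student, age));
    printf("score at offset %zu\n", offsetof(struct Student, score));
    printf("total size      %zu\n", sizeof(struct Student));
}

/* Bytes of padding the compiler added (0 if the members pack tightly). */
size_t student_padding(void) {
    return sizeof(struct Student) - (sizeof(char[20]) + sizeof(int) + sizeof(float));
}
```

Reordering members from largest to smallest alignment is a common way to reduce padding in memory-constrained embedded code.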

🔄 Unions: The Magic of Memory Reuse

union Data {  // Union members share the same memory segment
    int num;
    char ch;
    float f;
};

union Data d;
d.num = 100;    // num is now the active member; reading ch or f would only reinterpret its bytes
d.ch = 'A';     // Writing ch overwrites the shared storage; num and f no longer hold meaningful values

  • The size is at least that of the largest member, possibly rounded up for alignment (saves space at the cost of type safety)
  • Suitable for scenarios where only one type of data is used at a time (e.g., variable fields in protocol parsing)
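Since a union does not remember which member was written last, it is commonly paired with a tag field. The sketch below shows that pattern for a made-up protocol field; `FieldKind`, `Field`, and `describe_field` are hypothetical names, not part of any real protocol.

```c
#include <stdio.h>

/* Hypothetical message field: the tag says which union member is valid. */
enum FieldKind { FIELD_INT, FIELD_CHAR, FIELD_FLOAT };

struct Field {
    enum FieldKind kind;   /* records which member was written last */
    union {
        int num;
        char ch;
        float f;
    } value;
};

/* Write a description of the active member into buf; returns the text length. */
int describe_field(const struct Field *field, char *buf, size_t size) {
    switch (field->kind) {
    case FIELD_INT:   return snprintf(buf, size, "int %d", field->value.num);
    case FIELD_CHAR:  return snprintf(buf, size, "char %c", field->value.ch);
    case FIELD_FLOAT: return snprintf(buf, size, "float %.1f", field->value.f);
    }
    return -1;   /* unknown tag */
}
```

The tag costs a few extra bytes but restores the type safety that a bare union gives up.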

📌 Enumerations: A Collection of Named Constants

enum Color { RED, GREEN, BLUE = 5 };  // Enumeration members start from 0 by default, BLUE is 5

enum Color favorite = GREEN;  // favorite's value is 1
  • Enhances code readability, avoids magic numbers (e.g., using RED instead of 0)
  • Enumerators are essentially integers and can take part in arithmetic, though doing so defeats their purpose and is best avoided
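A typical use of an enumeration is as the selector of a `switch`. The sketch below reuses the `Color` enumeration from above; `color_name` is an illustrative helper, not a standard function.

```c
enum Color { RED, GREEN, BLUE = 5 };

/* Map an enumerator to a readable name instead of a magic number. */
const char *color_name(enum Color c) {
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    return "unknown";   /* value that matches no enumerator */
}
```

Because enumerators are plain integers, nothing stops an out-of-range value from being passed in, which is why the fallback return is there.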

1.3 Pointer Types: Manipulators of Memory Addresses

int num = 10;
int *ptr = &num;  // Pointer ptr stores the address of num (e.g., 0x7ffd5f8e4a20)
*ptr = 20;       // Modify the value of num through dereferencing (num becomes 20)
  • Pointers are the soul of C: they enable dynamic memory allocation (`malloc`), array operations, passing data to functions by address, and other core functionality
  • Beware of wild and dangling pointers: dereferencing an uninitialized pointer (`int *p; *p = 10;`) or a pointer to memory that has already been freed is undefined behavior
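One defensive pattern for dynamic memory is to check every allocation and to reset the pointer after freeing it. This is a minimal sketch; `make_filled_array` and `release_array` are names invented for the example.

```c
#include <stdlib.h>

/* Allocate an int array of `count` elements, each set to `fill`.
   Returns NULL if allocation fails; the caller must release the result. */
int *make_filled_array(size_t count, int fill) {
    int *values = malloc(count * sizeof *values);
    if (values == NULL) {
        return NULL;   /* never dereference a failed allocation */
    }
    for (size_t i = 0; i < count; i++) {
        values[i] = fill;
    }
    return values;
}

/* Free the array and reset the caller's pointer so it cannot dangle. */
void release_array(int **values) {
    free(*values);
    *values = NULL;
}
```

Setting the pointer to `NULL` does not fix other copies of the same address, but it turns the most common use-after-free into an obvious null dereference.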

1.4 Void Type: The Universal Type’s Transfer Station

  • No Return Value Functions: `void printHello(void) { printf("Hello\n"); }` (no `return` statement needed)
  • Generic Pointers: `void *buffer = malloc(1024);` (can point to any object type; it must be converted to a typed pointer before it is dereferenced)
  • No Parameter Declaration: `int main(void)` (explicitly declares that the function takes no parameters, whereas empty parentheses in a declaration leave the parameters unspecified before C23)
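Generic pointers make it possible to write one routine for many types. The following sketch swaps two objects of any type byte by byte; `swap_bytes` and `read_int` are hypothetical helpers, and the 64-byte limit is an assumption made only to keep the example short.

```c
#include <string.h>

/* Swap two objects of `size` bytes each through generic pointers.
   Returns 0 on success, -1 if the objects exceed the 64-byte scratch buffer. */
int swap_bytes(void *a, void *b, size_t size) {
    unsigned char tmp[64];
    if (size > sizeof tmp) {
        return -1;   /* too large for this sketch */
    }
    memcpy(tmp, a, size);
    memcpy(a, b, size);
    memcpy(b, tmp, size);
    return 0;
}

/* A void * must become a typed pointer before it can be dereferenced. */
int read_int(const void *p) {
    const int *ip = p;   /* C converts from void * without a cast */
    return *ip;
}
```

The caller is responsible for passing the right size; the compiler cannot check it, which is the price of using `void *`.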

2. Variables: Dynamic Containers for Data

2.1 Variable Lifecycle and Scope

๐ŸŒ Global Variables: Always Present During Program Execution

int globalVar = 10;  // Defined outside all functions
void func() {
    globalVar = 20;  // Correct, global variable scope extends from definition to end of file
}
  • Disadvantage: Breaks encapsulation, prone to race conditions in multi-threaded environments
  • Recommendation: Use only for data that must be shared (e.g., configuration parameters)

๐Ÿข Local Variables: Temporary Storage in Function Stack Frames

void calculate() {
    int localVar = 0;  // Visible only within the calculate function
    for (int i = 0; i < 10; i++) {  // C99 allows declaring i in the for statement; its scope is limited to the loop
        localVar += i;
    }
}  // localVar is destroyed when the function returns (i already ended with the loop)
  • Advantage: Clear scope, avoids naming conflicts
  • Note: Uninitialized local variables hold indeterminate values (commonly known as “garbage values”), and reading them may lead to logic errors

2.2 Variable Naming: The First Line of Defense for Code Readability

  • Rules:
  1. Composed of letters, numbers, and underscores, the first character cannot be a number
  2. Case-sensitive (`Count` and `count` are different variables)
  3. Keywords cannot be used (e.g., `if`, `register`, `sizeof`)
  • Conventions:
    • Camel case: `studentAge`, `maxScore`
    • Hungarian-style prefixes (commonly used in embedded systems): `u8Age` (the `u8` prefix marks an unsigned 8-bit integer, typically a `uint8_t`)
    • Avoid single-letter variables (except for loop indices `i`, `j`, etc.)

3. Constants: The Immutable Foundation of Data

3.1 Literal Constants: Fixed Values Written Directly

Three Representations of Integers

  • Decimal: `123`
  • Octal (starts with 0): `0173` (equals decimal 123)
  • Hexadecimal (starts with 0x): `0x7B` (equals decimal 123)
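The three notations describe the same value, which a short check can confirm. The function names below are illustrative only.

```c
/* The same value written in the three integer notations. */
int literals_match(void) {
    int decimal = 123;
    int octal   = 0173;   /* 1*64 + 7*8 + 3 = 123 */
    int hex     = 0x7B;   /* 7*16 + 11      = 123 */
    return decimal == octal && octal == hex;
}

/* A leading zero changes the base: 010 is octal, i.e. decimal 8. */
int leading_zero_value(void) {
    return 010;
}
```

The second function shows a classic trap: padding a decimal number with a leading zero silently turns it into an octal literal.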

    ๐ŸŒ The Essential Difference Between Strings and Characters

    char ch = 'A';       // 1 byte, stores ASCII code 65
    char str[] = "A";    // 2 bytes, stores 65 and the terminating \0 (ASCII code 0)
    
    • The string automatically adds a<span>\0</span> at the end, which is fundamental for C language text processing
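The difference between storage size and visible length can be checked with `sizeof` and `strlen`. A brief sketch with made-up function names:

```c
#include <string.h>

/* Storage occupied by the one-character string "A", terminator included. */
size_t string_storage(void) {
    char str[] = "A";
    return sizeof(str);   /* 2: 'A' plus '\0' */
}

/* Visible length of the same string; strlen stops at the terminator. */
size_t string_length(void) {
    char str[] = "A";
    return strlen(str);   /* 1 */
}
```

Forgetting the extra byte for the terminator when sizing a buffer is one of the most frequent sources of overflow bugs in C.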

3.2 Symbolic Constants: Aliases that Give Meaning to Values

✨ `#define` Preprocessor Definitions

#define PI 3.141592  // Macro definition, text replacement during preprocessing
#define MAX_SIZE 100

  • Advantage: Facilitates unified modification (e.g., changing the precision of PI requires only one change)
  • Disadvantage: No type checking, and plain text substitution can produce surprising expansions; prefer `const` where a typed value is enough
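The kind of expansion error meant here is easiest to see with a function-like macro. The macro and function names below are invented for the illustration.

```c
/* Unparenthesized macro: the argument is pasted in as raw text. */
#define SQUARE_UNSAFE(x) x * x

/* Parenthesized macro: safer, though the argument is still evaluated twice. */
#define SQUARE_SAFE(x) ((x) * (x))

/* A typed alternative: the compiler checks the argument and evaluates it once. */
int square(int x) {
    return x * x;
}
```

`SQUARE_UNSAFE(1 + 2)` expands to `1 + 2 * 1 + 2`, which evaluates to 5, while `SQUARE_SAFE(1 + 2)` and `square(1 + 2)` both give 9.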

🛡️ `const` Read-only Variables

const int MAX_SCORE = 100;  // Define a read-only variable
MAX_SCORE = 200;  // Compilation error: assignment to a const object

  • Has type information, so the compiler can check its use (`const` has been part of C since C89)
  • Scope follows variable rules (local/global constants)

4. The Golden Rules for Choosing Data Types

  1. Principle of Sufficiency: Choose the smallest type that safely covers the value range (e.g., `short` instead of `int` for large arrays) to save limited RAM in embedded devices
  2. Precision Matching: Use `double` rather than `float` when accumulated rounding error matters (e.g., financial calculations), and `float` for interface coordinates
  3. Avoid Implicit Conversions: Mixing `int` and `float` in one expression may lose precision; make the conversion explicit with a cast: `(float)age`
  4. Pointers are Addresses: Ensure a pointer refers to valid memory before dereferencing it (e.g., check whether `malloc` returned `NULL`)
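Rule 3 is easiest to appreciate with an average. In the sketch below (function names are illustrative), the only difference between the two functions is where the conversion to `float` happens.

```c
/* Integer division happens first, so the fractional part is already lost. */
float average_truncated(int sum, int count) {
    return sum / count;
}

/* Casting one operand makes the division happen in floating point. */
float average_exact(int sum, int count) {
    return (float)sum / count;
}
```

With a sum of 7 and a count of 2, the first function returns 3.0 and the second 3.5.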

Data types are the bridge between C and the hardware. When you define `char ch = 'F'`, the computer writes your programming mark into memory as the number 70 (binary 01000110). In the next article, we will delve into operators and expressions, learning how to combine these data into complex logic. Follow me to unlock the low-level control that C offers!

// Philosophical Reflection on Data Types:
typedef struct {
    char name[20];
    int experience;
    void (*teach)(void);  // Function pointer, points to teaching method
} Programmer;

void teachC(void);  // Teaching routine, assumed to be defined elsewhere

Programmer feri = {"Feri", 12, teachC};  // Define a programmer using structure
    
