Cultivating Embedded C/C++ Programming Skills

Source: CSDN

What makes a good programmer? Is it knowing many technical details? Or understanding low-level programming? Or being fast at coding? I think none of these.

For some technical details and low-level technologies, you can find them in help documents or by researching. As for speed, practice makes perfect.

I believe a good programmer should possess the following qualities:

1. A spirit of inquiry, eager to learn and ask questions, able to make connections.

2. A positive attitude and creative thinking.

3. The ability to communicate and collaborate effectively, showing team spirit.

4. Humility and caution, avoiding arrogance and impatience.

5. High-quality code. This includes: stability, readability, standardization, maintainability, and professionalism.

These are the qualities of a programmer. Here, I want to discuss “programming quality,” which corresponds to the fifth point above. I think that if I want to understand an author, I will read their novels; if I want to understand a painter, I will look at their paintings; if I want to understand a worker, I will examine their products.

Similarly, if I want to understand a programmer, what I most want to see is their code. The program code can reveal a programmer’s quality and cultivation. A program is like a work of art; the works of a programmer with quality and cultivation are bound to be a beautiful painting, a wonderful song, or an enjoyable novel.

I have seen many programs with no comments, poor indentation, and poorly named variables, etc. I refer to such programs as lacking cultivation. Are these programmers doing creative work? No, they are essentially destructive. Rather than programming, they are “encrypting” the source code. Such programmers should be dismissed whenever they are encountered because the value created by their programs is far less than the maintenance cost required.

Programmers should have the cultivation of programmers. Even if they are tired and pressed for time, they must take responsibility for their programs. I would prefer a programmer who is slow and technically average but has a good programming style over a fast and technically strong programmer who is “destructive”.

There is a saying, “The character is reflected in the writing.” I believe that a programmer’s quality can also be seen in their code. A program is a programmer’s work, and the quality of the work directly relates to the programmer’s reputation and quality. A “cultivated” programmer is bound to produce good programs and software.

There is a saying, “Unique craftsmanship,” which means that whatever you do should be done professionally and with care. If you want to be a “craftsman,” a person of high skill, then you should be able to see whether you possess the characteristics of a “craftsman” from a simple piece of work. I think being a programmer is not difficult, but being a “craftsman” programmer is not simple. Writing programs is easy, but writing quality programs is challenging.

I won’t discuss overly complex technologies here; I just want to talk about some easily overlooked aspects. Although these aspects may seem minor, if you ignore them, they can greatly affect the overall quality of your software and its implementation. As the saying goes, “A thousand-mile dike is destroyed by an ant hole.”

“True skill is shown in the details.” The real foundation of a program is precisely in these details.

This is the programming quality of a programmer. I have summarized thirty-two “qualities” for writing programs in C/C++ (mainly C language). Through these, you can write high-quality programs, and those who read your code will surely praise it, saying, “This person’s programming quality is impressive.”

1. Copyright and Version

A good programmer will label each function and file with copyright and version information.

For C/C++ files, the file header should have comments like this:

/************************************************************************
*
* File Name: network.c
*
* File Description: Network communication function set
*
* Author: Hao Chen, February 3, 2003
*
* Version: 1.0
*
* Modification Records:
*
************************************************************************/

And for functions, there should also be comments like this:

/*================================================================
*
* Function Name: XXX
*
* Parameters:
*
* type name [IN] : description
*
* Function Description:
*
* ..............
*
* Return Value: TRUE on success, FALSE on failure
*
* Exceptions:
*
* Author: ChenHao 2003/4/2
*
================================================================*/

This kind of description gives a general understanding of a function or a file, greatly benefiting code readability and maintainability. This is the beginning of producing good works.

2. Indentation, Spaces, Line Breaks, Blank Lines, Alignment

i) Indentation is something every programmer should do. Everyone who has learned programming should know this, but I have still seen programs without indentation or with messy indentation. If your company has programmers who write code without indentation, please do not hesitate to dismiss them and sue them for damaging the source code, and demand compensation for the mental distress caused to those who read their code.

Indentation is an unwritten rule; let me reiterate: one indentation is generally one TAB key or four spaces (TAB is preferred).

ii) Spaces. Can spaces cause any loss to the program? No. Used well, spaces make your program more visually appealing than a bunch of expressions crammed together. Take a look at the code below:

ha=(ha*128+*key++)%tabPtr->size;
ha = ( ha * 128 + *key++ ) % tabPtr->size;

Having spaces and not having spaces feels different, right? Generally, spaces should be added between operators in statements, and when calling functions, spaces should be added between parameters. For example, the following with spaces and without:

if ((hProc=OpenProcess(PROCESS_ALL_ACCESS,FALSE,pid))==NULL){
}
if ( ( hProc = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pid) ) == NULL ) {
}

iii) Line breaks. Don’t write all statements on one line; that’s bad. For example:

for(i=0;i<len;i++) if((a[i]<'0'||a[i]>'9')&& (a[i]<'a'||a[i]>'z')) break;

This kind of code, with neither spaces nor line breaks, what is it doing? Add spaces and line breaks.

for ( i = 0; i < len; i++ ) {
  if ( ( a[i] < '0' || a[i] > '9' ) &&
       ( a[i] < 'a' || a[i] > 'z' ) ) {
    break;
  }
}

Much better, right? Sometimes, when there are many function parameters, it’s best to break them into new lines:

CreateProcess(
  NULL,
  cmdbuf,
  NULL,
  NULL,
  bInhH,
  dwCrtFlags,
  envbuf,
  NULL,
  &siStartInfo,
  &prInfo
);

Conditional statements should also be broken into new lines when necessary:

if ( ( ch >= '0' && ch <= '9' ) ||
     ( ch >= 'a' && ch <= 'z' ) ||
     ( ch >= 'A' && ch <= 'Z' ) )

iv) Blank lines. Don’t forget to add blank lines; they can separate different program blocks. It’s best to add blank lines between program blocks. For example:

HANDLE hProcess;
PROCESS_T procInfo;

/* open the process handle */
if((hProcess = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pid)) == NULL)
{
  return LSE_MISC_SYS;
}

memset(&procInfo, 0, sizeof(procInfo));
procInfo.idProc = pid;
procInfo.hdProc = hProcess;
procInfo.misc |= MSCAVA_PROC;

return(0);

v) Alignment. Use TAB keys to align some variable declarations or comments; it will also make your program look better. For example:

typedef struct _pt_man_t_ {
  int     numProc;     /* Number of processes */
  int     maxProc;     /* Max Number of processes */
  int     numEvnt;     /* Number of events */
  int     maxEvnt;     /* Max Number of events */
  HANDLE* pHndEvnt;    /* Array of events */
  DWORD   timeout;     /* Time out interval */
  HANDLE  hPipe;       /* Named pipe */
  TCHAR   usr[MAXUSR]; /* User name of the process */
  int     numMsg;      /* Number of Messages */
  int     Msg[MAXMSG]; /* Space for inter-process communication */
} PT_MAN_T;

How does that feel? Looks good, right? This section mainly describes how to write code that is pleasing to the eye. Good-looking code makes reading it less tiring. Well-organized and neat code is usually more welcome and praised. With today’s large hard drives, there is no reason to cram your code together; if you do, your readers will complain that you are abusing them.

Alright, use “indentation, spaces, line breaks, blank lines, and alignment” to decorate your code, turning them from chaotic bandits into orderly regular troops.

3. Program Comments

Develop the habit of writing program comments; this is a must for every programmer. I have seen programs with thousands of lines but not a single comment. This is like driving on a road without signs. Before long, you won’t even know your own intentions, and you will spend several times longer understanding it. Such a waste of others’ and your own time is most shameful.

Yes, you might say you can write comments. Really? The way you write comments can also reflect a programmer’s skill. Generally, you need to write comments for at least these places: file comments, function comments, variable comments, algorithm comments, and functional block comments.

It mainly records what this piece of code does, what your intention is, and what this variable is used for, etc.

Don’t think comments are easy to write; some algorithms are hard to express or write down, and can only be understood intuitively. I admit that there are such situations, but you should still write them down; it’s a good exercise for your expression skills.

Expression ability is precisely what technical personnel who are too focused on technology lack. No matter how high your skills are, if your expression ability is poor, your skills cannot be fully utilized. After all, this is an era of teamwork.

Now, let me mention a few technical details about comments:

i) I don’t quite agree that line comments (“//”) are better than block comments (“/* */”). Line comments only became part of standard C with C99, and some older C compilers do not support them, so for the portability of your program, you should still use block comments as much as possible.

ii) You may be unhappy that block comments cannot be nested, so you can use preprocessor directives to achieve this. Code enclosed with “#if 0” and “#endif” will not be compiled and can be nested.
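Here is a minimal sketch of the “#if 0” technique. The function name ScaledValue and its example helper are made up for illustration. The outer disabled region contains both a block comment and a second “#if 0” pair, which a “/* */” comment could not wrap.

```c
#include <assert.h>

/* Hypothetical function, used only to illustrate the technique. */
int ScaledValue(int x)
{
    int result = x * 2;

#if 0   /* outer disabled region: nothing in here is compiled */
    result = result + 100;   /* a block comment inside is fine */
#if 0   /* nested disabled region */
    result = result - 1;
#endif
    result = result * 3;
#endif  /* end of the outer disabled region */

    return result;
}

/* Usage: the disabled statements have no effect on the result. */
void ScaledValueExample(void)
{
    assert(ScaledValue(5) == 10);
}
```

Changing the outer “#if 0” to “#if 1” switches the whole region back on without touching anything inside it.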

4. [in][out] Parameters of Functions

I often see programs like this:

FuncName(char* str)
{
  int len = strlen(str);
  .....
}

char*
GetUserName(struct user* pUser)
{
  return pUser->name;
}

No! Please don’t do that.

You should first check whether the incoming pointer is NULL. If the incoming pointer is NULL, then your large system could crash due to this small function. A better technique is to use assertions (assert), but I won’t go into those technical details here.

Of course, in C++, references are much better than pointers, but you still need to check each parameter.

When writing functions with parameters, the first task is to check the validity of all incoming parameters. The outgoing parameters should also be checked, and this action should be done externally, meaning that after calling a function, the returned values should be verified.

Of course, checks will take some time, but to avoid system-level errors like “illegal operation” or “Core Dump,” spending this little extra time is worth it.
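As a rough sketch of both approaches mentioned above, the code below checks an incoming pointer in one function and asserts on it in another. The layout of struct user and the names GetUserNameChecked and CheckedLength are assumptions made for this example; they are not taken from any real code base.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout; the article does not show the real structure. */
struct user {
    char name[32];
};

/* Returns the user's name, or NULL when pUser itself is NULL. */
const char *GetUserNameChecked(const struct user *pUser)
{
    if (pUser == NULL) {
        return NULL;      /* reject the bad argument instead of crashing */
    }
    return pUser->name;
}

/* Returns the length of str. A NULL str is treated as a programming
 * error and is caught by assert in debug builds. */
size_t CheckedLength(const char *str)
{
    assert(str != NULL);
    return strlen(str);
}
```

The explicit check suits errors that can happen at run time, since the caller can test the returned NULL. The assert suits mistakes that should never reach a release build, because it disappears when NDEBUG is defined.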

5. Judging the Return of System Calls

Continuing from the previous point, for some system calls, such as opening files, I often see many programmers who do not check the pointer returned by fopen and use it directly. Then they find that they cannot read or write the file’s contents. It’s better to check:

fp = fopen("log.txt", "a");
if ( fp == NULL ){
  printf("Error: open file error\n");
  return FALSE;
}

There are many other examples, such as the socket number returned by socket and the memory returned by malloc. Please check the returns of these system calls.

6. Handling Errors in if Statements

I see you saying, what’s there to say about this? Let’s first look at a piece of code.

if ( ch >= '0' && ch <= '9' ){
  /* Normal handling code */
}else{
  /* Output error information */
  printf("error ......\n");
  return ( FALSE );
}

This structure is not good, especially if the “normal handling code” is long. For such cases, it’s best not to use else. First, check for errors, like this:

if ( ch < '0' || ch > '9' ){
  /* Output error information */
  printf("error ......\n");
  return ( FALSE );
}
/* Normal handling code */
......

This structure is much clearer, highlighting the error conditions so that others using your function can see the illegal conditions at first glance and thus avoid them more instinctively.

7. #ifndef in Header Files

Never overlook the #ifndef in header files; this is a critical element. For example, if a C file includes two header files that both include the same third header file, that third header ends up being included twice in a single compilation, and then the problem arises: a lot of redefinition and declaration conflicts.

So, place the contents of the header file within #ifndef and #endif. Regardless of whether your header file will be included by multiple files, you should always add this. The general format is as follows:

#ifndef <identifier>
#define <identifier>

......

#endif

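<identifier> can theoretically be named freely, but each header file’s identifier should be unique. The naming convention is generally to use the header file name in uppercase, with underscores at both ends, replacing any periods in the file name with underscores. (Names that begin with an underscore followed by an uppercase letter are reserved for the compiler and the standard library, so for your own headers a form such as NETWORK_H_ is safer.) For a system header such as stdio.h, the convention gives: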

#ifndef _STDIO_H_
#define _STDIO_H_

......

#endif

(BTW: Preprocessing has many useful features. Do you know how to use preprocessing?)

8. Allocating Memory on the Heap

Many people are still unclear about the concepts of “stack” and “heap” in memory allocation. Even some professionals may not fully understand these two concepts. I don’t want to elaborate too much on these.

In simple terms, memory allocated on the stack is released automatically when the function that owns it returns, while memory allocated on the heap is never released automatically while the program runs; it stays allocated until you release it yourself. Stack memory holds automatic (local) variables, while heap memory is allocated dynamically at run time.

Memory allocated using the malloc library function comes from the heap. Memory allocated from the heap must be released manually. Use free to release it; otherwise, it leads to a “memory leak”.

Consequently, the amount of allocatable memory decreases with each leaked allocation until allocation fails or the system crashes. Let’s look at the differences between stack and heap memory.

Stack Memory Allocation —————

char *AllocStrFromStack()
{
  char pstr[100];
  return pstr;
}

Heap Memory Allocation —————

char *AllocStrFromHeap(int len)
{
  if ( len <= 0 ) return NULL;
  return ( char* ) malloc( len );
}

For the first function, the memory for pstr is released when the function returns. The returned char * is therefore a dangling pointer, and using it is undefined behavior.

For the second function, since memory is allocated from the heap, it is not released when the function returns, so the memory returned from the second function is valid. However, it must be freed; otherwise, it leads to a memory leak!

Allocating memory on the heap can easily cause memory leaks, which are the biggest “enemy” of C/C++. If your program needs to be stable, then avoid memory leaks. Therefore, I must remind you repeatedly to be cautious when using malloc (and likewise calloc and realloc).

I remember a UNIX service application made up of several hundred C files. It ran well during testing, but when in use, the system would crash every three months, leaving many people at a loss to find the problem.

They had to manually restart the system every two months. This problem arose because of memory leaks; in C/C++, such issues are common, so you must be careful. Rational’s detection tool, Purify, can help you check whether your program has memory leaks.

I guarantee that many programmers who have worked on numerous C/C++ projects will feel uneasy when using malloc or new. When you feel a slight tension and anxiety whenever you use malloc or new, you have developed this aspect of cultivation.

Here are some rules for malloc and free operations:

i) They should be used in pairs; for every malloc, there should be a free. (In C++, the corresponding operations are new and delete.)
ii) Try to use them at the same level; do not have malloc in one function and free in another. It’s best to use these two functions at the same call level.
iii) Memory allocated through malloc must be initialized.
iv) A pointer that has been freed must be set to NULL.
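The rules above can be sketched in one small function. CopyAndMeasure is a hypothetical helper invented for this example; the point is only that the allocation, the initialization, the free, and the reset to NULL all happen at the same call level.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies src into a scratch buffer and reports
 * the copied length. Returns 1 on success, 0 on failure. */
int CopyAndMeasure(const char *src, size_t *outLen)
{
    char *buf = NULL;
    size_t len;

    if (src == NULL || outLen == NULL) {
        return 0;
    }

    len = strlen(src);
    buf = (char *) malloc(len + 1);   /* allocate                */
    if (buf == NULL) {
        return 0;
    }
    memset(buf, 0, len + 1);          /* initialize              */

    memcpy(buf, src, len);
    *outLen = strlen(buf);

    free(buf);                        /* free at the same level  */
    buf = NULL;                       /* never leave it dangling */

    return 1;
}

/* Usage: the caller never sees the temporary buffer at all. */
void CopyAndMeasureExample(void)
{
    size_t n = 0;
    int ok = CopyAndMeasure("hello", &n);
    assert(ok == 1 && n == 5);
}
```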

Note: Modern operating systems (like UNIX and Win2k/NT) track the memory of each process, so if you forget to release memory, the operating system reclaims it when the process exits. That does not help while the program is still running, however: a long-running program keeps leaking until it terminates. Therefore, it’s best to do this work yourself. (Sometimes memory leaks occur unknowingly, and finding them in millions of lines of code is like searching for a needle in a haystack. Rational has a tool called Purify that can help you check for memory leaks in your program.)

9. Variable Initialization

Following the previous point, variables must be initialized before use. Unlike Java, C/C++ compilers will not initialize local variables for you; you must do it yourself. If you use an uninitialized variable, the result is undefined. Good programmers always initialize variables before using them. For example:

i) Perform memset to clear memory allocated by malloc. (You can use calloc to allocate a block of zeroed memory.)
ii) Initialize structures or arrays allocated on the stack. (It’s best to clear them as well.)

However, let’s go back to the point: initialization can also incur a certain overhead on system runtime, so don’t initialize every variable; that would be meaningless. Good programmers know which variables need to be initialized and which do not. For example, in the following case, initialization is unnecessary.

char *pstr; /* A string */
pstr = ( char* ) malloc( 50 );
if ( pstr == NULL ) exit(0);
strcpy( pstr, "Hello World" );

But in the following case, it’s best to initialize the memory. (Pointers are dangerous; they must be initialized.)

char **pstr; /* An array of strings */
pstr = ( char** ) malloc( 50 * sizeof(char*) );
if ( pstr == NULL ) exit(0);

/* Let the pointers in the array point to NULL */
memset( pstr, 0, 50*sizeof(char*));

For global variables and static variables, initialize them at the time of declaration. C does zero-initialize them by default, but because you don’t know where they will be used first, assigning them just before use is not practical, and an explicit initializer makes the intended starting value obvious. For example:

Links *plnk = NULL; /* Initialize the global variable plnk to NULL */

10. Usage of h and c Files

How should h files and c files be used? Generally, h files should contain declarations, while c files should contain definitions. Because c files are compiled into object and library files (in Windows, .obj/.lib; in UNIX, .o/.a), if others want to use your functions, they need to reference your h files. Therefore, h files generally contain variable declarations, macro definitions, enums, structures, and function interface declarations, acting like an interface specification file. The c file, on the other hand, contains implementation details.

The greatest benefit of separating h files and c files is that it separates declarations from implementations. This feature is widely recognized, yet I still see some people like to write functions in h files, which is a bad habit. (In C++, for template functions, in VC, you can only write both the implementation and declaration in one file since VC does not support the export keyword.)

Moreover, if you write function implementations in h files, you will also have to add dependency relationships for the header files in the makefile, which would make your makefile less standardized.

Lastly, one of the most important things to note is: do not place initialized global variables in h files!

For example, there is a structure for handling error messages:

char* errmsg[] = {
  /* 0 */ "No error",
  /* 1 */ "Open file error",
  /* 2 */ "Failed in sending/receiving a message",
  /* 3 */ "Bad arguments",
  /* 4 */ "Memory is not enough",
  /* 5 */ "Service is down; try later",
  /* 6 */ "Unknown information",
  /* 7 */ "A socket operation has failed",
  /* 8 */ "Permission denied",
  /* 9 */ "Bad configuration file format",
  /* 10 */ "Communication timeout",
  ......
  ......
};

Do not place this in a header file because if your header file is used by five function libraries (.lib or .a), it will be linked into those five .lib or .a files. If your program uses functions from these five libraries, and all these functions use this error message array, then there will be five copies of this information in your executable file. If your errmsg is large, and you use more function libraries, your executable will become large.

The correct approach is to write it in the c file and add an external declaration in each c file that needs to use errmsg, like extern char* errmsg[];. This way, every reference is resolved at link time to the single definition, ensuring that only one errmsg exists in the executable, and this also promotes encapsulation.
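The single-definition idea can be sketched in one file, with comments marking where each part would live. This is one common variant, not necessarily the layout used in the project described here: the extern declaration sits in a shared header instead of being repeated in every c file. The file names errmsg.h and errmsg.c, the errmsgCount variable, the ErrorText function, and the use of const are all assumptions of this sketch.

```c
#include <assert.h>
#include <string.h>

/* ---- would live in a header, e.g. errmsg.h (name assumed) ---- */
extern const char *errmsg[];       /* declaration only: no storage */
extern const int   errmsgCount;

/* ---- would live in exactly one c file, e.g. errmsg.c ---- */
const char *errmsg[] = {
    /* 0 */ "No error",
    /* 1 */ "Open file error",
    /* 2 */ "Failed in sending/receiving a message"
};
const int errmsgCount = (int) (sizeof(errmsg) / sizeof(errmsg[0]));

/* ---- any other c file sees only the extern declarations ---- */
const char *ErrorText(int code)
{
    if (code < 0 || code >= errmsgCount) {
        return "Unknown error code";
    }
    return errmsg[code];
}

/* Usage: look up a valid code and an out-of-range code. */
void ErrorTextExample(void)
{
    assert(strcmp(ErrorText(1), "Open file error") == 0);
    assert(strcmp(ErrorText(99), "Unknown error code") == 0);
}
```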

I once encountered the most ridiculous situation: there were 112 copies of errmsg in my target file, and the executable was about 8M. When I moved errmsg to the c file and added extern declarations to over a thousand c files, the sizes of all function library files dropped by about 20%, and my executable was only 5M. I suddenly saved 3M!

Note:

A friend told me that this is just a special case because if errmsg has multiple copies in the executable, it can speed up program execution. The reason is that multiple copies of errmsg reduce the system’s memory paging, improving efficiency. In the case we discussed, there is only one errmsg; when a function needs to use errmsg, if the memory is far apart, it will generate a page fault, which can lower efficiency.

This reasoning is not without merit, but generally speaking, for a larger system, errmsg is relatively large, so creating copies leads to a larger executable, which increases system load time and can cause the program to occupy more pages in memory.

For data like errmsg, it is generally not often used during system runtime, so the generated memory paging is not frequent. Weighing the pros and cons, having only one errmsg is more efficient. Even for frequently used data like logmsg, the operating system’s memory scheduling algorithms will keep such frequently used pages resident in memory, so paging issues will not arise.

11. Handling Error Messages

Do you know how to handle error messages? Oh, it’s not as simple as just outputting them. Look at the example below:

if ( p == NULL ){
  printf ( "ERR: The pointer is NULL\n" );
}

Say goodbye to student-level programming. This kind of programming is not conducive to maintenance and management; error messages or prompts should be handled uniformly, not hard-coded like above. The handling of error messages was partially addressed in point 10. If you want to manage error messages, you need to have the following handling:

/* Declare error codes */
#define ERR_NO_ERROR     0  /* No error */
#define ERR_OPEN_FILE    1  /* Open file error */
#define ERR_SEND_MESG    2  /* sending a message error */
#define ERR_BAD_ARGS     3  /* Bad arguments */
#define ERR_MEM_NONE     4  /* Memory is not enough */
#define ERR_SERV_DOWN    5  /* Service down, try later */
#define ERR_UNKNOW_INFO  6  /* Unknown information */
#define ERR_SOCKET_ERR   7  /* Socket operation failed */
#define ERR_PERMISSION   8  /* Permission denied */
#define ERR_BAD_FORMAT   9  /* Bad configuration file */
#define ERR_TIME_OUT     10 /* Communication timeout */

/* Declare error messages */
char* errmsg[] = {
/* 0 */ "No error",
/* 1 */ "Open file error",
/* 2 */ "Failed in sending/receiving a message",
/* 3 */ "Bad arguments",
/* 4 */ "Memory is not enough",
/* 5 */ "Service is down; try later",
/* 6 */ "Unknown information",
/* 7 */ "A socket operation has failed",
/* 8 */ "Permission denied",
/* 9 */ "Bad configuration file format",
/* 10 */ "Communication timeout",
};

/* Declare global variable for error code */
long errno = 0;

/* Function to print error messages */
void perror( char* info)
{
  if ( info ){
    printf("%s: %s\n", info, errmsg[errno] );
    return;
  }
  printf("Error: %s\n", errmsg[errno] );
}

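This is basically how ANSI C error handling works: the standard library already provides errno (in <errno.h>) and perror (in <stdio.h>), so in a real program give your own versions different names to avoid clashing with them. When you have an error in your program, you can handle it like this: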

bool CheckPermission( char* userName )
{
  /* strcmp returns 0 when the two strings are equal */
  if ( strcmp(userName, "root") != 0 ){
    errno = ERR_PERMISSION;
    return (FALSE);
  }

  ...
}

int main()
{
  ...
  if ( ! CheckPermission( username ) ){
    perror("main()");
  }
  ...
}

This provides a common yet customizable error message handling mechanism, allowing uniform output for the same type of error and maintaining a consistent user interface. This also makes maintenance much easier. The code is also more readable.

Of course, too much of a good thing can backfire, and there’s no need to put all outputs into errmsg. Extracting the more important error messages or prompts is key, but even so, this includes the majority of messages.

12. Common Functions and Computations in Loop Statements

Take a look at the following example:

for( i=0; i<1000; i++ ){
  GetLocalHostName( hostname );
  ...
}

GetLocalHostName means getting the current computer name; it will be called 1000 times in the loop body. How inefficient is that? This function should be moved outside the loop so that it is only called once, greatly improving efficiency.

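Although a compiler will optimize and move invariant computations out of a loop, do you trust that every compiler will recognize which elements are invariant? For a call to an external function, the compiler usually cannot prove that the result never changes, so it has to keep the call inside the loop. It’s best to do it manually.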
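A small sketch of the manual fix follows. GetLocalHostNameStub is a stand-in written for this example (it returns a fixed name and counts its own calls), and CountMatches is likewise hypothetical. The real GetLocalHostName is not shown in this article, so its signature here is an assumption.

```c
#include <assert.h>
#include <string.h>

static int g_lookupCalls = 0;   /* counts calls, for demonstration only */

/* Stand-in for GetLocalHostName: always reports "localhost". */
void GetLocalHostNameStub(char *name, size_t size)
{
    g_lookupCalls++;
    if (size == 0) {
        return;
    }
    strncpy(name, "localhost", size - 1);
    name[size - 1] = '\0';
}

/* Fetch the invariant value once, then reuse it on every iteration. */
int CountMatches(const char *names[], int count)
{
    char hostname[64];
    int i;
    int matches = 0;

    GetLocalHostNameStub(hostname, sizeof(hostname));   /* hoisted */

    for (i = 0; i < count; i++) {
        if (strcmp(names[i], hostname) == 0) {
            matches++;
        }
    }
    return matches;
}

/* Usage: three names are compared, but the lookup runs only once. */
void CountMatchesExample(void)
{
    const char *list[] = { "alpha", "localhost", "localhost" };

    g_lookupCalls = 0;
    assert(CountMatches(list, 3) == 2);
    assert(g_lookupCalls == 1);
}
```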

Similarly, for invariant elements in commonly used functions, such as:

GetLocalHostName(char* name)
{
  char funcName[] = "GetLocalHostName";

  sys_log( "%s begin......", funcName );
  ...
  sys_log( "%s end......", funcName );
}

If this is a frequently called function, the funcName array is set up on the stack and filled with the string on every call, which is unnecessary overhead. Declare this variable as static; it is then initialized only once, and later calls reuse it, which improves execution efficiency.
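A minimal sketch of the static version is shown below. SysLogStub stands in for sys_log, whose real signature is not given in this article, and DoWork is a made-up function name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char g_lastLog[64];   /* stand-in log sink, for demonstration */

/* Stand-in for sys_log: remembers only the most recent message. */
void SysLogStub(const char *func, const char *what)
{
    snprintf(g_lastLog, sizeof(g_lastLog), "%s %s", func, what);
}

void DoWork(void)
{
    /* static: set up once, not copied onto the stack on every call */
    static const char funcName[] = "DoWork";

    SysLogStub(funcName, "begin");
    /* ... real work would go here ... */
    SysLogStub(funcName, "end");
}

/* Usage: after a call, the last logged message is the "end" line. */
void DoWorkExample(void)
{
    DoWork();
    assert(strcmp(g_lastLog, "DoWork end") == 0);
}
```

With a C99 compiler, the predefined identifier __func__ gives the current function name without any declaration at all.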

13. Naming of Function and Variable Names

I see many programs with haphazard naming of variable and function names, especially variable names, such as a, b, c, aa, bb, cc, and flag1, flag2, cnt1, cnt2. This is also a behavior that lacks “cultivation”. Good comments do not make up for bad names; good variable or function names should follow these rules:

i) They should be intuitive and pronounceable, conveying meaning without needing “decoding.”

ii) The length of the name should be the shortest possible while maximizing its meaning.

iii) Avoid all uppercase or all lowercase; a mix of both is preferred, such as GetLocalHostName or UserAccount.

iv) Abbreviations can be used, but they should be understandable, like ErrorCode -> ErrCode, ServerListener -> ServLisner, UserAccount -> UsrAcct, etc.

v) To avoid conflicts with global function and variable names, prefixes can be added, generally using the module abbreviation as a prefix.

vi) Global variables should uniformly have a prefix or suffix, so that anyone seeing this variable knows it is global.

vii) Use Hungarian notation for function parameters and local variables, but still adhere to the principle of “conveying meaning.”

viii) Maintain consistency with the naming conventions of standard libraries (like STL) or development libraries (like MFC).

14. Passing Values and Pointers as Function Parameters

When passing parameters to functions, generally, passing a non-const pointer indicates that the function intends to modify the data pointed to by this pointer. If it is a value, then regardless of how it is modified inside the function, it does not affect the passed value, because passing by value is just a memory copy.

What? You say you understand this feature? Alright, let’s look at the following example:

void GetVersion(char* pStr)
{
  pStr = malloc(10);
  strcpy ( pStr, "2.0" );
}

main()
{
  char* ver = NULL;
  GetVersion ( ver );
  ...
  ...
  free ( ver );
}

I guarantee that similar problems are the most common mistakes made by novices in programming. The program attempts to allocate space for pointer ver through the function GetVersion, but this method is ineffective because this is passing by value, not passing by pointer. You might argue that I am clearly passing a pointer, right? Look closely; what you are actually passing is the value of the pointer. GetVersion receives a copy of ver, so the malloc result is stored only in the local pStr: ver in main is still NULL after the call, and the allocated block is leaked.
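One possible fix, sketched here under the assumption that the caller should own and free the buffer, is to pass the address of the pointer. The int return code and the GetVersionExample helper are additions made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pass the ADDRESS of the caller's pointer so it can be updated.
 * Returns 1 on success, 0 on failure. */
int GetVersion(char **ppStr)
{
    if (ppStr == NULL) {
        return 0;
    }
    *ppStr = (char *) malloc(10);
    if (*ppStr == NULL) {
        return 0;
    }
    strcpy(*ppStr, "2.0");
    return 1;
}

/* Usage: ver really points at the new block after the call. */
void GetVersionExample(void)
{
    char *ver = NULL;
    int ok = GetVersion(&ver);

    assert(ok == 1);
    assert(strcmp(ver, "2.0") == 0);
    free(ver);
    ver = NULL;
}
```

Returning the new pointer from the function would work just as well; which form to use is a matter of taste and of what else the function needs to report.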

15. Cultivation in Modifying Others’ Programs

When maintaining someone else’s program, please do not subjectively delete or modify existing code. I often see programmers directly modifying expressions or statements in others’ code.

When modifying someone else’s program, do not delete their code. If you think there’s something wrong with it, comment it out and then add your own handling procedure. After all, you cannot know 100% of others’ intentions, so to allow for recovery, do not rely solely on version control software like CVS or SourceSafe; you should also make your modifications clear in the source code. This is what a cultivated programmer should do during software maintenance.

Here’s a better way to modify:

/*
* ----- commented by haoel 2003/04/12 ------
*
* char* p = ( char* ) malloc( 10 );
* memset( p, 0, 10 );
*/

/* ------ Added by haoel 2003/04/12 ----- */
char* p = ( char* )calloc( 10, sizeof(char) );
/* ---------------------------------------- */
...

Of course, this method is used in software maintenance. This method allows future maintainers to easily understand the previous code changes and intentions, and it also shows respect for the original author.

Using the “comment – add” method to modify someone else’s program is better than directly deleting their code.

16. Forming Functions and Macros from Similar or Nearly Identical Code

Some say the best programmers are the ones who like to “be lazy,” and there’s some truth to that.

If you have code snippets that are very similar or identical, please place them in a function. If the code is not extensive and will be used frequently, and you want to avoid the overhead of function calls, then write it as a macro.

Never let the same or nearly identical code exist in multiple places; otherwise, if the functionality changes, you will have to modify several locations, which can create significant maintenance headaches. Therefore, achieving “one change affects all” is essential, so form functions or macros.
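As a sketch of both forms, the repeated character-range tests used in earlier examples can be folded into one macro and one function. The names IN_RANGE and IsAsciiAlnum are invented for this example, and the letter ranges assume an ASCII-compatible character set.

```c
#include <assert.h>

/* Macro form: every argument and the whole body are parenthesized so
 * the macro expands safely inside larger expressions. An argument may
 * be evaluated more than once, so avoid side effects such as i++. */
#define IN_RANGE(ch, lo, hi) ( ( (ch) >= (lo) ) && ( (ch) <= (hi) ) )

/* Function form: one place to change if the rule ever changes. */
int IsAsciiAlnum(int ch)
{
    return IN_RANGE(ch, '0', '9') ||
           IN_RANGE(ch, 'a', 'z') ||
           IN_RANGE(ch, 'A', 'Z');
}

/* Usage: callers no longer repeat the comparisons themselves. */
void IsAsciiAlnumExample(void)
{
    assert(IsAsciiAlnum('7'));
    assert(!IsAsciiAlnum('-'));
}
```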

17. Parentheses in Expressions

If you have a complex expression, please add parentheses, whether or not you are clear about the precedence of the operators. Otherwise, when others or you read the code next time, you might misinterpret it. To avoid such “misunderstandings” and to make your code clearer, it’s best to add parentheses.

For example, taking the address of a structure member:

GetUserAge( &( UserInfo->age ) );

The -> operator binds more tightly than &, so &UserInfo->age already means the same thing, but adding parentheses makes it immediately clear what your code means.

Another example is a lengthy conditional judgment:

if ( ( ch[0] >= '0' && ch[0] <= '9' ) &&
     ( ch[1] >= 'a' && ch[1] <= 'z' ) &&
     ( ch[2] >= 'A' && ch[2] <= 'Z' ) )

With parentheses, plus spaces and line breaks, isn’t your code much easier to read?

18. Using const in Function Parameters

For some pointer parameters in functions, if you only intend to read them in the function, please use const to modify them. This way, when others read your function interface, they will know your intention is that this parameter is [in]. If there is no const, the parameter indicates [in/out]. Be attentive to the use of const in function interfaces, as it aids in program maintenance and helps avoid errors.

A const modifier on a pointer, such as const char* p, is not an absolute guarantee in C, because a cast can remove it and the content pointed to can then still be modified. Even so, adding such a declaration benefits the readability and compile-time checking of your program: the compiler rejects a direct write through a pointer to const, and typically warns when such a pointer is passed to a function expecting a non-const pointer, which will alert programmers.

In C++, the definition of const is much stricter, so it’s essential to use const more frequently in C++, including const member functions and const variables, to make your code and program more complete and readable. (I won’t elaborate further on const in C++.)
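The [in] versus [in/out] distinction can be sketched with two small functions. The names count_upper and make_lower are hypothetical; only the placement of const matters here.

```c
#include <ctype.h>
#include <stddef.h>

/* [in] parameter: the function only reads the string. */
static size_t count_upper(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (isupper((unsigned char)*s)) {
            n++;
        }
    }
    return n;
}

/* [in/out] parameter: the function rewrites the caller's buffer. */
static void make_lower(char *s)
{
    for (; *s != '\0'; s++) {
        *s = (char)tolower((unsigned char)*s);
    }
}
```

A reader who sees only the two prototypes can already tell which call may change the buffer.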

19. Number of Function Parameters (Use Structures if Too Many)

It’s best not to have too many parameters in functions; generally, around six is sufficient. Numerous function parameters can make it dizzying for readers of the code and are not conducive to maintenance. If there are too many parameters, please use a structure to pass them. This helps encapsulate data and simplifies the program.

It also benefits function users because if you have many function parameters, such as twelve, the caller can easily get the order and number of parameters wrong. By using a structure to pass parameters, the order of parameters becomes irrelevant.

Moreover, functions are easy to modify; if you need to add parameters, you don’t need to change the function interface; just change the structure and the internal handling of the function, while this action remains transparent to the calling program.
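A minimal sketch of passing parameters through a structure follows. ConnOptions and describe_conn are hypothetical names chosen for illustration, not part of the original article.

```c
#include <stdio.h>

/* Hypothetical parameter block: callers fill in fields by name,
 * so the order of arguments can no longer be mixed up. */
typedef struct {
    const char *host;
    int         port;
    int         timeout_sec;
    int         retries;
} ConnOptions;

/* Formats a summary of the options into buf and returns the length
 * reported by snprintf. Adding a field to ConnOptions later leaves
 * this signature unchanged. */
static int describe_conn(const ConnOptions *opt, char *buf, size_t size)
{
    return snprintf(buf, size, "%s:%d timeout=%ds retries=%d",
                    opt->host, opt->port, opt->timeout_sec, opt->retries);
}
```

With a C99 compiler, a caller could initialize the block by field name, for example ConnOptions opt = { .host = "db01", .port = 5432, .timeout_sec = 30, .retries = 3 };.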

20. Do Not Omit the Return Type of Functions

I see many programs that do not pay attention to the return type when writing functions. If a function has no return value, please add the void modifier in front of it. Some programmers are lazy and omit the return type for functions returning int (in older C, an omitted return type defaults to int; C99 removed this rule), which is a bad habit. It’s better for the readability of the source code to include int.

Therefore, please do not omit the return value type of functions.

Additionally, for void functions, we often forget to use return. Some C/C++ compilers are sensitive to this and will raise warnings, so even for void functions, it’s best to include a return statement inside, as it aids in code compilation.

21. Usage of goto Statements

Years ago, Dijkstra, a master of software development, argued that the goto statement is harmful and recommended eliminating it, because goto statements are not conducive to the maintainability of program code.

I strongly recommend against using goto statements unless in the following situation:

#define FREE(p) do { \
  if (p) { free(p); (p) = NULL; } \
} while (0)

int main(void)
{
  char *fname=NULL, *lname=NULL, *mname=NULL;

  fname = ( char* ) calloc ( 20, sizeof(char) );
  if ( fname == NULL ){
    goto ErrHandle;
  }

  lname = ( char* ) calloc ( 20, sizeof(char) );
  if ( lname == NULL ){
    goto ErrHandle;
  }

  mname = ( char* ) calloc ( 20, sizeof(char) );
  if ( mname == NULL ){
    goto ErrHandle;
  }

  ......

  FREE(fname);
  FREE(lname);
  FREE(mname);
  return 0;

  ErrHandle:
  FREE(fname);
  FREE(lname);
  FREE(mname);
  ReportError(ERR_NO_MEMORY);
  return 1;
}

Only in such cases will the goto statement make your program more readable and easier to maintain. (You may also encounter such structures when using embedded C for database cursor operations or establishing database connections.)

22. Using Macros

Many programmers do not know what “macros” in C mean. Especially when macros have parameters, they often confuse macros with functions. I want to explain “macros” here. A macro is simply a definition that names a block of text. Before the program is compiled, the preprocessor performs a “replacement” action on the source program, replacing each use of the macro with the text it was defined as, similar to search-and-replace in a text file. This action is called “macro expansion”.

Using macros can be quite “dangerous” because you don’t know what the macro will look like after expansion. For example, the following macro:

#define MAX(a, b) a>b?a:b

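When we use the macro like this, there’s no problem: MAX( num1, num2 ); because after macro expansion, it becomes num1>num2?num1:num2;. A call such as MAX( 17+32, 25+21 ); expands to 17+32>25+21?17+32:25+21, which is hard to read but still happens to evaluate correctly, because + binds more tightly than > and ?:. But what if the macro sits inside a larger expression, as in 2 * MAX( 3, 4 );? That expands to 2 * 3>4?3:4, which is parsed as (2*3 > 4) ? 3 : 4 and yields 3 instead of 8.

Therefore, when writing macros, each parameter, and the macro body as a whole, must be enclosed in parentheses. The above example can be modified as follows to resolve the issue.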

#define MAX(a, b) ((a)>(b)?(a):(b))

Even then, this macro still has bugs. If I call it like MAX(i++, j++); each argument is evaluated once in the comparison and the larger one is evaluated again in the result, so the larger of i and j ends up incremented twice, which is not what we want.

Thus, be cautious when using macros, as the results of macro expansion can be hard to predict. Macros avoid function call overhead, but they are expanded at every point of use, which can bloat the target file (for example, a 50-line macro used in 1000 places will lead to significant bloat). A larger executable may in turn cause more paging at runtime, so the macro does not necessarily make the program faster.

Therefore, be careful when deciding whether to use a function or a macro.
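As a sketch of the function alternative, a small static inline function (inline is available from C99; the name max_int is hypothetical) evaluates each argument exactly once, so the side-effect problem described above does not arise.

```c
/* Hypothetical function replacement for a MAX macro. Arguments are
 * evaluated exactly once, so max_int(i++, j++) increments each
 * variable a single time. */
static inline int max_int(int a, int b)
{
    return (a > b) ? a : b;
}
```

The trade-off is that the function works for one type only, whereas the macro accepts any comparable type.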

23. Usage of static

The static keyword indicates “static”. Generally, it is frequently used for variables and functions. A static variable is essentially a global variable, but it has a limited scope. For example, a static variable inside a function:

char* getConsumerName()
{
  static int cnt = 0;

  ....
  cnt++;
  ....
}

The value of the cnt variable will increase with each call to the function; after the function exits, the value of cnt still exists, but it can only be accessed within the function. Moreover, the memory for cnt is allocated only once for the whole run of the program, and the initialization to 0 happens only once rather than on every call; each call continues from the value left by the previous call.

For frequently called functions, it’s also best to declare constants as static (refer to point 12).

However, the greatest utility of static lies in controlling access; in C, if a function or global variable is declared static, that function and global variable can only be accessed within that C file. If another C file tries to call that function or use that global variable (with the extern keyword), it will generate a linking error. This feature can be used to hide data and functions that other files should not touch.
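Both uses of static can be sketched in one hypothetical source file. All names below are invented for illustration: only next_ticket and tickets_issued would be visible to other files, while the counter and the helper stay private.

```c
/* ticket.c (hypothetical file) */

static int issued_total = 0;         /* file scope: invisible to other files */

static int wrap(int value)           /* private helper, internal linkage */
{
    return (value > 9999) ? 1 : value;
}

int next_ticket(void)                /* public interface */
{
    static int last_ticket = 0;      /* initialized once, kept between calls */

    last_ticket = wrap(last_ticket + 1);
    issued_total++;
    return last_ticket;
}

int tickets_issued(void)             /* public read-only view of the counter */
{
    return issued_total;
}
```

Another file that declares extern int issued_total; would fail at link time, which is exactly the access control described above.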

24. Code Size in Functions

A function should accomplish a specific task; generally, the code in a function should not exceed around 600 lines, the fewer the better. The best functions are generally under 100 lines, while about 300 lines is acceptable. Evidence suggests that if the code in a function exceeds 500 lines, it likely contains similar or identical code to another function, indicating that another function can be written.

Additionally, functions should accomplish a single specific task; avoid doing many different things within one function. The more single-purpose a function is, the better its readability, and it is more conducive to code maintenance and reuse. A more single-purpose function is more likely to serve more programs, meaning it has more commonality.

Although function calls incur some overhead, compared to the maintenance of software later on, incurring a bit of runtime overhead for better maintainability and code reusability is very worthwhile.

25. Usage of typedef

typedef is a keyword for giving a type an alias. Don’t underestimate it; it can greatly aid in maintaining your code. For example, older C (before C99 added _Bool and <stdbool.h>) does not have bool, so in one software project, some programmers used int while others used short, leading to confusion. It’s best to define it with a typedef, like:

typedef char bool;

In general, a C project should do some work in this area because you will be dealing with cross-platform issues, and different platforms will have different word lengths. Therefore, using preprocessor directives and typedef can help you maintain your code effectively, as shown below:

#ifdef SOLARIS2_5
typedef boolean_t BOOL_T;
#else
typedef int BOOL_T;
#endif

typedef short INT16_T;
typedef unsigned short UINT16_T;
typedef int INT32_T;
typedef unsigned int UINT32_T;

#ifdef WIN32
typedef _int64 INT64_T;
#else
typedef long long INT64_T;
#endif

typedef float FLOAT32_T;
typedef char* STRING_T;
typedef unsigned char BYTE_T;
typedef time_t TIME_T;
typedef INT32_T PID_T;

Using typedef for structures and function pointers is also recommended, as it enhances readability and maintainability of the program. For example:

typedef struct _hostinfo {
  HOSTID_T host;
  INT32_T hostId;
  STRING_T hostType;
  STRING_T hostModel;
  FLOAT32_T cpuFactor;
  INT32_T numCPUs;
  INT32_T nDisks;
  INT32_T memory;
  INT32_T swap;
} HostInfo;


typedef INT32_T (*RsrcReqHandler)(
  void *info,
  JobArray *jobs,
  AllocInfo *allocInfo,
  AllocList *allocList);

In C++, this is also very readable:

typedef CArray<HostInfo, HostInfo&> HostInfoArray;

Thus, when we define variables, it becomes very readable:

HostInfo* phinfo;
RsrcReqHandler pRsrcHand;

This clarity in variable type declarations is particularly evident in function parameters.

The key point is that by using typedef, nearly all type declarations in the program become concise and clear, making maintenance easier; this is the essence of typedef.
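On compilers that support C99, <stdint.h> and <stdbool.h> already supply fixed-width and boolean types, so project aliases can be built on them instead of on assumptions about the width of short or int. The sketch below reuses alias names from the text; CompareFn, compare_desc, and pick_first are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef int16_t  INT16_T;
typedef uint16_t UINT16_T;
typedef int32_t  INT32_T;
typedef uint32_t UINT32_T;
typedef int64_t  INT64_T;
typedef bool     BOOL_T;

/* Hypothetical callback type: the typedef keeps parameter lists readable. */
typedef INT32_T (*CompareFn)(INT32_T lhs, INT32_T rhs);

/* Orders larger values first: negative when lhs > rhs. */
static INT32_T compare_desc(INT32_T lhs, INT32_T rhs)
{
    return (lhs < rhs) - (lhs > rhs);
}

/* Returns whichever value the comparison function orders first. */
static INT32_T pick_first(INT32_T a, INT32_T b, CompareFn cmp)
{
    return (cmp(a, b) <= 0) ? a : b;
}
```

Note that CompareFn is already a pointer type, so a variable or parameter of that type is declared without an extra asterisk.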

26. Declare Macros for Constants

It’s best not to have hard-coded numeric values in your program, such as:

int user[120];

Declare a macro for this 120. All constants appearing in the program should be declared as macros. For instance, the timeout duration, maximum user count, and others should all be declared as macros. If a piece of code suddenly appears in the program:

for ( i=0; i<120; i++){
  ....
}

What does 120 mean? Why is it 120? Such “hard coding” makes programs hard to read and maintain. If you need to change this number, you will have to modify all instances of 120 in the program, which is a significant burden for the person modifying the program. Thus, declare constants as macros; this way, one change affects all, and it enhances program readability.

#define MAX_USER_CNT 120

for ( i=0; i<MAX_USER_CNT; i++){
  ....
}

This makes the intention of the code very clear.

Some programmers like to declare global variables for such constants, but in fact, global variables should be minimized because they hinder encapsulation and maintenance and can introduce certain overheads in execution space, potentially causing system paging and affecting program execution efficiency. Therefore, declaring them as macros avoids the overhead of global variables and offers speed advantages.

27. Do Not Add Semicolons to Macro Definitions

Many programmers do not know whether to add semicolons to macro definitions. Sometimes they think macros are statements and should have semicolons, which is incorrect. Once you understand how macros work, you will agree with me why semicolons should not be added to macro definitions. Take a look at an example:

#define MAXNUM 1024;

This macro has a semicolon. If we use it like this:

half = MAXNUM/2;

if ( num < MAXNUM )

And so on, it will cause compilation errors because when the macro expands, it becomes:

half = 1024;/2;

if ( num < 1024; )

Yes, the semicolon also gets included, leading to errors in the program. Trust me, sometimes a single semicolon can cause hundreds of errors in your program. So, do not add a final semicolon to macros, even in cases like:

#define LINE "================================="

#define PRINT_LINE printf(LINE)

#define PRINT_NLINE(n) while ( (n)-- > 0 ) { PRINT_LINE; }

Do not add a semicolon at the end of the last line. When we use it in the program, we can add a semicolon:

int main(void)
{
  const char *p = LINE;
  PRINT_LINE;
  return 0;
}

This is very much in line with conventions, and if you forget to add a semicolon, the compiler will give error messages that are easy to understand.
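For macros that contain several statements, a common C idiom (not mentioned in the original article) is to wrap the body in do { ... } while (0) and leave off the final semicolon. The caller’s semicolon then completes a single statement, so the macro is safe inside if/else. SWAP_INT and order_pair are hypothetical names used only for this sketch.

```c
/* No trailing semicolon in the definition: the caller supplies it. */
#define SWAP_INT(a, b) do { \
    int swap_tmp_ = (a); \
    (a) = (b); \
    (b) = swap_tmp_; \
} while (0)

/* Puts the two values in ascending order. */
static void order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* one statement, so the else still pairs up */
    else
        return;
}
```

If the definition ended in a semicolon, the line before else would expand to two statements and the code would no longer compile.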

28. Execution Order of || and && Statements

Be cautious with the “and” and “or” operators in conditional statements; their behavior may differ from what you imagine. Some behaviors in conditional statements need to be clarified:

express1 || express2

First, evaluate express1; if it is “true,” express2 will not be evaluated. express2 is only evaluated if express1 is “false.” Because if the first expression is true, the entire expression is true, so there’s no need to evaluate the second expression.

express1 && express2

First, evaluate express1; if it is “false,” express2 will not be evaluated. express2 is only evaluated if express1 is “true.” Because if the first expression is false, the entire expression is false, so there’s no need to evaluate the second expression.

Thus, not all statements in conditional statements will be executed. Keep this in mind, especially when you have many conditions in your statements; this point is especially important.

For example, in the following program:

if ( sum > 100 &&
( ( fp=fopen( filename,"a" ) ) != NULL ) ) {

  fprintf(fp, "Warning: it beyond one hundred\n");
  ......
}

fprintf( fp, " sum is %d \n", sum );
fclose( fp );

The original intention was that if sum > 100, a warning message would be written to the file. For convenience, the two conditional checks were combined. If sum <= 100, the file opening operation is skipped because && short-circuits, so fp is never set, and the final fprintf and fclose will yield unknown results.

Another example: if I want to check whether a string has content, I need to check that the string pointer is not NULL and that its content is not empty. I might write:

if ( ( p != NULL ) && ( strlen(p) != 0 ))

Thus, if p is NULL, then strlen(p) will not be executed, preventing “illegal operation” or “Core Dump” due to a null pointer.

Remember, not all statements in conditional statements will execute. This point is especially important when you have many conditions in your statements.
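One way to avoid the trap in the file-logging example is to keep the side effect (opening the file) out of the short-circuited condition. The function name log_sum and its return convention are assumptions made for this sketch.

```c
#include <stdio.h>

/* Appends sum to the log file named by filename, preceded by a warning
 * line when sum exceeds 100. Returns 0 on success, or -1 if the file
 * cannot be opened. */
static int log_sum(const char *filename, int sum)
{
    FILE *fp = fopen(filename, "a");   /* always attempted, never skipped */

    if (fp == NULL) {
        return -1;
    }
    if (sum > 100) {
        fprintf(fp, "Warning: sum is beyond one hundred\n");
    }
    fprintf(fp, "sum is %d\n", sum);
    fclose(fp);
    return 0;
}
```

Here fp is known to be valid before any fprintf or fclose runs, whatever the value of sum.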

29. Prefer for Loops Over while Loops

Generally speaking, for loops can accomplish the functions of while loops. I recommend using for statements as much as possible instead of while statements, especially when the loop body is large. The advantages of for loops become apparent.

In for loops, the initialization, ending conditions, and loop progression are all together; you can immediately see what kind of loop it is. Newly graduated programmers tend to traverse a linked list like this:

p = pHead;

while ( p ){
  ...
  ...
  p = p->next;
}

When the while statement block grows larger, it becomes difficult to read your program. Using for is much clearer:

for ( p=pHead; p; p=p->next ){
  ..
}

At a glance, you can see the starting condition, ending condition, and loop progression, and you can roughly understand what this loop is supposed to accomplish. Moreover, it makes maintenance easier, as you don’t have to scroll up and down in an editor like with while.
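A complete, compilable version of the list traversal might look like the following sketch. The Node type and sum_list are hypothetical; the loop header is the same shape as the one above.

```c
#include <stddef.h>

typedef struct Node {
    int          value;
    struct Node *next;
} Node;

/* Sums every value in the list; start, stop, and step sit on one line. */
static int sum_list(const Node *pHead)
{
    const Node *p;
    int total = 0;

    for (p = pHead; p != NULL; p = p->next) {
        total += p->value;
    }
    return total;
}
```

An empty list (pHead == NULL) needs no special case, because the ending condition is checked before the first iteration.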

30. Use sizeof Types Instead of Variables

Many programmers like to use sizeof variables, such as:

int score[100];
char filename[20];
struct UserInfo usr[100];

Applying sizeof to these three variable names returns the correct results, so many programmers get into the habit of using sizeof on variable names. While this habit is not inherently wrong, I still recommend using sizeof types.

I have seen the following program:

pScore = (int*) malloc( SUBJECT_CNT * sizeof(int) );
memset( pScore, 0, sizeof(pScore) );
...

At this point, sizeof(pScore) returns the size of the pointer (4 on a 32-bit platform, 8 on most 64-bit ones) instead of the size of the whole array, meaning memset clears only the first few bytes of that block of memory. For the readability and maintainability of the program, I strongly recommend using types instead of variables, like:

For score: sizeof(int) * 100 /* 100 ints */
For filename: sizeof(char) * 20 /* 20 chars */
For usr: sizeof(struct UserInfo) * 100 /* 100 UserInfo */

This code is much more readable; at a glance, you can understand its meaning.

Another point is that sizeof is generally used for memory allocation, and this feature is especially evident in multi-dimensional arrays. For example, when allocating memory for an array of strings:

/*
* Allocate memory for 20 strings,
* Each string being 100 long
*/

char* *p;

/*
* Incorrect allocation method
*/
p = (char**)calloc( 20*100, sizeof(char) );


/*
* Correct allocation method
*/
p = (char**) calloc ( 20, sizeof(char*) );
for ( i=0; i<20; i++){
  /*p = (char*) calloc ( 100, sizeof(char) );*/
  p[i] = (char*) calloc ( 100, sizeof(char) );
}

(Note: The commented-out statement above is the original, incorrect version; thanks to dasherest for pointing it out.)

For the sake of readability, the error checks on the return values of calloc are omitted here. Pay attention to the substantial difference between these two allocation methods.
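With the checks put back, the allocation of the string table could be sketched as follows. The function names alloc_strings and free_strings are hypothetical; the sizeof usage follows the type-based style recommended above.

```c
#include <stdlib.h>

/* Frees the first count rows of a table built by alloc_strings,
 * then the table itself. */
static void free_strings(char **table, size_t count)
{
    size_t i;

    if (table == NULL) {
        return;
    }
    for (i = 0; i < count; i++) {
        free(table[i]);
    }
    free(table);
}

/* Allocates count zero-filled strings of len bytes each.
 * Returns NULL if any allocation fails, releasing what was allocated. */
static char **alloc_strings(size_t count, size_t len)
{
    size_t i;
    char **table = (char **)calloc(count, sizeof(char *));

    if (table == NULL) {
        return NULL;
    }
    for (i = 0; i < count; i++) {
        table[i] = (char *)calloc(len, sizeof(char));
        if (table[i] == NULL) {
            free_strings(table, i);   /* rows 0..i-1 were allocated */
            return NULL;
        }
    }
    return table;
}
```

A call such as alloc_strings(20, 100) corresponds to the 20 strings of 100 characters in the example above.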

31. Do Not Ignore Warnings

Do not ignore warning messages during compilation. Although these warnings do not prevent the generation of target code, that does not mean your program is good. After all, a program that compiles successfully is not necessarily correct; compiling successfully is just the first step of a long journey, and there are still many challenges ahead.

From the start of compiling the program, not only should you correct every error, but you should also fix every warning. This is what a cultivated programmer should do.

Common warning messages include:

1) Declaring unused variables. (Although the compiler will not compile this variable, it’s best to comment it out or delete it from the source code.)

2) Using implicitly declared functions. (Perhaps this function is in another C file, and this warning will arise during compilation; you should declare the function before using it, with an extern declaration or by including the header that declares it.)

3) Not converting a pointer. (For example, the pointer returned by malloc is void*, and if you do not convert it to your actual type, some compilers will raise a warning, and a C++ compiler will reject the code. You should explicitly convert it to the appropriate type.)

4) Type downcasting. (For example: float f = 2.0; this statement will raise a warning indicating that you are trying to convert a double to float, effectively truncating a variable. Do you really want to do this? If so, add an f after 2.0, otherwise, 2.0 is a double, not a float.)

In any case, do not underestimate compiler warnings; it’s best not to ignore them. If you were able to write the whole program, a few small warnings should be no trouble to fix.

32. Writing Debug and Release Versions of Programs

During the development of a program, many debugging messages are added by programmers. I have seen many project teams ask everyone to delete debugging information from the program when development is finished. Why not establish two versions of the target code, one for debug and one for release, like VC++ does?

Those debugging messages are invaluable and are also precious for future maintenance, so how can they be deleted just like that?

Utilize preprocessing technology, as shown below, to declare debugging functions:

#ifdef DEBUG
void TRACE(const char* fmt, ...)
{ ......
}
#else
#define TRACE(...)  /* C99 variadic macro: expands to nothing */
#endif

Thus, let all programs use TRACE to output debugging information; you only need to add a parameter “-DDEBUG” during compilation, like:

cc -DDEBUG -o target target.c

Then, when the preprocessor finds that the DEBUG macro is defined, it will use the TRACE function. If you want to release it to users, you only need to remove the “-DDEBUG” parameter, and all instances of the TRACE macro will be replaced with nothing, effectively removing all debugging information in the source code.
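A filled-in version of this scheme might look like the sketch below. It assumes a C99 compiler (for the variadic macro) and sends debug output to stderr, which is a choice made here rather than something the original specifies; half_of is a hypothetical function used only to show a call site.

```c
#include <stdarg.h>
#include <stdio.h>

#ifdef DEBUG
/* Debug build: forward the arguments to vfprintf on stderr. */
static void TRACE(const char *fmt, ...)
{
    va_list args;

    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
}
#else
/* Release build: every TRACE(...) call expands to a harmless no-op. */
#define TRACE(...) ((void)0)
#endif

/* Hypothetical function under development that reports its progress. */
static int half_of(int value)
{
    TRACE("half_of(%d) called\n", value);
    return value / 2;
}
```

Because the release macro discards its arguments without evaluating them, TRACE calls should not contain expressions with side effects the program depends on.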

In summary, the thirty-two points mentioned are aimed at three main goals:

1. Readability of program code.

2. Maintainability of program code.

3. Stability and reliability of program code.

A cultivated programmer should learn to write such code! This is a small issue that anyone aspiring to be a programming expert must face. Programming experts not only need strong skills and a solid foundation but, most importantly, must have “cultivation”!

Good software products are not only about technology but also about the overall maintainability and reliability of the software.

Software maintenance requires a significant amount of work to be spent on code maintenance, and software upgrades also require a great deal of effort in code organization. Therefore, good, clear, and readable code will greatly reduce the maintenance and upgrade costs of software.

This article is sourced from the internet, for reference only and study. If there are copyright issues with the works involved, please contact me for deletion.
