There are generally two ways to implement precise delays on a 51-series microcontroller. The first is a hardware delay, which uses a timer/counter; it frees the CPU for other work and gives precise timing. The second is a software delay, which is implemented mainly with loops.
1 Using Timer/Counter for Precise Delay
A microcontroller system typically uses an 11.0592 MHz, 12 MHz, or 6 MHz crystal oscillator. The first makes it easy to generate the standard baud rates, while the latter two give machine cycles of exactly 1 μs and 2 μs, which makes precise delays easier to calculate. The programs in this article assume a 12 MHz crystal. With a 16-bit timer, the maximum delay of a single timing run is 2^16 = 65,536 μs. If the timer operates in mode 2 (8-bit auto-reload), very short precise delays can be achieved; in the other modes, the time needed to reload the timer initial value must be taken into account (reloading takes 2 machine cycles).
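As a minimal sketch of the arithmetic, assuming a classic core with 12 oscillator clocks per machine cycle and a 16-bit up-counting timer, the machine cycle time and the timer initial value can be worked out on a PC as follows. The function names machine_cycle_ns and timer16_reload are hypothetical helpers for illustration, not part of any library.

```c
#include <stdint.h>

/* Duration of one machine cycle in nanoseconds, assuming 12 oscillator
   clocks per machine cycle (an assumption about the core).
   fosc_hz must be non-zero. */
uint32_t machine_cycle_ns(uint32_t fosc_hz)
{
    return (uint32_t)(12000000000ULL / fosc_hz);
}

/* Initial value for a 16-bit up-counting timer so that it overflows
   after `cycles` machine cycles (valid range 1..65536).
   65536 cycles corresponds to an initial value of 0. */
uint16_t timer16_reload(uint32_t cycles)
{
    return (uint16_t)(65536UL - cycles);
}
```

For example, at 12 MHz a 50 ms delay is 50,000 machine cycles, and timer16_reload(50000) gives 0x3CB0, that is, a high byte of 0x3C and a low byte of 0xB0.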
In practical applications, timers are usually run in interrupt mode; by counting interrupts in a loop, delays of several seconds or longer can be achieved. In terms of execution efficiency and stability, a timer/counter delay is the best solution. Note, however, that for an interrupt service routine written in C51 the compiler automatically adds PUSH ACC, PUSH PSW, POP PSW, and POP ACC. On a standard 8051 each PUSH or POP takes 2 machine cycles, so these four instructions consume 8 machine cycles; an increment statement in the routine costs 1 more machine cycle. This overhead should be taken into account when calculating the timer initial value, so that the counted interval is shortened by the same number of cycles and the error is kept to a minimum.
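The looping and compensation described above can be sketched as plain arithmetic. This is only an illustration under stated assumptions: a 12 MHz crystal (1 μs per machine cycle), a 16-bit timer that the interrupt routine reloads by hand, and an overhead figure that the caller supplies after inspecting the compiled code. The names ticks_for_delay and compensated_reload are hypothetical.

```c
#include <stdint.h>

/* Number of timer interrupts needed to cover total_us when each
   interrupt period lasts period_us (rounded up). period_us must be
   non-zero. */
uint32_t ticks_for_delay(uint32_t total_us, uint32_t period_us)
{
    return (total_us + period_us - 1) / period_us;
}

/* 16-bit initial value for an up-counting timer that should overflow
   every period_cycles machine cycles, when overhead_cycles elapse in
   the interrupt routine before the timer is reloaded. The counted
   interval is shortened by the overhead. Requires
   overhead_cycles < period_cycles <= 65536. */
uint16_t compensated_reload(uint32_t period_cycles, uint32_t overhead_cycles)
{
    return (uint16_t)(65536UL - (period_cycles - overhead_cycles));
}
```

With a 50 ms interrupt period, ticks_for_delay(1000000, 50000) shows that a 1 s delay needs 20 interrupts; the overhead value passed to compensated_reload is whatever the disassembly of the actual routine shows.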
2 Software Delay and Time Calculation
In many cases the timers/counters are already used for other purposes, so delays can only be implemented in software. Several software delay methods are described below.
2.1 Short Delay
Short delays can be implemented in C with functions built from _nop_() statements (the Keil C51 intrinsic declared in intrins.h). A series of delay functions such as Delay10us(), Delay25us(), and Delay40us() can be kept in a custom C file and called directly from the main program when needed. For example, a 10 μs delay function can be written as follows:
#include <intrins.h>   /* declares _nop_() */

void Delay10us(void) {
    _nop_();
    _nop_();
    _nop_();
    _nop_();
    _nop_();
    _nop_();
}
Delay10us() uses 6 _nop_() statements, each taking 1 μs to execute. When the main function calls Delay10us(), it first executes an LCALL instruction (2 μs), then the 6 _nop_() statements (6 μs), and finally a RET instruction (2 μs), so the function takes 10 μs in total. This function can serve as a basic delay function that other functions call, i.e., nested calls, to achieve longer delays. Note, however, that if Delay40us() simply calls Delay10us() four times, the resulting delay is 42 μs rather than 40 μs. This is because entering Delay40us() costs an LCALL instruction (2 μs) before the first Delay10us() starts, and when the last Delay10us() completes, it returns directly to the main program.
Similarly, if there are two layers of nested calls, such as calling Delay40us() twice in Delay80us(), it will also first execute a LCALL instruction (2 μs), then execute the Delay40us() function twice (84 μs), so the actual delay time is 86 μs. In short, only the innermost function executes the RET instruction. This instruction directly returns to the upper function or the main function. For example, if Delay80us() directly calls Delay10us() eight times, the delay time will be 82 μs. By modifying the basic delay function and appropriate combination calls, different delay times can be achieved.
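The accounting in the two paragraphs above can be checked with a small model. This is a sketch that simply encodes the text's rules at 12 MHz (2 μs per LCALL, 1 μs per NOP, 2 μs per RET, and no extra RET for a wrapper); the names leaf_delay_us and wrapper_delay_us are hypothetical, and the model does not replace measuring the compiled code.

```c
#include <stdint.h>

#define CALL_US 2u  /* LCALL at 12 MHz */
#define RET_US  2u  /* RET at 12 MHz */

/* Time of a basic delay function: call, n_nops one-cycle NOPs, return. */
uint32_t leaf_delay_us(uint32_t n_nops)
{
    return CALL_US + n_nops + RET_US;
}

/* Time of a wrapper that calls an inner delay `calls` times, following
   the text's accounting: one extra 2 us LCALL for entering the wrapper,
   and no extra RET because the last inner function returns directly
   to the wrapper's caller. */
uint32_t wrapper_delay_us(uint32_t inner_us, uint32_t calls)
{
    return CALL_US + calls * inner_us;
}
```

Here leaf_delay_us(6) reproduces the 10 μs of Delay10us(), wrapper_delay_us(10, 4) the 42 μs of Delay40us(), wrapper_delay_us(42, 2) the 86 μs of the two-level Delay80us(), and wrapper_delay_us(10, 8) the 82 μs of the direct version.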
2.2 Nesting Assembly Code in C51 for Delay
In C51, assembly language statements can be embedded in C code with the preprocessor directives #pragma asm and #pragma endasm. The user's assembly code starts immediately after #pragma asm and ends before #pragma endasm.
For example:
#pragma asm
…
Assembly language code
…
#pragma endasm
The delay function can take an entry parameter of type unsigned char, int, or long. Under the C51 parameter and return value passing rules, the first parameter and the function return value are placed in R7 for char, in R6/R7 for int (high byte in R6), and in R4 to R7 for long. When using embedded assembly, the following points should be noted:
◆ #pragma asm and #pragma endasm cannot be nested;
◆ At the beginning of the program, the preprocessor directive #pragma asm should be added, and only comments or other preprocessor directives can precede it;
◆ When using asm statements, the compilation system does not output the target module but only the assembly source file;
◆ asm can only be in lowercase letters; if asm is written in uppercase, the compilation system treats it as a normal variable;
◆ #pragma asm, #pragma endasm, and asm can only be used within functions.
Combining assembly language with C51 to fully leverage their respective advantages is undoubtedly the best choice for microcontroller developers.
2.3 Using an Oscilloscope to Determine Delay Time
An oscilloscope can be used to measure the execution time of a delay routine. Write a function that implements the delay, drive an I/O line such as P1.0 high at the start of the delay and low at the end, run it in a loop, and measure the high-level time on the P1.0 pin with the oscilloscope; this is the execution time of the delay. For example:
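#include <reg51.h>   /* SFR definitions, including P1 */

sbit T_point = P1^0;   /* test point observed on the oscilloscope */

void Dly1ms(void) {
    unsigned int i, j;
    while (1) {
        T_point = 1;                       /* high phase: two passes of the inner loop */
        for (i = 0; i < 2; i++) {
            for (j = 0; j < 124; j++) {;}
        }
        T_point = 0;                       /* low phase: one pass of the inner loop */
        for (i = 0; i < 1; i++) {
            for (j = 0; j < 124; j++) {;}
        }
    }
}

void main(void) {
    Dly1ms();   /* never returns */
}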
With P1.0 connected to the oscilloscope and the above program running, P1.0 outputs a rectangular wave with a period of 3 ms: 2 ms high and 1 ms low. One pass of the loop “for(j=0; j<124; j++) {;}” therefore takes 1 ms. Changing the loop count gives different delay times. Statements other than for loops can of course be used for the delay; the point here is only the method of measuring it.
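Once one loop count has been measured, a first guess for another delay can be scaled from it. The helper below is a rough sketch under the assumption that the delay grows linearly with the loop count, which ignores fixed loop setup time; scaled_loop_count is a hypothetical name, and the result should be confirmed again on the oscilloscope.

```c
#include <stdint.h>

/* Estimate the loop count for target_us, given that ref_count
   iterations were measured to take ref_us. Assumes linear scaling and
   rounds to the nearest integer. ref_us must be non-zero. */
uint32_t scaled_loop_count(uint32_t ref_count, uint32_t ref_us,
                           uint32_t target_us)
{
    return (uint32_t)(((uint64_t)ref_count * target_us + ref_us / 2) / ref_us);
}
```

Using the measurement above (124 iterations for 1 ms), scaled_loop_count(124, 1000, 500) suggests 62 iterations as a starting point for a 0.5 ms delay.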
2.4 Using Disassembly Tools to Calculate Delay Time
The disassembly window in Keil C51 can also be used to calculate delay time; it can display the target program either as mixed source and assembly code or as assembly code only. To illustrate the method, we again use a for loop, “for (i=0; i<”, whose bound is DlyT (see the SUBB instruction). Its disassembly, with the machine cycles (T) of each instruction, is:
C:0x000F  E4    CLR   A          //1T
C:0x0010  FE    MOV   R6, A      //1T
C:0x0011  EE    MOV   A, R6      //1T
C:0x0012  C3    CLR   C          //1T
C:0x0013  9F    SUBB  A, DlyT    //1T
C:0x0014  5003  JNC   C:0019     //2T
C:0x0016  0E    INC   R6         //1T
C:0x0017  80F8  SJMP  C:0011     //2T
From 0x000F to 0x0017 there are 8 instructions in total, but not every instruction is executed DlyT times. The core loop runs from 0x0011 to 0x0017: 6 instructions taking 8 machine cycles. Before the first pass, “CLR A” and “MOV R6, A” execute once (2 machine cycles); each complete pass takes 8 machine cycles; and the final pass, in which JNC exits the loop, takes 5 machine cycles. The loop statement therefore consumes (2 + DlyT × 8 + 5) machine cycles. At 12 MHz this is a fixed 7 μs plus 8 μs for each iteration.
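The cycle count for this for loop can be written as a small helper, which makes it easy to tabulate delays for different values of DlyT. This is a sketch that only restates the formula above; it assumes an 8-bit loop counter (the listing keeps it in R6) and a core with 12 oscillator clocks per machine cycle, and the names for_loop_cycles and cycles_to_us are hypothetical.

```c
#include <stdint.h>

/* Machine cycles of the for loop analysed above: 2 cycles of setup,
   8 cycles per completed iteration, 5 cycles for the final test that
   leaves the loop. */
uint32_t for_loop_cycles(uint8_t dlyt)
{
    return 2u + (uint32_t)dlyt * 8u + 5u;
}

/* Convert machine cycles to microseconds for a 12-clock core.
   The result is exact only when fosc_mhz divides 12 * cycles evenly;
   fosc_mhz must be non-zero. */
uint32_t cycles_to_us(uint32_t cycles, uint32_t fosc_mhz)
{
    return cycles * 12u / fosc_mhz;
}
```

For DlyT = 10 the loop takes for_loop_cycles(10) = 87 machine cycles, which cycles_to_us converts to 87 μs at 12 MHz and 174 μs at 6 MHz.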
When a while (DlyT--) loop is used, the value of DlyT is held in R7. The corresponding assembly code is as follows:
C:0x000F  AE07  MOV   R6, R7     //2T
C:0x0011  1F    DEC   R7         //1T
C:0x0012  EE    MOV   A, R6      //1T
C:0x0013  70FA  JNZ   C:000F     //2T
The opcode AE07 is the two-byte MOV Rn, direct form, which takes 2 machine cycles on a standard 8051, so one pass takes 6 machine cycles. The loop statement executes in (DlyT + 1) × 6 machine cycles, meaning that the delay precision (step size) of this loop structure is 6 μs at 12 MHz.
Experiments show that if while (DlyT--) is changed to while (--DlyT), the disassembly becomes:
C:0x0014  DFFE  DJNZ  R7, C:0014 //2T
The loop is now a single instruction taking 2 machine cycles, so the precision is 2 μs and the loop takes DlyT × 2 machine cycles. Note, however, that the initial value of DlyT must not be 0: DJNZ decrements before testing, so 0 wraps around to 255 and the loop runs 256 times.
Note: When calculating time, the function call and return each consume 2 machine cycles.
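Putting the note together with the DJNZ loop gives a simple way to pick a loop count for a target delay. The sketch below assumes that the delay function consists only of the single DJNZ loop and counts 2 cycles for the call, 2 per iteration, and 2 for the return; it ignores the instruction the caller uses to load the parameter. The names djnz_delay_cycles and djnz_count_for are hypothetical.

```c
#include <stdint.h>

/* Total machine cycles for a delay function whose body is the single
   DJNZ loop: 2 for the call, 2 per iteration, 2 for the return.
   count must be 1..255, since 0 is not a valid initial value here. */
uint32_t djnz_delay_cycles(uint8_t count)
{
    return 2u + 2u * (uint32_t)count + 2u;
}

/* Smallest count whose total delay is at least target_cycles, or 0 if
   the target exceeds what one 8-bit loop can provide. */
uint8_t djnz_count_for(uint32_t target_cycles)
{
    uint32_t count;
    if (target_cycles <= 6u) {
        return 1;  /* the shortest possible delay is 6 cycles */
    }
    count = (target_cycles - 4u + 1u) / 2u;  /* round up */
    return (count <= 255u) ? (uint8_t)count : 0;
}
```

At 12 MHz, djnz_count_for(100) returns 48, and djnz_delay_cycles(48) confirms a total of 100 machine cycles, i.e. 100 μs under these assumptions.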