
Course OK03 builds on Course OK02 and teaches you how to use functions in assembly to make code reusable and more readable. We assume you already have the operating system from Course 2: OK02 and will base our work on it.
1. Reusable Code
So far, all the code we have written has been typed in the order we want things to happen. This approach is fine for very small programs, but if we wrote a complete system this way, the code would be very hard to read. Instead, we should use functions.
A function is a reusable piece of code that can be used to calculate a certain kind of answer or perform a certain action. You may also hear them called procedures, routines, or subroutines. Although these terms all differ, people rarely use the correct one.
You should have encountered the concept of functions in mathematics. For example, applying the cosine function to a given number gives another number between -1 and 1, which is the cosine of the angle. We generally write cos(x) to represent the cosine function applied to a value x.
In code, a function can have multiple inputs (or none), can produce multiple outputs (or none), and may cause side effects. For example, a function could create a file on a filesystem, with the first input being its name and the second input being the length of the file.
Functions as Black Boxes
Functions can be thought of as “black boxes”. We give them input, and they give us output, and we do not need to know how they work.
In high-level code like C or C++, functions are part of the language. In assembly code, functions are just a convention we invent.
Ideally, we want to set some input values in our registers, branch to a certain address, and expect that at some point the code will branch back to us with output values set in the registers. This is what we hope a function in assembly code will be. The difficulty lies in how we use the registers. If each programmer simply picked whatever scheme occurred to them, everyone would do it differently, and it would be hard to understand code written by others. Compilers would also struggle to work with assembly code, because they would not know how to call its functions. To avoid this confusion, a standard called the Application Binary Interface (ABI) was designed for each assembly language, specifying how functions should operate. If everyone writes functions the same way, then everyone can use functions written by others. Here, I will teach you this standard, and from now on, all the functions I write will follow it.
The standard specifies that registers r0, r1, r2, and r3 will be used in order for function inputs. If a function has no inputs, it does not care what the values are. If it only needs one input, it should always be in register r0. If it needs two inputs, the first input is in register r0 and the second input is in register r1, and so on. The output value is always in register r0. If the function has no output, then the value in r0 does not matter.
Additionally, the standard requires that after a function runs, the values of registers r4 to r12 must be the same as they were when the function started. This means that when you call a function, you can be sure that the values in registers r4 to r12 have not changed, but you cannot be sure that the values in registers r0 to r3 have not changed.
When a function completes, it must branch back to the code that called it. This means it must know the address of the code that started it. For this purpose, a special register called lr (link register) is used, which always holds the address of the instruction after the one that called the function.
Table 1.1 ARM ABI Register Usage
Register | Summary | Preserved | Rules |
---|---|---|---|
r0 | Parameters and results | No | r0 and r1 are used to pass the first two parameters to the function and to return the result of the function. If the function does not use them for a return value, they can hold any value after the function runs. |
r1 | Parameters and results | No | |
r2 | Parameters | No | r2 and r3 are used to pass the last two parameters to the function. After the function runs, they can hold any value. |
r3 | Parameters | No | |
r4 | General-purpose register | Yes | r4 to r12 are used to hold values during the execution of the function, and their values must be the same after the function call as before it. |
r5 | General-purpose register | Yes | |
r6 | General-purpose register | Yes | |
r7 | General-purpose register | Yes | |
r8 | General-purpose register | Yes | |
r9 | General-purpose register | Yes | |
r10 | General-purpose register | Yes | |
r11 | General-purpose register | Yes | |
r12 | General-purpose register | Yes | |
lr | Return address | No | lr holds the address to branch back to when the function completes, but it does not have to hold the same address after the function completes. |
sp | Stack pointer | Yes | sp is the stack pointer, described in detail below. Its value must be the same after the function completes. |
Typically, functions need to use many registers, not just r0 to r3. However, since the values in r4 to r12 must be the same after the function completes, they need to be saved somewhere. We will save them in a place called the stack.
Stack Diagram
A stack is a very visual way of storing values that we use in computing. It is like a stack of plates: you can only remove plates from the top, and you can only add plates to the top.
Using a stack to save register values during function execution is a very good idea. For example, if I have a function that needs to use registers r4 and r5, it can store the values of these registers on a stack and retrieve them again at the end. More cleverly, if my function needs to call another function that also saves some registers, that function will place them on top of the stack while it runs and take them off again when it ends. This does not affect the values I saved from r4 and r5, because its values are added at the top of the stack and removed from the top as well.
The values a particular method places on the stack are called the “stack frame” of that method. Not every method uses a stack frame; some do not need to store values.
Because stacks are so useful, they are implemented directly in the ARMv6 instruction set. A special register called sp (stack pointer) holds the address of the stack. When a value is added to the stack, the sp register is updated so that it always holds the address of the value on top of the stack. push {r4,r5} places the values in r4 and r5 on top of the stack, while pop {r4,r5} retrieves them (in the correct order).
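To make the push and pop behaviour concrete, here is a minimal Python model of a stack that grows downwards in memory. It is only a sketch, not ARM code: the `Stack` class, the dictionary standing in for memory, and the starting address 0x8000 are assumptions made for illustration.

```python
class Stack:
    """Toy model of a descending stack holding 4-byte words."""

    def __init__(self, top=0x8000):
        self.sp = top        # stack pointer: address of the value on top
        self.memory = {}     # address -> 32-bit word

    def push(self, *values):
        # Like push {r4,r5}: the first listed register ends up at the
        # lowest address, so store the values in reverse order.
        for value in reversed(values):
            self.sp -= 4
            self.memory[self.sp] = value & 0xFFFFFFFF

    def pop(self, count):
        # Like pop {r4,r5}: read upwards from sp, raising sp as we go.
        values = []
        for _ in range(count):
            values.append(self.memory.pop(self.sp))
            self.sp += 4
        return tuple(values)


stack = Stack()
stack.push(4, 5)          # outer function saves r4=4, r5=5
stack.push(40, 50)        # nested function saves its own r4, r5
inner = stack.pop(2)      # nested function restores its values first
outer = stack.pop(2)      # the outer values come back untouched
```

After the last line, `inner` holds the nested function's values, `outer` holds the original ones, and `sp` is back where it started, which is why nested calls do not disturb each other's saved registers.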
2. Our First Function
Now that we have a concept of how functions work, let us try to write one. Since this is our first basic example, we will write a function with no inputs that outputs the address of the GPIO controller. In the previous course, we simply wrote this value in directly, but a function is better because we will often need the address in a real operating system, and we cannot always remember it.
Copy the following code into a new file named gpio.s in the source directory, alongside main.s. We will put all the functions related to the GPIO controller in one file, making them easier to find.
.globl GetGpioAddress
GetGpioAddress:
ldr r0,=0x20200000
mov pc,lr
.globl lbl makes the label lbl accessible from other files.
mov reg1,reg2 copies the value in reg2 to reg1.
This is a very simple complete function. The command .globl GetGpioAddress tells the assembler to make the label GetGpioAddress globally accessible to all files. This means that in our main.s file, we can branch to the label GetGpioAddress even though that label is not defined in that file.
You should recognize the command ldr r0,=0x20200000, which loads the address of the GPIO controller into r0. Since this is a function, we must make sure the output goes in register r0; we are no longer free to use whichever register we like, as we were before.
mov pc,lr copies the value in register lr to pc. As mentioned earlier, register lr always holds the address of the code we want to return to when the method completes. pc is a special register that always contains the address of the next instruction to be run. A normal branch command just changes the value of this register. By copying the value in lr to pc, we make the next instruction to run the one we were told to return to.
Of course, there is a problem: how do we run this code? We will need a special type of branch instruction called bl. It branches to a label like a normal branch, but first updates lr to contain the address of the line after the branch. This means that when the function completes, it returns to the line after the bl instruction. As a result, a function runs just like any other command: it simply runs, does whatever needs to be done, and then execution carries on to the next line. This is the most useful way to understand functions. When we use them, we treat them as “black boxes”; we do not need to know how they operate, only what inputs they require and what outputs they provide.
So far, we have understood how functions are used; in the next section, we will use them.
3. A Larger Function
Now, let us implement a larger function. Our first task was to enable output on GPIO pin 16. It would be great if this were a function. We could simply specify a pin number and a function as inputs, and the function would set that pin to that function. This way, we could use the code to control any GPIO pin, not just the LED.
Copy the following commands below the GetGpioAddress function in the gpio.s file.
.globl SetGpioFunction
SetGpioFunction:
cmp r0,#53
cmpls r1,#7
movhi pc,lr
Commands with the suffix ls only run if the previous comparison found the first number to be less than or equal to the second. The comparison is unsigned.
Commands with the suffix hi only run if the previous comparison found the first number to be greater than the second. The comparison is unsigned.
When writing a function, the first thing we need to consider is what if the input is wrong? In this function, we have an input that is the GPIO pin number, and it must be a number between 0 and 53, as there are only 54 pins. Each pin has 8 functions, numbered from 0 to 7, so the function number must also be between 0 and 7. We can assume that the input should be correct, but when using it on hardware, this practice is very dangerous because incorrect values can lead to very bad side effects. So, in this case, we want to ensure that the input values are within the correct range.
To ensure the input values are within the correct range, we need to check that r0 <= 53 and r1 <= 7. First, we use the comparison command we saw earlier to compare the value of r0 with 53. The next instruction, cmpls, runs only if that comparison found r0 to be less than or equal to 53. If so, it compares the value of register r1 with 7, and we continue as before. Finally, if the last comparison performed found the register to be greater than the number, movhi returns to the code that called the function.
This is exactly the effect we want. If the value in r0 is greater than 53, then the cmpls command does not run, but movhi does. If the value in r0 is <= 53, then the cmpls command runs and compares the value in r1 with 7; if r1 > 7, movhi runs and ends the function. Otherwise, movhi does not run, and we know that r0 <= 53 and r1 <= 7.
The suffixes ls (lower or same) and le (less than or equal) differ subtly, as do the suffixes hi (higher) and gt (greater than); we will discuss these later.
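The range check performed by the three instructions above can be modelled in a few lines of Python. This is a sketch rather than ARM code: `to_unsigned32` and `inputs_are_valid` are hypothetical helper names, and the masking step stands in for the fact that ls and hi treat register contents as unsigned 32-bit numbers.

```python
def to_unsigned32(value):
    """Interpret any Python int as the 32-bit pattern a register would hold."""
    return value & 0xFFFFFFFF


def inputs_are_valid(pin, func):
    """Mirror cmp r0,#53 / cmpls r1,#7 / movhi pc,lr using unsigned compares."""
    pin = to_unsigned32(pin)
    func = to_unsigned32(func)
    if pin > 53:          # hi after the first cmp: cmpls is skipped
        return False
    return func <= 7      # cmpls ran; hi here would mean func > 7


print(inputs_are_valid(16, 1))    # True
print(inputs_are_valid(54, 1))    # False
print(inputs_are_valid(-1, 1))    # False: -1 is 0xFFFFFFFF when unsigned
```

The last call shows why the unsigned suffixes matter here. Had the check used the signed suffix gt instead of hi, a register holding 0xFFFFFFFF would be treated as -1, which is not greater than 53, so the bad input would slip through.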
Copy these commands to the bottom of the above code.
push {lr}
mov r2,r0
bl GetGpioAddress
push {reg1,reg2,...} copies the listed registers reg1, reg2, … onto the top of the stack. This command can only be used with general-purpose registers and the lr register.
bl lbl sets lr to the address of the next instruction and branches to the label lbl.
These three commands call our first method. The command push {lr} copies the value in lr onto the top of the stack so that we can retrieve it later. We must do this before calling GetGpioAddress, because the call overwrites lr with the address to return to inside our own function.
If we knew nothing about the GetGpioAddress function, we would have to assume that it changes the values of r0, r1, r2, and r3, and we would have to move our values to r4 and r5 to keep them intact across the call. Fortunately, we know exactly what GetGpioAddress does: it only changes r0 to the GPIO address and does not affect r1, r2, or r3. Therefore, we only need to move the GPIO pin number out of r0 so that it is not overwritten, and we can safely move it to r2 because GetGpioAddress does not change r2.
Finally, we use the bl instruction to run GetGpioAddress. Running a function is generally called “calling” it, and from now on we will always use this term. As we discussed earlier, bl calls a function by updating lr to the address of the next instruction and then branching to the function.
When a function ends, we say it has “returned”. When the call to GetGpioAddress returns, we know that r0 contains the address of the GPIO controller, r1 contains the function number, and r2 contains the GPIO pin number.
I mentioned earlier that GPIO functions are stored in blocks of 10, so we first need to determine which block our pin is in. This seems like it would require division, but division is very slow, so for these relatively small numbers, doing repeated subtraction is better than division.
Copy the following code to the bottom of the above code.
functionLoop$:
cmp r2,#9
subhi r2,#10
addhi r0,#4
bhi functionLoop$
add reg,#val adds the number val to the contents of register reg.
This simple loop compares the pin number (r2) with 9. If it is greater than 9, it subtracts 10 from the pin number, adds 4 to the GPIO controller address, and runs the check again.
The effect is that r2 now contains a number between 0 and 9, which is the remainder of the pin number divided by 10, and r0 contains the address in the GPIO controller of the function settings for this pin. This address is “GPIO controller address + 4 × (GPIO pin number ÷ 10)”.
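The repeated-subtraction loop is easy to check on a desktop machine. The Python sketch below is a model of the loop, not ARM code; `find_function_register` is a hypothetical name, and the two local variables play the roles of r0 and r2.

```python
GPIO_BASE = 0x20200000   # address returned by GetGpioAddress


def find_function_register(pin):
    """Mirror functionLoop$: repeated subtraction instead of division."""
    address = GPIO_BASE      # plays the role of r0
    remainder = pin          # plays the role of r2
    while remainder > 9:     # cmp r2,#9 ... bhi functionLoop$
        remainder -= 10      # subhi r2,#10
        address += 4         # addhi r0,#4
    return address, remainder


address, remainder = find_function_register(16)
print(hex(address), remainder)   # 0x20200004 6
```

For pin 16 the loop runs once, so the function settings live in the second 4-byte block and the pin is number 6 within that block. With at most 54 pins the loop never runs more than five times.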
Finally, copy the following code to the bottom of the above code.
add r2, r2,lsl #1
lsl r1,r2
str r1,[r0]
pop {pc}
Shift operation reg,lsl #val means shifting the binary representation of the number in register reg left by val bits, with the result used as an operand in the preceding operation.
lsl reg,amt shifts the binary number in register reg left by the number of bits in amt.
str reg,[dst] is the same as str reg,[dst,#0].
pop {reg1,reg2,...} copies values from the top of the stack into the listed registers reg1, reg2, …. Only general-purpose registers and pc can be popped this way.
This code completes the method. The first line is actually a multiplication by 3 in disguise. Multiplication is a large and slow instruction in assembly because the circuit takes a long time to produce an answer. Sometimes it is faster to use instructions that can produce the answer quickly. In this case, we know that r2 × 3 is equivalent to r2 × 2 + r2. Multiplying a register by 2 is very easy, because it is the same as shifting the binary representation of the number left by one bit.
One very useful feature of ARMv6 assembly language is that you can shift an argument before using it. In this case, I add r2 to the result of shifting the binary representation of r2 left by one bit. In assembly code, you can often use this trick to calculate answers faster and more easily, but if you find it inconvenient, you can also write mov r3,r2; add r2,r3; add r2,r3.
Now we can shift the function value left by the number of bits held in r2. Most instructions that take an amount (like add and sub) have a variant that takes a register instead of a number. We perform this shift because we want to set the bits that correspond to our pin number, and each pin has three bits.
Then, we save the computed value to the address of the GPIO controller. We have already calculated that address in the loop, so we do not need to save it at an offset like we did in OK01 and OK02.
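As a quick check of the multiply-by-3 and shift steps, here is a small Python model of the value that ends up being stored. It is a sketch rather than ARM code, and `function_select_value` is a hypothetical name chosen for illustration.

```python
def function_select_value(pin_in_block, func):
    """Mirror add r2,r2,lsl #1 then lsl r1,r2 for a pin index 0..9."""
    shift = pin_in_block + (pin_in_block << 1)   # r2 + r2*2 = r2*3
    return (func << shift) & 0xFFFFFFFF          # value stored by str r1,[r0]


print(hex(function_select_value(6, 1)))   # 0x40000
```

Pin 16 is pin 6 of its block, so function 1 lands in bits 18 to 20, giving 1 shifted left by 18 places.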
Finally, we return from this method call. Since we pushed lr onto the stack, we pop into pc, which copies the value that was in lr when we pushed it into pc. This has the same effect as mov pc,lr, so the function returns to the line that called it.
Observant readers may notice that this function does not actually work correctly. Although it sets the GPIO pin function to the requested value, it also sets the functions of all the other pins in the same block of 10 to 0! In a system that uses GPIO pins heavily, this would be a very annoying problem. I leave it as a challenge to those interested to fix this function so that only the relevant 3 bits are set and all other bits remain unchanged. A solution can be found on the download page for this course. You may find several instructions useful: and, which computes the boolean AND of two registers; mvns, which computes the boolean NOT; and orr, which computes the boolean OR.
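One possible shape for such a fix is a read-modify-write: read the current register value, clear only the three bits for this pin, then combine in the new function. The Python sketch below models that idea only. It is not the solution from the download page, it is not ARM code, and `updated_function_register` is a hypothetical name.

```python
def updated_function_register(old_value, pin_in_block, func):
    """Return the new register value with only one pin's 3 bits replaced."""
    shift = pin_in_block * 3
    mask = 7 << shift                          # the 3 bits for this pin
    cleared = old_value & ~mask & 0xFFFFFFFF   # AND with NOT(mask)
    return cleared | (func << shift)           # OR in the new function


# Pins 0 to 3 of the block already use function 1 (bits 0, 3, 6 and 9).
old = 0x249
print(hex(updated_function_register(old, 6, 1)))   # 0x40249
```

The existing settings survive the update, which is the behaviour the original function lacks. In assembly, the same three steps would correspond to the and, mvns, and orr instructions mentioned above, applied to a value first loaded from the register.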
4. Another Function
Now that we have a function that can manage GPIO pin functions, we also need a function that can turn GPIO pins on or off. We do not need separate functions for on and off; one function can do both.
We will write a function called SetGpio that takes the GPIO pin number as its first input in r0 and the value as its second input in r1. If the value is 0, we turn the pin off; if it is non-zero, we turn the pin on.
Copy the following code and paste it at the end of the gpio.s file.
.globl SetGpio
SetGpio:
pinNum .req r0
pinVal .req r1
alias .req reg sets alias as an alias for register reg.
Register aliases let us refer to a register by a name other than r0 or r1. So far, this has not been very important, but aliases will prove very useful as we write larger methods later, so we will start using them now. The instruction pinNum .req r0 means that pinNum represents r0 when used in instructions.
Copy and paste the following code below the above code.
cmp pinNum,#53
movhi pc,lr
push {lr}
mov r2,pinNum
.unreq pinNum
pinNum .req r2
bl GetGpioAddress
gpioAddr .req r0
.unreq alias deletes the alias alias.
As before, we first compare pinNum (r0) with 53; if it is greater than 53, we return immediately. Since we again want to call GetGpioAddress, we need to push lr onto the stack to protect it and move pinNum into r2. Then we use the .unreq statement to delete the alias we defined for r0. Since the pin number is now stored in register r2, we want the alias to reflect this, so we remove the alias from r0 and redefine it for r2. You should always delete an alias as soon as you no longer need it, so that you cannot use it by mistake later in code where it is no longer valid.
Then, we call GetGpioAddress and create the alias gpioAddr for r0 to reflect that it now holds the GPIO address.
Copy and paste the following code after the above code.
pinBank .req r3
lsr pinBank,pinNum,#5
lsl pinBank,#2
add gpioAddr,pinBank
.unreq pinBank
lsr dst,src,#val shifts the number in src right by val bits and saves the result in dst.
We alias r3 as pinBank and then calculate pinNum ÷ 32 by shifting the pin number right by 5 bits. Since each bank is a group of 4 bytes, we need to multiply the result by 4, which is the same as shifting it left by 2 bits. You may wonder if we could just shift right by 3 bits, so that we would not have to shift one way and then back. But this does not work: when we do ÷ 32, some bits are discarded, whereas if we do ÷ 8 they are not.
Now, gpioAddr holds either 20200000₁₆ (if the pin number is between 0 and 31) or 20200004₁₆ (if the pin number is between 32 and 53). This means that if we add 28₁₀, we get the address to turn on the pin, and if we add 40₁₀, we get the address to turn off the pin. Since we are finished with pinBank, we immediately use .unreq to delete it.
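The difference between shifting right by 5 and then left by 2, and shifting right by 3 in one go, is easiest to see with numbers. The Python sketch below models both; it is not ARM code, and `bank_offset` and `single_shift` are hypothetical names.

```python
GPIO_BASE = 0x20200000   # address returned by GetGpioAddress


def bank_offset(pin):
    """Mirror lsr pinBank,pinNum,#5 then lsl pinBank,#2."""
    bank = pin >> 5      # pin // 32: 0 for pins 0-31, 1 for pins 32-53
    return bank << 2     # 4 bytes per bank


def single_shift(pin):
    """The tempting shortcut: one right shift by 3, which is pin // 8."""
    return pin >> 3


print(bank_offset(16), single_shift(16))   # 0 2
print(hex(GPIO_BASE + bank_offset(45) + 28))   # 0x20200020
```

For pin 16 the two-step version gives an offset of 0, while the shortcut gives 2, because the low bits that ÷ 32 throws away are still present after ÷ 8. The second line shows the address used to turn on pin 45, which sits in the second bank.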
Copy and paste the following code below the above code.
and pinNum,#31
setBit .req r3
mov setBit,#1
lsl setBit,pinNum
.unreq pinNum
and reg,#val computes the boolean AND of the number in register reg with val.
A bit in the result of an and operation is 1 only if the corresponding bits of both inputs are 1; otherwise it is 0. This is a very basic binary operation, and and operations are very fast. The inputs we provide are pinNum and 31₁₀ = 11111₂. This means the answer can only contain 1s in its last 5 bits, so it is definitely between 0 and 31. Specifically, it has a 1 only where the last 5 bits of pinNum have a 1. This is the same as the remainder when dividing by 32. This is not a coincidence, as 31 = 32 - 1.
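The claim that and with 31 gives the remainder of a division by 32 can be checked directly. This Python sketch models the bit calculation only; it is not ARM code, and `set_bit_for` is a hypothetical name.

```python
def set_bit_for(pin):
    """Mirror and pinNum,#31 / mov setBit,#1 / lsl setBit,pinNum."""
    bit_index = pin & 31          # same as pin % 32 for a non-negative pin
    return 1 << bit_index


print(hex(set_bit_for(16)))   # 0x10000
print(hex(set_bit_for(45)))   # 0x2000
```

Pin 45 gives bit 13 because 45 ÷ 32 is 1 remainder 13. The same trick works for any power of two: and with 2ⁿ - 1 keeps the remainder of a division by 2ⁿ.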
Binary Division Example
teq pinVal,#0
.unreq pinVal
streq setBit,[gpioAddr,#40]
strne setBit,[gpioAddr,#28]
.unreq setBit
.unreq gpioAddr
pop {pc}
teq reg,#val checks whether the number in register reg is equal to val.
teq (test for equality) is another comparison operation that can only test for equality. It is similar to cmp, but it does not determine which number is larger. If you only want to test whether two numbers are the same, you can use teq.
If pinVal is zero, we store setBit at the GPIO address offset 40, which we already know will turn off that pin. Otherwise, we store it at the GPIO address offset 28, which will turn on that pin. Finally, we return by popping pc, which sets it to the value we saved when we pushed the link register.
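Putting the pieces of SetGpio together, the whole function can be modelled in Python against a dictionary that stands in for memory. This is a sketch for checking the logic on a desktop machine, not ARM code: the `set_gpio` name and the dictionary are assumptions for illustration, while the offsets and base address are the ones used in the assembly.

```python
GPIO_BASE = 0x20200000   # address returned by GetGpioAddress


def set_gpio(memory, pin, value):
    """Model of SetGpio writing into a dict that stands in for memory."""
    pin &= 0xFFFFFFFF                    # registers hold unsigned 32-bit values
    if pin > 53:                         # cmp pinNum,#53 / movhi pc,lr
        return
    address = GPIO_BASE + ((pin >> 5) << 2)
    set_bit = 1 << (pin & 31)
    if value == 0:                       # teq pinVal,#0
        memory[address + 40] = set_bit   # streq: turn the pin off
    else:
        memory[address + 28] = set_bit   # strne: turn the pin on


writes = {}
set_gpio(writes, 16, 0)
print({hex(a): hex(v) for a, v in writes.items()})   # {'0x20200028': '0x10000'}
```

Only one of the two stores ever happens, just as only one of streq and strne runs after the teq.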
5. A New Beginning
main.s has become a bit large and more complex, so it would be good design to split it into two sections. The .init section we have been using should be kept as small as possible. We can change the code to reflect this easily. Insert the following immediately after _start:
b main
.section .text
main:
mov sp,#0x8000
The makefile and linker script place code in the .text section (which is the default section) after the .init section, which is placed at address 8000₁₆. This is the default loading address, and it gives us some space for the stack. Since the stack exists in memory, it also has an address. The stack grows downwards in memory, so each new value is stored at a lower address than the previous one, making the top of the stack the lowest address.
Layout Diagram of Operating System
The “ATAGs” section in the diagram is where information about the Raspberry Pi is stored, such as how much memory it has and what the default screen resolution is.
pinNum .req r0
pinFunc .req r1
mov pinNum,#16
mov pinFunc,#1
bl SetGpioFunction
.unreq pinNum
.unreq pinFunc
pinNum .req r0
pinVal .req r1
mov pinNum,#16
mov pinVal,#0
bl SetGpio
.unreq pinNum
.unreq pinVal
If we instead use mov pinVal,#1, it will turn off the LED. Use the above code to replace your old code that turned off the LED.
6. Continue Moving Towards the Goal
Hopefully, you can successfully test all of this on your Raspberry Pi. So far, we have written a large amount of code, so errors are inevitable. If there are errors, you can check our troubleshooting page.
Next, we will look at improving the wait function, whose timing is currently not accurate, so that we can better control our LED and ultimately all GPIO pins.
via: https://www.cl.cam.ac.uk/projects/raspberrypi/tutorials/os/ok03.html
Author: Robert Mullins Topic: lujun9972 Translator: qhwdw Proofreader: wxy
This article is originally compiled by LCTT and honorably launched by Linux China