Understanding Raspberry Pi Screen Programming: Course 8

Screen 03 builds on Screen 02, teaching how to draw text and a small feature on operating system command line parameters.

— Alex Chadwick

Screen 03 builds on Screen 02, teaching you how to draw text and a small feature on operating system command line parameters. Assuming you already have the operating system code from Course 7: Screen 02[1], we will build upon it.

1. Theoretical Knowledge of Strings

Yes, our task is to draw text for this operating system. We have several issues to address, the most urgent one being how to save text. Incredibly, text is one of the biggest flaws on computers to date. What should be a simple data type has led to operating system crashes, undermining encryption in other areas and causing many problems for users with different alphabets. Nevertheless, it remains an extremely important data type, as it connects computers and users well. Text is a very well-structured format that computers can understand while also offering sufficient readability for humans.

So, how is text saved? Very simply, we use a method that assigns a unique number to each letter, and then we save a series of these numbers. It seems easy, right? The problem is that the number of those numbers is not fixed. Some text segments may be longer than others. When saving ordinary numbers, we have some inherent limitations: 32 bits, we cannot exceed this limit, and we need to add methods to use numbers of that length, etc. The term ‘text’ is often also called ‘string’, and we want to be able to write a function for variable-length strings, otherwise, we would need to write many functions! For general numbers, this is not a problem because there are only a few common number formats (byte, word, half-word, double-word).

Variable data types (like text) require complex processing.

So, how do we determine the length of a string? The obvious answer is to store the length of the string and then store the characters that make up the string. This is called a length prefix because the length is at the front of the string. Unfortunately, pioneers in computer science disagreed with this. They thought it made more sense to use a special character called a null terminator (NULL, represented by \0) to indicate the end of a string. This determination simplifies many string algorithms since you only need to keep operating until you encounter the null terminator. Unfortunately, this has become the source of many security issues. What happens if a malicious user gives you an especially long string? What happens if there is not enough space to save this especially long string? You can use a string copy function to copy until you encounter the null terminator, but what if the string is especially long and overwrites your program? This seems a bit pedantic, but buffer overflow attacks still occur frequently. Length prefixes can easily mitigate such problems because they can easily deduce the length of the buffer needed to store this string. As an operating system developer, I leave this question to you to decide how to better store text.

Buffer overflow attacks have plagued computers for a long time. Recently, Wii, Xbox, and Playstation 2, as well as large systems like Microsoft’s web and database servers, have all suffered from buffer overflow attacks.

The next thing we need to determine is how best to map characters to numbers. Fortunately, this is highly standardized, and we have two main choices: Unicode and ASCII. Unicode maps almost every useful symbol to a number, which comes at the cost of needing many, many numbers and a more complex encoding method. ASCII uses one byte for each character, so it only saves Latin letters, numbers, a few symbols, and a few special characters. Therefore, ASCII is very easy to implement, whereas each character in Unicode takes up different space, making string algorithms trickier. Generally, characters on operating systems use ASCII, not for display to end users (except for developers and expert users), but to display information to terminal users using Unicode, as Unicode can support things like Japanese characters and thus achieve localization.

Fortunately, we do not need to make a choice here because the first 128 characters are exactly the same, and the encoding is also exactly the same.

Table 1.1 ASCII/Unicode Symbols 0-127

< Swipe left and right if not fully displayed >

	0	1	2	3	4	5	6	7	8	9	a	b	c	d	e	f
00	NUL	SOH	STX	ETX	EOT	ENQ	ACK	BEL	BS	HT	LF	VT	FF	CR	SO	SI
10	DLE	DC1	DC2	DC3	DC4	NAK	SYN	ETB	CAN	EM	SUB	ESC	FS	GS	RS	US
20	!	“	#	$	%	&	.	(	)	*	+	,	–	.	/
30	0	1	2	3	4	5	6	7	8	9	:	;	<	=	>	?
40	@	A	B	C	D	E	F	G	H	I	J	K	L	M	N	O
50	P	Q	R	S	T	U	V	W	X	Y	Z	[	\	]	^	_
60	`	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o
70	p	q	r	s	t	u	v	w	x	y	z	{	}	~	DEL

This table shows the first 128 symbols. The hexadecimal representation of a symbol is the row value plus the column value, for example, A is 41₁₆. You might be surprised by the first two rows and the last value. These 33 special characters are non-printable characters. In fact, many people overlook them. They exist because ASCII was originally designed as a method for computers to transmit data over networks. Therefore, the information it sends is not just symbols. You should learn about the important special characters NUL, which is the null terminator we mentioned earlier. HT horizontal tab is what we often refer to as tab, while LF line feed is used to generate a new line. You may want to research and use other special characters in your operating system.

2. Characters

So far, we have learned some knowledge about strings, and we can start thinking about how they are displayed. The most basic thing we need to do to display a string is to be able to display a character. Our first task is to write a DrawCharacter function, giving it a character to draw and a position, and then it will draw that character.

This naturally leads to a discussion about fonts. We already know there are many ways to display any given letter according to the selected font. So how do fonts work? In the early stages of computer science, fonts were just a series of small images of all letters, known as bitmap fonts, and all character drawing methods simply copied the images to the screen. When people wanted to adjust the font size, problems arose. Sometimes we need large letters, and sometimes we need small letters. Although we could draw new images for each font, size, and character, this approach is too tedious. Hence, vector fonts were invented. Vector fonts do not contain images of the font; they contain a description of how to draw the characters, for example, a o might be a circle drawn with a radius of half the maximum letter height. Modern operating systems almost exclusively use this type of font because they look perfect at any resolution.

The TrueType font format used in many operating systems is very powerful; it has its own assembly language built in to ensure letters look correct at any resolution.

Unfortunately, while I would love to include an implementation of a vector font format, the content is too much and would take up the remainder of this website. So we will implement a bitmap font, but if you want to create a decent graphical operating system, vector fonts will be very useful.

In the font section on the download page, we provide several .bin files. These are just raw binary data files of the fonts. To complete this tutorial, pick a monospaced, monochrome, 8×16 font from the section. Then download it and save it to the source directory and name it font.bin. These files are just monochrome images of each letter, each letter being exactly 8 x 16 pixels. Thus, each letter takes up 16 bytes, the first byte is the first row, the second byte is the second row, and so on.

Understanding Raspberry Pi Screen Programming: Course 8

This illustration shows the character A of the “Bitstream Vera Sans Mono” font in a monospaced, monochrome, 8×16 format. In this file, we can find the hexadecimal sequence starting from byte 41₁₆ × 10₁₆ = 410₁₆:

00, 00, 00, 10, 28, 28, 28, 44, 44, 7C, C6, 82, 00, 00, 00, 00

Here we will use a monospaced font because each character of a monospaced font is the same size. Unfortunately, the complexity of most fonts is due to their varying widths, which makes the display code more complex. The download page also includes several other fonts and an introduction to the storage format of this font.

Back to the topic. Copy the following code into drawing.s after graphicsAddress in .int 0.

.align 4
font:
.incbin "font.bin"

.incbin "file" inserts binary data from the file “file”.

Now we need to write the method to draw characters. Below I provide pseudocode, and you can try to implement it yourself. As a rule, >> means logical right shift.

function drawCharacter(r0 is character, r1 is x, r2 is y)
  if character &gt; 127 then exit
  set charAddress to font + character × 16
  for row = 0 to 15
  set bits to readByte(charAddress + row)
  for bit = 0 to 7
    if test(bits &gt;&gt; bit, 0x1)
    then setPixel(x + bit, y + row)
    next
  next
  return r0 = 8, r1 = 16
end function

If you implement it directly, this is clearly not an efficient approach. Efficiency is crucial for operations like drawing characters, as we will use it frequently. Let’s explore some improvements to make it optimized assembly code. First, because we have a × 16, you should immediately think that it is equivalent to a logical left shift by 4 bits. Next, we have a variable row, which only adds to charAddress and y. So we can eliminate it by increasing an alternate variable. Now the only question is how to determine when we are done. At this point, a useful .align 4 comes into play. We know that charAddress will start from a low byte that contains 0. This means we can check the low byte to see how far we have entered the character data.

While we can eliminate the need for bit, we must introduce new variables to achieve this, so it is best to keep it. The only remaining improvement is to remove the nested bits >> bit.

function drawCharacter(r0 is character, r1 is x, r2 is y)
  if character &gt; 127 then exit
  set charAddress to font + character &lt;&lt; 4
  loop
    set bits to readByte(charAddress)
    set bit to 8
    loop
      set bits to bits &lt;&lt; 1
      set bit to bit - 1
      if test(bits, 0x100)
      then setPixel(x + bit, y)
    until bit = 0
    set y to y + 1
    set charAddress to charAddress + 1
  until charAddress AND 0b1111 = 0
  return r0 = 8, r1 = 16
end function

Now we have code that is very close to assembly code, and it is optimized. Below is the assembly code for the above code.

.globl DrawCharacter
DrawCharacter:
cmp r0,#127
movhi r0,#0
movhi r1,#0
movhi pc,lr

push {r4,r5,r6,r7,r8,lr}
x .req r4
y .req r5
charAddr .req r6
mov x,r1
mov y,r2
ldr charAddr,=font
add charAddr, r0,lsl #4

lineLoop$:

  bits .req r7
  bit .req r8
  ldrb bits,[charAddr]
  mov bit,#8
  
  charPixelLoop$:
  
    subs bit,#1
    blt charPixelLoopEnd$
    lsl bits,#1
    tst bits,#0x100
    beq charPixelLoop$
    
    add r0,x,bit
    mov r1,y
    bl DrawPixel
    
    teq bit,#0
    bne charPixelLoop$
  
  charPixelLoopEnd$:
  .unreq bit
  .unreq bits
  add y,#1
  add charAddr,#1
  tst charAddr,#0b1111
  bne lineLoop$

.unreq x
.unreq y
.unreq charAddr

width .req r0
height .req r1
mov width,#8
mov height,#16

pop {r4,r5,r6,r7,r8,pc}
.unreq width
.unreq height

3. Strings

Now that we can draw characters, we can draw text. We need to write a method that takes a string as input and draws each character by incrementing the position. To do better, we should implement new lines and tabs. It is time to decide on the null terminator issue; if you want your operating system to use them, you can modify the following code as needed. To avoid this issue, I will pass the string length, the address of the string, and the coordinates x and y as parameters to the DrawString function.

function drawString(r0 is string, r1 is length, r2 is x, r3 is y)
  set x0 to x
  for pos = 0 to length - 1
    set char to loadByte(string + pos)
    set (cwidth, cheight) to DrawCharacter(char, x, y)
    if char = '\n' then
      set x to x0
      set y to y + cheight
    otherwise if char = '\t' then
      set x1 to x
      until x1 &gt; x0
        set x1 to x1 + 5 × cwidth
      loop
    set x to x1
    otherwise
      set x to x + cwidth
    end if
  next
end function

Again, this function has a lot of room for improvement. You can try to implement it directly or simplify it. I provide a simplified version of the function and assembly code below.

Clearly, the author of this function was not very efficient (surprised? It was written by me). Again, we have a pos variable that is used for incrementing and adding to other things, which is completely unnecessary. We can eliminate it and decrement the length until it reaches 0, thus saving a register. Apart from that annoying multiplication by 5, the rest of the function is quite good. An important thing to do here is to move the multiplication outside the loop; even using shift operations, multiplication is still slow, and since we always add a constant multiplied by 5, there is no need to recalculate it. In fact, in assembly code, it can be achieved by shifting parameters in one operand, so I changed the code to look like this.

function drawString(r0 is string, r1 is length, r2 is x, r3 is y)
  set x0 to x
  until length = 0
    set length to length - 1
    set char to loadByte(string)
    set (cwidth, cheight) to DrawCharacter(char, x, y)
    if char = '\n' then
      set x to x0
      set y to y + cheight
    otherwise if char = '\t' then
      set x1 to x
      set cwidth to cwidth + cwidth &lt;&lt; 2
      until x1 &gt; x0
        set x1 to x1 + cwidth
      loop
      set x to x1
    otherwise
      set x to x + cwidth
    end if
    set string to string + 1
  loop
end function

Here is its assembly code:

.globl DrawString
DrawString:
x .req r4
y .req r5
x0 .req r6
string .req r7
length .req r8
char .req r9
push {r4,r5,r6,r7,r8,r9,lr}

mov string,r0
mov x,r2
mov x0,x
mov y,r3
mov length,r1

stringLoop$:
  subs length,#1
  blt stringLoopEnd$
  
  ldrb char,[string]
  add string,#1
  
  mov r0,char
  mov r1,x
  mov r2,y
  bl DrawCharacter
  cwidth .req r0
  cheight .req r1
  
  teq char,#'\n'
  moveq x,x0
  addeq y,cheight
  beq stringLoop$
  
  teq char,#'\t'
  addne x,cwidth
  bne stringLoop$
  
  add cwidth, cwidth,lsl #2
  x1 .req r1
  mov x1,x0
  
  stringLoopTab$:
    add x1,cwidth
    cmp x,x1
    bge stringLoopTab$
  mov x,x1
  .unreq x1
  b stringLoop$
stringLoopEnd$:
.unreq cwidth
.unreq cheight

pop {r4,r5,r6,r7,r8,r9,pc}
.unreq x
.unreq y
.unreq x0
.unreq string
.unreq length

This code cleverly uses a new operation; subs subtracts one operand from another, saves the result, and then compares the result with 0. In implementation, all comparisons can be implemented as the result of subtraction compared to 0, but the result is often discarded. This means this operation is as fast as cmp.

subs reg,#val subtracts val from register reg, then compares the result with 0.

4. Your Will is My Command Line

Now that we can output strings, the challenge is to find an interesting string to draw. Generally, in such tutorials, people hope to draw “Hello World!”, but so far, although we can do it, I feel it has a bit of a “kingly” feel (if you like that feeling, feel free!). Therefore, as an alternative, we continue to draw our command line.

One limitation is that the operating system we are building is for ARM architecture computers. Most critically, when they boot, they provide some information to tell it what resources are available. Almost all processors have some way to determine this information, and on ARM, it is determined by data located at address 100₁₆, which has the following format:

1. The data is a decomposable series of labels.
2. There are nine types of labels: `core`, `mem`, `videotext`, `ramdisk`, `initrd2`, `serial`, `revision`, `videolfb`, `cmdline`.
3. Each label can only appear once, except for the `core` label, which is mandatory; the others are optional.
4. All labels are placed sequentially starting at address `0x100`.
5. The end of the label list is always two <ruby>words</ruby>, both of which are 0.
6. Each label's byte count is always a multiple of 4.
7. Each label starts with the size of the label (in words).
8. Immediately following is a half-word containing the label number. The numbering starts from 1 for `core` and goes up to 9 for `cmdline`.
9. Following that is a half-word containing 5441<sub>16</sub>.
10. After that is the label data, which varies depending on the label. The sum of the size (in words) + 2 is always equal to the length mentioned earlier.
11. A `core` label can be either 2 words long (indicating no data) or 5 words long (indicating it has 3 words of data).
12. A `mem` label is always 4 words long. The data is the first address of the memory block and the length of the memory block.
13. A `cmdline` label contains a `null` terminated string, which is a kernel parameter.

In the current version of the Raspberry Pi, only the core, mem, and cmdline labels are provided. You can find their usage later, and more comprehensive references are on the Raspberry Pi reference page. Now, we are interested in the cmdline label since it contains a string. We will continue to write some code to search for this command line (cmdline) label, and if found, output it in a new line for each entry. The command line is just a list of things that the graphical processor or user thinks the operating system should know. On the Raspberry Pi, this includes the MAC address, serial number, and screen resolution. The string itself is also a space-separated expression (like key.subkey=value).

Almost all operating systems support a “command line” program. The idea is to provide a generic mechanism for specifying the expected behavior when selecting a program.

We start by looking for the cmdline label. Copy the following code into a new file named tags.s.

.section .data
tag_core: .int 0
tag_mem: .int 0
tag_videotext: .int 0
tag_ramdisk: .int 0
tag_initrd2: .int 0
tag_serial: .int 0
tag_revision: .int 0
tag_videolfb: .int 0
tag_cmdline: .int 0

Finding a label through the label list is a slow operation since it involves many memory accesses. Therefore, we only want to do it once. This code creates some data to save the memory address of the first label of each type. Next, the following pseudocode can be used to find a label.

function FindTag(r0 is tag)
  if tag &gt; 9 or tag = 0 then return 0
  set tagAddr to loadWord(tag_core + (tag - 1) × 4)
  if not tagAddr = 0 then return tagAddr
  if readWord(tag_core) = 0 then return 0
  set tagAddr to 0x100
  loop forever
    set tagIndex to readHalfWord(tagAddr + 4)
    if tagIndex = 0 then return FindTag(tag)
    if readWord(tag_core+(tagIndex-1)×4) = 0
    then storeWord(tagAddr, tag_core+(tagIndex-1)×4)
    set tagAddr to tagAddr + loadWord(tagAddr) × 4
  end loop
end function

This code is already optimized and is very close to assembly. It tries to load the label directly, being a bit optimistic the first time, but all other situations can do this. If it fails, it will check if the core label has an address. Because the core label is mandatory, if it does not have an address, the only possible reason is that it does not exist. If it has an address, then we have not found the label we are looking for. If it has not been found, we need to find the address of all labels. This is done by reading the label number. If the label number is 0, it means we have reached the end of the label list. This means we have searched through all the labels in the directory. So if we run our function again, it should now be able to give an answer. If the label number is not 0, we check if this label type already has an address. If not, we save the address of this label in the directory. Then increase the label length (in bytes) to the label address and continue searching for the next label.

Try to implement this code in assembly. You will need to simplify it. If you get stuck, below is my answer. Don’t forget .section .text!

.section .text
.globl FindTag
FindTag:
tag .req r0
tagList .req r1
tagAddr .req r2

sub tag,#1
cmp tag,#8
movhi tag,#0
movhi pc,lr

ldr tagList,=tag_core
tagReturn$:
add tagAddr,tagList, tag,lsl #2
ldr tagAddr,[tagAddr]

teq tagAddr,#0
movne r0,tagAddr
movne pc,lr

ldr tagAddr,[tagList]
teq tagAddr,#0
movne r0,#0
movne pc,lr

mov tagAddr,#0x100
push {r4}
tagIndex .req r3
oldAddr .req r4
tagLoop$:
ldrh tagIndex,[tagAddr,#4]
subs tagIndex,#1
poplt {r4}
blt tagReturn$

add tagIndex,tagList, tagIndex,lsl #2
ldr oldAddr,[tagIndex]
teq oldAddr,#0
.unreq oldAddr
streq tagAddr,[tagIndex]

ldr tagIndex,[tagAddr]
add tagAddr, tagIndex,lsl #2
b tagLoop$

.unreq tag
.unreq tagList
.unreq tagAddr
.unreq tagIndex

5. Hello World

Now that we have everything in place, we can draw our first string. In the main.s file, delete all the code after bl SetGraphicsAddress and replace it with the code below:

mov r0,#9
bl FindTag
ldr r1,[r0]
lsl r1,#2
sub r1,#8
add r0,#8
mov r2,#0
mov r3,#0
bl DrawString
loop$:
b loop$

This code simply uses our FindTag method to look for the 9th label (cmdline), then calculates its length, and passes the command and length to the DrawString method, telling it to draw the string at 0,0. Now you can test it on the Raspberry Pi. You should see a line of text on the screen. If not, check our troubleshooting page.

If everything is working, congratulations, you have been able to draw text. But there is still much room for improvement. If you want to write a number, or a part of memory, or operate our command line, how should you do it? In Course 9: Screen 04[2], we will learn how to manipulate text and display useful numbers and information.

via: https://www.cl.cam.ac.uk/projects/raspberrypi/tutorials/os/screen03.html

Author: Alex Chadwick[4] Editor: lujun9972 Translator: qhwdw Proofreader: wxy

This article is originally compiled by LCTT and proudly presented by Linux China

Understanding Raspberry Pi Screen Programming: Course 8

Related posts

Leave a Comment Cancel reply