The History of ARM (Part One): Creating the First Chip

In 1983, Acorn Computers was at its peak. Unfortunately, troubles were also on the horizon.

This small British company had become famous for winning a contract with the BBC to produce computers for a national television program. Sales of its BBC Micro skyrocketed and were on track to surpass 1.2 million units.

Image: BBC Micro magazine advertisement. The slogan was “The Shape of Things to Come.”

However, the world of personal computers was changing. The market for cheap 8-bit microcomputers that parents used to help their children with homework was becoming saturated. New computers from across the ocean, such as the IBM PC and the soon-to-be-released Apple Macintosh, promised more powerful performance and ease of use. Acorn needed to find a way to compete, but it lacked sufficient funds for research and development.

A Seed of an Idea

Sophie Wilson, one of the designers of the BBC Micro, had foreseen this problem. She had added an expansion slot called the “Tube” that could connect to a more powerful central processing unit (CPU). The CPU in the slot could take over the computer, freeing up the original 6502 chip to perform other tasks.

But which processor should she choose? Wilson and co-designer Steve Furber considered various 16-bit processors, such as Intel’s 80286, National Semiconductor’s 32016, and Motorola’s 68000. However, none of the processors were entirely satisfactory.

In a later interview with the Computer History Museum, Wilson explained, “We could see the strengths and weaknesses of all these processors. The first weakness was that they did not fully utilize the memory system. The second weakness was that they were not fast enough and were not very user-friendly. We were used to programming the 6502 processor in machine code, and we wanted to reach a level where we could achieve the same results even when programming in higher-level languages.”

But what other options were there? For the small Acorn, was it really feasible to create its own CPU from scratch? To find out, Wilson and Furber visited the National Semiconductor factory in Israel. They saw hundreds of engineers and a lot of expensive equipment. This confirmed their suspicions: the task might be beyond their capabilities.

Then they visited the Western Design Center in Mesa, Arizona. This company made the beloved 6502 and was designing a 16-bit successor, the 65C816. Wilson and Furber found little more than a “suburban bungalow,” where a few engineers and some students were drawing diagrams using old Apple II computers and bits of sticky tape.

Suddenly, making their own CPU seemed possible. Wilson and Furber’s small team had previously developed custom chips, such as the graphics chip and input/output chip for the BBC Micro. But these designs were simpler than a CPU and had fewer components.

Despite the challenges, Acorn’s upper management continued to support their efforts. In fact, they did much more than that. Hermann Hauser, co-founder of Acorn and a physicist, provided the team with a copy of an IBM research paper that described a new, more powerful kind of CPU. The approach was called RISC, which stands for “Reduced Instruction Set Computing.”

Adopting RISC

What does this mean? To answer this question, let’s quickly review how a CPU works, starting with transistors. A transistor is a tiny sandwich-like device made of silicon and various chemicals. It has three connectors: the gate, the source, and the drain. When a voltage is applied to the gate input, current flows freely from the source to the drain. When there is no voltage at the gate, the current stops flowing. Thus, a transistor acts like a controllable switch.

You can combine transistors to form logic gates. For example, two switches in series form an “AND” gate, while two switches in parallel form an “OR” gate. These gates allow computers to make decisions by comparing numbers.

But how are numbers represented? Computers use binary (Base 2), where a small positive voltage represents 1 and zero voltage represents 0. These 1s and 0s are called bits. Because binary operations are very simple, it is easy to create a binary adder that can add 0 or 1 with 0 or 1 and store the sum and an optional carry. Numbers greater than 1 can be represented by adding more adders that work simultaneously. The number of binary digits that can be accessed simultaneously is a measure of the chip’s “bitness.” An 8-bit CPU like the 6502 processes numbers in 8-bit chunks.
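The gate and adder description above can be sketched in a few lines of Python. This is only an illustration: the function names `full_adder` and `ripple_add` are made up for this example, and the sketch chains the adders one after another (a "ripple-carry" arrangement), which is the simplest way to wire them rather than a description of any particular chip.

```python
def full_adder(a, b, carry_in):
    """Add three bits using only AND, OR and XOR; return (sum_bit, carry_out)."""
    partial = a ^ b
    sum_bit = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return sum_bit, carry_out


def ripple_add(x, y, width=8):
    """Add two unsigned integers by chaining `width` one-bit full adders."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry  # carry is 1 when the sum overflowed `width` bits


print(ripple_add(5, 3))      # (8, 0)
print(ripple_add(200, 100))  # (44, 1): 300 does not fit in 8 bits
```

With `width=8` the second call overflows, which is exactly the limit an 8-bit CPU like the 6502 runs into when it handles larger numbers in 8-bit chunks.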

Arithmetic and logic operations are essential components of a CPU. But humans need a way to tell it what to do. Therefore, every CPU has an instruction set that lists all the ways the CPU can move data in and out of memory, perform mathematical calculations, compare numbers, and jump to different parts of the program.

The idea behind RISC is to significantly reduce the number of instructions, thereby simplifying the internal design of the CPU. To what extent? The Intel 80286 is a 16-bit chip with a total of 357 unique instructions. In contrast, the new RISC instruction set created by Sophie Wilson has only 45 instructions.

To achieve this simplification, Wilson adopted a “load and store” architecture. Traditional (complex) CPUs have separate instructions for each variation of an operation: one for adding numbers held in two internal “registers” (small memory blocks inside the chip), another for adding numbers from two addresses in external memory, and others for combinations of the two. In contrast, arithmetic instructions on a RISC chip can only operate on registers. Separate instructions are needed to load values from external memory into registers beforehand and to store the answer from a register back to external memory afterward.
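To make the load and store idea concrete, here is a toy register machine in Python. It is a hypothetical sketch, not the ARM instruction set: the three operations (`LOAD`, `ADD`, `STORE`), the four registers, and the memory addresses are all invented for illustration. The point is that `ADD` never touches memory.

```python
def run(program, memory):
    """Execute a toy load/store program; only LOAD and STORE access memory."""
    regs = [0] * 4  # four general-purpose registers, r0..r3
    for op, *args in program:
        if op == "LOAD":       # LOAD rd, addr: memory -> register
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "ADD":      # ADD rd, ra, rb: registers only
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        elif op == "STORE":    # STORE rs, addr: register -> memory
            rs, addr = args
            memory[addr] = regs[rs]
        else:
            raise ValueError(f"unknown operation: {op}")
    return memory


memory = {0x10: 7, 0x14: 35, 0x18: 0}
program = [
    ("LOAD", 0, 0x10),   # r0 = memory[0x10]
    ("LOAD", 1, 0x14),   # r1 = memory[0x14]
    ("ADD", 2, 0, 1),    # r2 = r0 + r1
    ("STORE", 2, 0x18),  # memory[0x18] = r2
]
run(program, memory)
print(memory[0x18])  # 42
```

Adding two numbers that live in memory takes four simple instructions here, where a complex instruction set might offer a single memory-to-memory add.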

This means that RISC CPU programs typically require more instructions to produce the same results. So how can they be faster? One answer is that simpler designs can run at higher clock speeds. But another reason is that executing more complex instructions takes longer. By keeping instructions simple, each instruction can be executed in one clock cycle. This makes pipelining techniques easier to implement.

Typically, a CPU must process instructions in stages. It needs to fetch the instruction from memory, decode the instruction, and then execute the instruction. The RISC CPU designed by Acorn has a three-stage pipeline. While one part of the chip executes the current instruction, another part fetches the next instruction, and so on.

A disadvantage of RISC design is that because programs require more instructions, they occupy more memory space. Back in the late 1970s, when the first generation of CPU designs was created, the cost of 1MB of memory was about $5,000. Therefore, any method that could reduce program memory usage (which complex instruction sets help achieve) was highly valuable. This is why chips like the Intel 8080, 8088, and 80286 had so many instructions.

But memory prices were rapidly declining. By 1994, the price of 1MB had dropped below $6. Therefore, the additional memory required by RISC CPUs would not be a problem in the future.

To further ensure the future development of the new Acorn CPU, the team decided to skip 16-bit and go directly to a 32-bit design. This actually simplified the internal structure of the chip, as there was no need to frequently split large numbers, and it allowed direct access to all memory addresses. (In fact, the first chip only exposed 26 of the 32 address lines, as 2 to the power of 26, or 64MB, was quite an extravagant memory capacity at the time.)
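The address arithmetic is easy to check. The sketch below assumes one byte per address; the helper name `addressable_bytes` is invented for this example, and the 16-line case is included for comparison because the 6502 has a 16-bit address bus.

```python
def addressable_bytes(address_lines):
    """Each extra address line doubles the number of reachable byte addresses."""
    return 2 ** address_lines


KIB = 2 ** 10
MIB = 2 ** 20

print(addressable_bytes(16) // KIB)  # 64   (KB, the 6502's address space)
print(addressable_bytes(26) // MIB)  # 64   (MB, the first ARM chip)
print(addressable_bytes(32) // MIB)  # 4096 (MB, a full 32-bit address space)
```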

The team now just needed to name the new CPU. They considered various options and ultimately named it the Acorn RISC Machine, abbreviated as ARM.

By Arm and Prayer

The development of the first ARM chip took 18 months. To save costs, the team spent a lot of time testing before putting the design into chip production. Furber wrote an ARM CPU simulator in interpreted BASIC on the BBC Micro. Of course, this was very slow, but it helped validate the concept and confirm that Wilson’s instruction set could operate as designed.

According to Wilson, the development process was ambitious yet straightforward.

“We felt we were crazy,” she said. “We felt we couldn’t do it. But we kept finding that there was no stopping point. It was just a matter of continuing to work.”

Furber was responsible for most of the layout and design of the chip itself, while Wilson focused on the instruction set. However, in reality, these two tasks were closely linked. Choosing code numbers for each instruction was not arbitrary. Each number was carefully considered so that when it was converted to binary, the corresponding lines on the instruction bus could activate the correct decoding and routing circuits.
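To see how an instruction number breaks down into fields that hardware can route on, here is a small decoder sketch. It uses the data-processing instruction layout from the published ARM architecture documentation (condition in bits 31 to 28, opcode in bits 24 to 21, and so on), which is not given in this article, so treat the field positions as an assumption about how the first chip's decoder saw things. The function name `decode_data_processing` is made up for the example.

```python
# Opcode names for bits 24..21 of an ARM data-processing instruction.
OPCODES = [
    "AND", "EOR", "SUB", "RSB", "ADD", "ADC", "SBC", "RSC",
    "TST", "TEQ", "CMP", "CMN", "ORR", "MOV", "BIC", "MVN",
]


def decode_data_processing(word):
    """Split a 32-bit instruction word into its fixed-position bit fields."""
    return {
        "cond": (word >> 28) & 0xF,       # condition code (0xE = always)
        "immediate": (word >> 25) & 1,    # 1 if operand2 is an immediate
        "opcode": OPCODES[(word >> 21) & 0xF],
        "set_flags": (word >> 20) & 1,
        "rn": (word >> 16) & 0xF,         # first operand register
        "rd": (word >> 12) & 0xF,         # destination register
        "operand2": word & 0xFFF,         # second operand (register or value)
    }


fields = decode_data_processing(0xE0810002)  # ADD r0, r1, r2
print(fields["opcode"], fields["rd"], fields["rn"], fields["operand2"])
# ADD 0 1 2
```

Because every field sits at a fixed bit position, each group of wires on the instruction bus can feed one piece of circuitry directly, with no need to interpret the whole number first.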

As the testing process matured, Wilson led the team in writing more advanced simulators. “With a pure instruction simulator, we could run programs at hundreds of thousands of ARM instructions per second on a 6502 processor,” she explained. “We could write a lot of software, port BBC BASIC to ARM, and everything else, including a second processor and operating system. This gave us a lot of confidence. Even when we were interpreting ARM machine code, some of the programs ran better than anything we had seen before. The performance of ARM machine code itself was so high that on the same platform, the interpreted ARM machine code often performed better than compiled code.”

These amazing results motivated the small team to complete the work. The design of the first ARM CPU was sent to VLSI Technology Inc. in the United States for production. On April 26, 1985, the first version of the chip returned to Acorn. Wilson inserted it into the Tube slot of the BBC Micro, loaded the ported ARM version of BBC BASIC, and tested it with a special PRINT command. The chip responded, “Hello World, I am ARM,” and the team opened a bottle of champagne to celebrate.

Let’s take a moment to reflect on what an amazing achievement this was. The entire ARM design team consisted of Sophie Wilson, Steve Furber, a few chip designers, and a four-person group writing testing and validation software. This brand new 32-bit CPU based on advanced RISC design was developed by fewer than 10 people and ran correctly on its first attempt. In contrast, National Semiconductor was on its 10th revision of the 32016 and was still discovering bugs.

How did the Acorn team accomplish this? They designed ARM to be as simple as possible. The V1 chip had only 27,000 transistors (the 80286 had 134,000!), and it was manufactured using a 3-micron process. That is 3,000 nanometers, about a thousand times coarser than the processes used for today’s CPUs.

At this level of detail, you can almost identify individual transistors. For example, look at the register file and compare it with this interactive diagram on how random access memory works. You can see the instruction bus transmitting data from the input pins and routing it to the decoder and register controller.

While the first ARM CPU was impressive, it is also important to point out what it lacked. It had no onboard cache. It had no multiplication or division circuits. It also lacked a floating-point unit, so non-integer operations were slower than they could have been. It did, however, have a simple barrel shifter, which helped with handling floating-point numbers. The chip’s operating frequency was quite modest, at only 6 MHz.
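Without a hardware multiplier, multiplication has to be built in software from shifts and additions. The sketch below shows the generic shift-and-add technique; it is not Acorn's actual routine, and the function name `shift_add_multiply` and the 32-bit wraparound are assumptions made for illustration.

```python
def shift_add_multiply(a, b, width=32):
    """Multiply two non-negative integers using only shifts and additions."""
    mask = (1 << width) - 1  # keep results within `width` bits, like a register
    product = 0
    while b:
        if b & 1:                      # lowest bit of b set: add current a
            product = (product + a) & mask
        a = (a << 1) & mask            # a doubles each step (shift left)
        b >>= 1                        # move on to the next bit of b
    return product


print(shift_add_multiply(6, 7))      # 42
print(shift_add_multiply(1000, 25))  # 25000
```

The loop runs once per significant bit of `b`, so a multiplication costs many simple instructions instead of one, which is the trade-off the first chip accepted.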

So how did this brave ARM V1 processor perform? Benchmark tests showed that at the same clock speed, it was about 10 times faster than the Intel 80286, equivalent to a 32-bit Motorola 68020 running at 17 MHz.

The design of the ARM chip also considered extremely low power consumption. Wilson explained that this was entirely to save costs—the team wanted to use plastic packaging instead of ceramic packaging for the chip, so they set a maximum power consumption target of 1 watt.

But the tools they had at the time for estimating power consumption were very primitive. To ensure they did not exceed the limit and melt the plastic, they were very careful with every design detail. Due to the simple design and lower clock frequency, the actual power consumption ended up being only 0.1 watts.

In fact, when the team first plugged the ARM into a test board, one of the connections was broken, so the chip was not connected to any power source at all. When they discovered the fault, they were shocked, because the CPU had been working the whole time. It had been powered solely by leakage from the support chips.

Wilson stated that the extremely low power consumption of the ARM chip was entirely “accidental,” but it later became very important.

Arming New Computers

So Acorn had this amazing technology, years ahead of its competitors. Surely financial success would follow soon, right? Well, if you know the history of computing, you can probably guess the answer.

By 1985, sales of the BBC Micro had begun to decline, squeezed on one side by cheap Sinclair Spectrums and on the other by IBM PC clones. Acorn sold a controlling stake in the company to Olivetti, with which it had previously partnered to produce printers for the BBC Micro. Generally speaking, if you sell your computer company to a typewriter company, it’s not a good sign.

Acorn sold development boards featuring the ARM chip to researchers and enthusiasts, but only to existing BBC Micro users. What the company needed was a brand new computer to truly showcase the power of this new CPU.

Before achieving this goal, Acorn needed to make some upgrades to the original ARM architecture. ARM V2 was released in 1986, adding support for coprocessors (such as floating-point coprocessors, which were popular add-ons at the time) and built-in hardware multiplication circuits. It was manufactured using a 2-micron process, allowing Acorn to increase the clock frequency to 8 MHz without consuming more power.

But just having a CPU was not enough to build a complete computer. Therefore, the team built graphics controller chips, input/output controllers, and memory controllers. By 1987, all four chips, including ARM V2, were ready, along with a prototype computer to house these chips. To reflect its advanced thinking, the company named it the Acorn Archimedes.

By 1987, personal computers were expected to do much more than just input BASIC commands. Users demanded beautiful graphical user interfaces like those of the Amiga, Atari ST, and Macintosh.

Acorn established a remote software development team in Palo Alto, California (home of Xerox PARC) to design the next-generation operating system for the Archimedes. This system was called ARX, and it promised support for preemptive multitasking and multi-user capabilities. ARX was slow, but the bigger problem was that it was late. Very late.

The Acorn Archimedes was about to launch, but the company did not have a suitable operating system. The situation was critical. So Acorn management contacted Paul Fellows, head of the Acornsoft team, who had written various programming languages for the BBC Micro. They asked him, “Can you and your team write and release an operating system for the Archimedes in five months?”

Fellows replied, “I was a fool at the time to say, ‘Yes, we can do it.’”

Five months is not a long time to develop an operating system from scratch. This rushed operating system was called “Arthur,” possibly named after the famous British computer scientist Arthur Norman, or possibly as a contraction of “ARm by THURsday!” It started out as an extension of the BBC BASIC language. Richard Manby wrote a program called “Arthur Desktop” in BASIC just to demonstrate the functionality of the window manager developed by the team. However, because of the time constraints, this demo program was burned into the ROM of the first computers.

The first Archimedes computers were launched in June 1987, some still bearing the BBC logo. These computers were indeed fast and offered great value, launching at £800, equivalent to about $1,300 at the time. By comparison, the Macintosh II, which offered similar computing power, cost $5,500 in 1987.

However, the Macintosh had PageMaker, Microsoft Word, Excel, and a host of other useful software. The Archimedes was a new computer platform with limited software available at launch. The computing world was rapidly gravitating towards IBM PC compatibles and Macintosh (and later Amigas), leaving all other computers squeezed out of the market. The Archimedes received positive reviews in the British media and garnered a passionate following, but in its early years, sales were under 100,000 units.

Seeds of Growth

Acorn quickly fixed the bugs in Arthur and set about developing a replacement operating system with more modern features, RISC OS. RISC OS was released in 1989, shortly after the new version of the ARM CPU, V3.

The V3 chip was manufactured using a 1.5-micron process, shrinking the ARM2 core to about a quarter of the usable chip space, allowing room for a 4 KB fast level 1 cache. The clock speed was also increased to 25 MHz.

While these improvements were impressive, engineers like Sophie Wilson believed there was still room for further development of the ARM chip. However, Acorn’s resources were rapidly dwindling, and capabilities were limited. To realize these dreams, the ARM team needed to seek external investors.

Just then, a representative from another computer company named after a popular fruit walked in.
