Getting Started with ZYNQ: Asymmetric Multi-Core Processing (AMP)

In all the previous articles, we only used one ARM Cortex-A9 processor core (Core 0). However, the PS side contains two processor cores, and for many applications, we want to use both Zynq cores simultaneously for optimal performance. Using both Zynq processor cores for different tasks is called Asymmetric MultiProcessing (AMP), and can involve any of the following combinations:
- Running different operating systems on Core 0 and Core 1
- Running an operating system on Core 0 and bare-metal code on Core 1 (and vice versa)
- Executing different programs with bare-metal code on both cores
Introduction to AMP
There are two types of multi-core processing: symmetric and asymmetric. Before defining the differences between the two, we must first define what multi-core processing is: “Multi-core processing is the use of multiple processors in a system, allowing multiple instructions to be executed simultaneously. However, this is not necessarily the case.” The difference between symmetric and asymmetric multi-core processing is:
- Symmetric multi-core processing runs multiple software tasks simultaneously by distributing processing across multiple microprocessor cores
- Asymmetric multi-core processing uses dedicated processors to run specific applications or runs dedicated applications on the same processor
In the next few blog posts, we will explore AMP on the Zynq SoC. First, we will examine two bare-metal applications, each running on a different core. When running AMP on the Zynq SoC, it is essential to remember that the processor cores have a mix of private and shared resources. Each processor has its own private L1 instruction and data caches, timer, and watchdog, and the two share an interrupt controller that handles both shared and private interrupts. Interrupts on the Zynq are not quite that simple, however: each core in the PS can use software interrupts to interrupt itself, the other processor, or both processors, with these interrupts being dispatched through the interrupt controller.
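To make the three targeting options concrete, here is a small sketch of how a software-generated interrupt request word can be composed. It assumes the software-generated interrupt register layout of the ARM GIC distributor (target filter in bits 25:24, CPU target list in bits 23:16, interrupt ID in bits 3:0). The type and function names are invented for illustration and are not part of the Xilinx drivers.

```c
#include <assert.h>
#include <stdint.h>

/* Who should receive the software-generated interrupt (SGI). */
typedef enum {
    SGI_TO_TARGET_LIST = 0,  /* the cores named in the CPU target list */
    SGI_TO_ALL_OTHERS  = 1,  /* every core except the one sending */
    SGI_TO_SELF        = 2   /* only the core that is sending */
} sgi_filter_t;

/* Compose the word for the distributor's SGI register.
 * cpu_mask: bit 0 = Core 0, bit 1 = Core 1 (used only with SGI_TO_TARGET_LIST).
 * sgi_id:   software interrupt number, 0 to 15. */
static uint32_t gic_sgi_value(sgi_filter_t filter, uint8_t cpu_mask, uint8_t sgi_id)
{
    assert(sgi_id < 16u);  /* only 16 software interrupts exist */
    return ((uint32_t)filter << 24) | ((uint32_t)cpu_mask << 16) | sgi_id;
}
```

With this helper, `gic_sgi_value(SGI_TO_TARGET_LIST, 0x2, 1)` describes "raise software interrupt 1 on Core 1", while a mask of `0x3` would address both cores. In a real project the register access would normally go through the standalone GIC driver rather than a hand-built word.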
The Zynq SoC also has many shared resources. Common examples include I/O peripherals, on-chip memory, the interrupt controller distributor, the L2 cache, and system memory located in DDR. The figure below shows some of these resources.

We will run the applications for both processor cores from DDR memory, so we must carefully segment the address regions used by each processor. The addresses are determined by the linker script of each application. If we do not handle this properly, applications running on different cores may interfere with each other’s operations.
We must also modify the files automatically generated by the SDK to get the system to boot and run. The first step is to modify the first-stage bootloader as described in XAPP1079 (http://www.xilinx.com/support/documentation/application_notes/xapp1079-amp-bare-metal-cortex-a9.pdf), which covers bare-metal AMP.
I intend to start with a very simple system; once it boots and runs, we can expand on it. In this first example, Core 0 of the Zynq SoC communicates with the user via RS232, while Core 1 drives the LEDs connected to the ZYNQ I/O. These two applications can run simultaneously without interaction.
Booting and Running AMP
While booting and running AMP involves several steps, it is actually a very straightforward process, and there’s nothing to be afraid of.
The key aspect of getting AMP running on the Zynq SoC is the bootloader, which looks for the second executable file after loading the first executable into memory. To simplify, I will use the modified FSBL and modified standalone operating system provided in the Xilinx application note XAPP1079 (source files can be obtained here: http://www.xilinx.com/support/documentation/application_notes/xapp1079-amp-bare-metal-cortex-a9.pdf).
After downloading the zip file, the first step is to extract the compressed file to the desired working directory and rename the folder named SRC to design. These files contain the modified FSBL and modified standalone operating system. We need the SDK to recognize these files, so the next step is to update the SDK repository to inform the SDK of their existence. Under the Xilinx tool menu in the SDK, select repository, then choose new, and navigate to the directory location <working directory>\bootgen\bootgen\repo as shown below:

After adding the repository, the next stage is to generate the following:
- AMP first-stage bootloader
- Core 0 application
- Core 1 application
We will generate a BSP (Board Support Package) for each core.
The first thing to do is create a new FSBL. Choose File -> New -> Application Project, which lets us create an FSBL project that supports AMP. This is similar to what we did before, but we will choose the Zynq FSBL for AMP template instead of the Zynq FSBL template.

After creating the AMP FSBL, we need to create an application for the first core. This is simple, as we have done it many times before. Be sure to select Core 0 and the standalone operating system, and allow the project to create its own BSP.

Once we have created this application, we need to correctly define where in DDR memory the application will execute. To do this, we edit the linker script to set the base address and size of the DDR region this core may use. This is crucial; if we do not properly segment the DDR memory between the Core 0 and Core 1 applications, one application may unintentionally corrupt the other.
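As a sanity check on the split, the two DDR windows can be compared before they are typed into the linker scripts. The sketch below is only an illustration: the base addresses and sizes are hypothetical values chosen for the example, not the ones used in this project, and the type and function names are invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One core's DDR window, as it would be entered in that core's linker script. */
typedef struct {
    uint32_t base;  /* first byte of the region */
    uint32_t size;  /* length in bytes */
} ddr_region_t;

/* Hypothetical split, for illustration only. */
static const ddr_region_t core0_ddr = { 0x00100000u, 0x01F00000u };
static const ddr_region_t core1_ddr = { 0x02000000u, 0x01000000u };

/* Returns true when the two regions share at least one byte. */
static bool ddr_regions_overlap(ddr_region_t a, ddr_region_t b)
{
    assert(a.size > 0u && b.size > 0u);          /* empty regions make no sense here */
    uint64_t a_end = (uint64_t)a.base + a.size;  /* one past the last byte of a */
    uint64_t b_end = (uint64_t)b.base + b.size;  /* one past the last byte of b */
    return a.base < b_end && b.base < a_end;
}
```

With the example values, the Core 0 window ends exactly where the Core 1 window begins, so `ddr_regions_overlap(core0_ddr, core1_ddr)` is false. Growing either region without moving the other is the kind of mistake this check would catch.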

We can now write the application we want to execute on Core 0. We need to include the following code section in the application.

This code disables caching of the Zynq SoC’s on-chip memory and writes the start address of the Core 1 program to the location that Core 1 reads once Core 0 executes the Set Event (SEV) instruction. The SEV instruction starts Core 1 executing its program.
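The order of these steps matters, so here is a hedged sketch of the sequence rather than the actual project code. It assumes that Core 1 fetches its start address from 0xFFFFFFF0 in the upper on-chip memory window, as I understand it from the Zynq documentation and XAPP1079. The Core 1 entry address is hypothetical, the memory barrier is my addition, and the hardware accesses are routed through function pointers (names invented here) so that the sequence can be exercised on a host machine. On the target, those hooks would wrap the corresponding BSP calls.

```c
#include <assert.h>
#include <stdint.h>

#define OCM_HIGH_BASE       0xFFFF0000u  /* upper OCM window (assumed mapping) */
#define CPU1_START_ADDR_REG 0xFFFFFFF0u  /* word Core 1 reads once it is woken (assumed) */
#define CPU1_APP_BASE       0x02000000u  /* hypothetical Core 1 entry point in DDR */

/* Hardware hooks: indirect so the sequence can be tried off-target. */
typedef struct {
    void (*make_uncached)(uint32_t addr);
    void (*write32)(uint32_t addr, uint32_t value);
    void (*barrier)(void);
    void (*send_event)(void);
} amp_hw_ops_t;

static void start_core1(const amp_hw_ops_t *hw, uint32_t entry)
{
    assert((entry & 0x3u) == 0u);             /* sanity check: word-aligned entry */
    hw->make_uncached(OCM_HIGH_BASE);         /* 1. disable caching of the OCM */
    hw->write32(CPU1_START_ADDR_REG, entry);  /* 2. publish Core 1's start address */
    hw->barrier();                            /* 3. let the write complete first */
    hw->send_event();                         /* 4. SEV: wake Core 1 */
}

/* Host stand-ins that only record what was requested. */
static char     trace[8];
static unsigned trace_len;
static uint32_t last_addr, last_value;

static void note(char step)
{
    if (trace_len < sizeof trace - 1u) trace[trace_len++] = step;
}
static void host_make_uncached(uint32_t addr) { (void)addr; note('U'); }
static void host_write32(uint32_t addr, uint32_t value)
{
    last_addr = addr;
    last_value = value;
    note('W');
}
static void host_barrier(void)    { note('B'); }
static void host_send_event(void) { note('S'); }

static const amp_hw_ops_t host_ops = {
    host_make_uncached, host_write32, host_barrier, host_send_event
};
```

Calling `start_core1(&host_ops, CPU1_APP_BASE)` on a host records the four steps in order. The point of the indirection is only to make the ordering visible and checkable; the real Core 0 application performs the same steps directly.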
The next step is to create the BSP for Core 1. We want to use the modified standalone operating system (standalone_amp), which prevents the reinitialization of the PS Snoop Control Unit (SCU). Therefore, we cannot allow automatic BSP generation when creating the project as we did for Core 0. Be sure to select Core 1 in the CPU selection options.

Now that we have created the BSP for Core 1, we need to modify the BSP settings before continuing to create the application that will run on Core 1. This is very simple; we need to add an additional compiler flag -DUSE_AMP=1 to the BSP’s driver section configuration:

Once completed, we can create the application for Core 1. Be sure to select Core 1 as the processor and use the BSP we just created:

Similarly, after creating the new application, we need to correctly define the memory location in DDR memory where the Core 1 program will execute. This is done by editing the linker script for the Core 1 application:

Like the application for the first core, we must also disable the cache on the on-chip memory, as we will use this memory for communication between the two processors in future blogs. Once we have completed our application and built the project, we should now have the following:
- AMP FSBL ELF
- Core 0 ELF
- Core 1 ELF
- Bit file defining device configuration
We now need a .bin file to allow the Zynq SoC to boot from the selected configuration memory. We also need a BIF file that lists the files used to create the .bin file and specifies their order.
We will use the bat file provided under the directory \bootgen\bootgen\bootgen instead of creating the Zynq boot image in the SDK. This directory contains a bif file and a cpu1_bootvec.bin file that serves as part of the modified FSBL to prevent it from looking for more applications to load.
To generate the bin file, we will copy the three generated ELF files to the bootgen directory and edit the BIF file to ensure the elf names in the bif file are correct.

We can now open the ISE/Vivado command prompt, navigate to the bootgen directory, and run createboot.bat, which will create the boot.bin file:

This file can then be downloaded to the flash on the Zynq SoC. Booting the device will then start both cores, each executing its own program.
In the next section, we will discuss some details.

The previous section created simple software that starts and runs on both cores. Its simplicity will let me demonstrate, in later posts, how to enable communication between the two Zynq SoC processor cores via OCM (on-chip memory). For now, the software running on the two cores performs simple tasks, which gives us a baseline to move forward from:
- Core 0 is the master controller and controls the execution of Core 1. It also prints messages to the terminal program using UART.
- Once started by Core 0, Core 1 initializes its private resources and drives eight LEDs. We need to use Core 1’s private timer and enable interrupts through the GIC.
These applications are independent of each other and do not share resources, although real applications will usually need to. The application running on Core 0 is very simple: it starts the software on Core 1 and prints a simple message in a loop using UART 0:

However, we plan to use Core 1’s interrupt controller, so we must first configure the GIC (Generic Interrupt Controller) using the following code:

Core 1’s code must be slightly more complex because we use the GPIO module on the PL (Programmable Logic) side of the Zynq SoC to drive the board’s LEDs. As with all other interfaces from Xilinx, the standalone operating system provides a simple set of drivers for this via #include "xgpio.h", which is slightly different from the xgpiops.h file we previously used to drive the MIO/EMIO GPIO on the PS (Processing System) side of the Zynq SoC. In this case, I want to demonstrate how to use GPIO on the PL side. To ensure we can see the LEDs toggling, we will use Core 1’s private timer, in the same way as we previously used the private timer on Core 0.
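To pick a blink rate that is visible, the timer's load value has to be derived from the clock. The sketch below assumes that the Cortex-A9 private timer runs at half the CPU clock and that its period is (prescaler + 1) x (load + 1) timer clock cycles. The CPU frequency shown is an assumed figure (the real one comes from the generated BSP), and the function name is invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_CPU_CLK_HZ 666666666u  /* assumption: use the value from your BSP */

/* Load value that makes the private timer expire every period_ms milliseconds. */
static uint32_t private_timer_load(uint32_t cpu_clk_hz, uint32_t period_ms,
                                   uint8_t prescaler)
{
    assert(period_ms > 0u);
    uint64_t timer_clk_hz = cpu_clk_hz / 2u;  /* timer runs at half the CPU clock */
    uint64_t ticks = (timer_clk_hz * period_ms) /
                     (1000u * ((uint64_t)prescaler + 1u));
    if (ticks == 0u) return 0u;                       /* shorter than one tick */
    if (ticks > 0x100000000ull) return 0xFFFFFFFFu;   /* clamp to the 32-bit counter */
    return (uint32_t)(ticks - 1u);                    /* counter counts load..0 */
}
```

Under these assumptions, a 500 ms toggle with no prescaling fits comfortably in the 32-bit counter, so the prescaler can stay at zero for LED blinking.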

Before Core 1’s program starts executing its main application, we need to disable the cache on the on-chip memory (OCM), initialize the GPIO, initialize the private timer, and configure the interrupt controller so that we can use the interrupt from the private timer to toggle the LED. We can now start writing a fairly simple interrupt service routine that toggles the LED when the private timer expires and restarts. This process will continue indefinitely.
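The interrupt service routine described above can be sketched as follows. This is not the project's actual code: the structure and function names are invented, and the GPIO write and timer acknowledge are passed in as hooks so that the toggle logic can be run on a host. On the target, those hooks would be thin wrappers around the GPIO and private timer drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LED_MASK 0xFFu  /* eight LEDs, one bit each */

typedef struct {
    uint32_t led_state;             /* pattern currently driven onto the LEDs */
    uint32_t expirations;           /* number of timer interrupts handled */
    void (*write_leds)(uint32_t);   /* hook: drive the PL GPIO outputs */
    void (*clear_timer_irq)(void);  /* hook: acknowledge the private timer */
} led_blinker_t;

/* Called each time the private timer expires. */
static void timer_isr(void *context)
{
    led_blinker_t *b = (led_blinker_t *)context;
    assert(b != NULL);
    b->clear_timer_irq();         /* acknowledge first so the next expiry is seen */
    b->led_state ^= LED_MASK;     /* flip all eight LEDs */
    b->write_leds(b->led_state);
    b->expirations++;
}

/* Host stand-ins that only record what the ISR asked for. */
static uint32_t host_led_pins;
static unsigned host_irq_clears;

static void host_write_leds(uint32_t value) { host_led_pins = value; }
static void host_clear_irq(void)            { host_irq_clears++; }
```

Each call flips the LED pattern, so two consecutive expirations bring the LEDs back to their starting state. With the timer in auto-reload mode, nothing else is needed to keep the blinking going.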

Here are the results of the program execution, as reported by Core 0 to the terminal window:

Code Address
https://github.com/suisuisi/zynq_guide/blob/main/core_0_main_part50.c
