Embedded AI: From Basics to Advanced Applications with K210

I am ZhiHui, a regular contributor to “Darwen Says”, where I share cutting-edge knowledge about artificial intelligence from time to time.
Now let’s officially start today’s sharing:
Introduction

Those who follow me know that I am not picky about projects: I grab whatever is at hand, and I have a talent for making trouble for myself.

Since my main job is AI, in a recent project, I made a small module using STM32 to explore the question of “what computational power is minimally required to infer a practical neural network”. The demonstration video of the project is as follows:

In that project, I successfully ran inference on a self-trained CNN-SLR model on a microcontroller with a clock frequency below 100MHz and only 32K of SRAM, and the results were quite good.

This time, I plan to do the opposite: take the most powerful MCU I can find, the K210, and see how complex a network it can run.
Note that I specifically said MCU, so the SoCs used in mobile devices are out of scope. The competition among mobile SoCs is fierce enough to deserve its own discussion later, and since the two classes of chips are not in the same price range, they are not really comparable anyway.

Before introducing the MCU used in this article, let’s look at a “conservation law” of embedded processors: low power consumption, low price, and high performance cannot all be had at once, as shown in the figure below:


“There are no free processors in the world”
To briefly explain the figure: physics dictates that, all else being equal, performance and power consumption rise together. Higher performance means more power drawn and more heat generated. To raise performance while keeping power consumption flat or even lower, the only options are a better chip architecture or a more advanced process technology, and both raise the cost. So, depending on the application, we pick a hardware platform that balances two of the three factors:
  • Low cost and low power consumption, but weak performance: microcontrollers such as STM32/AVR.

  • Low cost and strong performance, but built on an older process and prone to heat: low-end SoCs such as Allwinner’s H series.

  • Strong performance on an advanced process, but at extremely unfriendly prices, and even with money in hand you may not be able to buy them: the flagship SoCs in today’s smartphones, such as Qualcomm Snapdragon, Huawei Kirin, and the Apple A series.

Is there really no chip platform that can achieve all three?
Actually, there is: the main character of this article, the K210, just about qualifies, which is the main reason I chose it for this project. Let me walk you through it~
What is K210?
K210 is an MCU launched last year by Canaan, a company that used to make mining chips. Its distinguishing feature is a self-developed neural network hardware accelerator, the KPU, built into the chip, which runs convolutional neural network operations with high performance.
Main parameters of K210:
Don’t assume that an MCU must be inferior to a high-end SoC; at least for AI workloads, the K210’s computing power is considerable. According to Canaan’s official description, the K210’s KPU delivers 0.8TFLOPS. For comparison, NVIDIA’s Jetson Nano with 128 CUDA cores delivers 0.47TFLOPS, while the latest Raspberry Pi 4 manages less than 0.1TFLOPS.
Of course, this is still inferior to flagship-level SoCs. A76-class CPUs are already very powerful, and flagship SoCs also carry dedicated AI acceleration hardware for heterogeneous computing, such as Qualcomm’s Hexagon DSP, Apple’s Neural Engine, and Huawei’s Da Vinci architecture NPU. In certain applications these NPUs can even approach the computing power of desktop GPUs that draw hundreds of watts (for reference, the single-precision floating-point computing power of the GTX1080Ti is 11.3 TFLOPS), which is frankly insane.
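To put the quoted numbers side by side, here is a small sketch that normalizes them against the K210. The figures are simply the ones cited in this article; they are vendor headline numbers at different precisions, so treat the ratios as an order-of-magnitude guide only. The Raspberry Pi 4 entry uses the quoted upper bound of 0.1, and the function name is my own.

```python
# Peak throughput figures as quoted in the article, in TFLOPS.
# They mix different precisions, so the ratios are only a rough comparison.
QUOTED_TFLOPS = {
    "K210 KPU": 0.8,
    "Jetson Nano": 0.47,
    "Raspberry Pi 4 (upper bound)": 0.1,
    "GTX 1080 Ti": 11.3,
}


def relative_to(baseline, figures=QUOTED_TFLOPS):
    """Return each platform's quoted throughput divided by the baseline's."""
    base = figures[baseline]
    return {name: value / base for name, value in figures.items()}


if __name__ == "__main__":
    for name, ratio in relative_to("K210 KPU").items():
        print("{:30s} {:6.2f}x".format(name, ratio))
```

On these quoted figures, the desktop GPU is roughly an order of magnitude ahead of the K210, which in turn sits ahead of the Jetson Nano.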
It is worth noting that a chip’s raw computing power does not by itself determine model inference speed; the other core piece of embedded AI is the inference framework. On CPU architectures, whether SIMD is used (ARM has supported NEON instructions since v7), whether multiple cores and threads are used, whether there is an efficient convolution implementation, and whether there is assembly-level optimization all greatly affect how fast a model runs. On DSP/NPU-style hardware, whether the model is quantized for inference, which quantization method is used, and how memory access is optimized have a similarly large impact.
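To make "quantized inference" a little more concrete, here is a minimal sketch of generic 8-bit affine quantization in plain Python. It is a textbook scheme used for illustration; I am not claiming it is the exact method used by the K210 toolchain, and all function names are my own.

```python
def quantize_params(min_val, max_val, num_bits=8):
    """Compute scale and zero point for unsigned affine quantization."""
    qmax = (1 << num_bits) - 1
    # Widen the range to include zero so that 0.0 maps to an exact integer.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / qmax
    if scale == 0.0:
        scale = 1.0  # degenerate case: every value is zero
    zero_point = int(round(-min_val / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map floats to integers in [0, 2**num_bits - 1], clamping at the ends."""
    qmax = (1 << num_bits) - 1
    return [min(qmax, max(0, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(qvalues, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(q - zero_point) * scale for q in qvalues]


if __name__ == "__main__":
    weights = [-1.0, -0.25, 0.0, 0.3, 1.0]
    scale, zp = quantize_params(min(weights), max(weights))
    q = quantize(weights, scale, zp)
    print(q, dequantize(q, scale, zp))
```

Each weight is stored in one byte instead of four, at the price of a rounding error of about half a quantization step, which is why the choice of range and method matters for accuracy.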
Other parameters of K210 are as follows:
  • Dual-core 64-bit RISC-V RV64IMAFDC (RV64GC) CPU / 400MHz (can be overclocked to 600MHz)

  • Double precision FPU

  • 8MiB 64bit on-chip SRAM (6MiB general SRAM + 2MiB AI dedicated SRAM)

  • Neural network processor (KPU) / 0.8TFLOPS

  • Audio processor (APU)

  • Programmable IO array (FPIOA)

  • Dual hardware 512-point 16-bit complex FFT

  • Support for SPI, I2C, UART, I2S, RTC, PWM, timers

  • AES, SHA256 accelerators

  • Direct memory access controller (DMAC)

The chip comes in a BGA144 package, is built on a 28nm process, draws as little as 0.35W, and costs only 20RMB.
Just think about how much an Arduino board costs… In both performance and peripherals, an Arduino is no match for the K210 (shrugs).


So what can K210 do?

So much that you might doubt it is really just a microcontroller: face detection, object recognition, video playback, sound field imaging, 3D rendering, and it can even run an FC (Famicom) emulator to play games…
Alright, enough boasting about the chip, let’s introduce some K210 development boards already available on the market.
How to choose a development board:
If the introduction above has made you interested in the K210, here are a few development boards worth getting (if this accidentally counts as advertising, please settle the advertising fee :D).
I actually received a K210 development board a long time ago as a test kit, but I had just graduated and had little time to tinker, so I kept putting it off until recently. Another reason is that the software ecosystem was very immature when the K210 was first released. It has improved a lot since: there is now an official IDE (still a development version) as well as firmware ports such as MicroPython, which makes the chip much more fun to play with.
I plan to design my own K210 development board later (work has already started) to meet my various bizarre personal requirements (mainly that it be as small as possible). For most hobbyists, however, a BGA chip is not well suited to DIY, so you can simply buy one of the boards recommended below, many of which I have bought and tried myself.
1) Official development board kit – KD233


This is Canaan’s own evaluation board kit. Its advantage is that it works seamlessly with the example projects in the official IDE: examples downloaded from the IDE can be burned and run directly. (If you don’t plan to use the official IDE, that is fine too; the software section below covers the alternatives.) The board is also well equipped, with a camera, microphone, LCD screen, and so on. A row of jumper caps around the chip brings out all the IOs, and you can remove the jumper caps to connect other hardware with Dupont wires.

The disadvantage is just as obvious: the board is far too big for my taste. I did buy one, but I only use it to verify that the official schematic is correct, and I do my debugging on other boards.
It is also relatively expensive. It suits those who want to design their own hardware; I don’t recommend it for anyone else.
2) Widora-AIRV2/BITK210 development board kit


This is currently the smallest K210 development board available on the market, and the price is very affordable.

This board separates the core board from the baseboard, which suits those who want to build hardware but cannot design the core circuit themselves. The core board uses an NGFF (Mini-PCIE) interface and integrates the chip’s minimum system and the power IC, while all remaining functions are brought out to the baseboard through the gold fingers.
On the back of the core board, directly under the K210, you can see many filter capacitors. At such a high operating frequency the power supply must be very stable, so the PCB is laid out essentially the way one would lay out an ARM CPU board.
The board also comes with a camera and an LCD, but the default camera differs from the official board’s: the official one is an OV5640, while this one ships with an OV2640. The two cameras are pin-to-pin compatible, so you can swap them yourself and modify the software driver. The screen has also been replaced with a smaller 2.4-inch LCD driven by an ST7789, which I personally still find too big (picky, I know).
Another drawback is that there is very little supporting software for this board (almost none). Software from other vendors will run on any K210 board, but minor hardware differences mean a little source code modification is needed (GPIO numbering, for example).
Overall, I highly recommend it. The cost performance is very high, and once you are familiar with it, it makes a good main board for later projects.
3) Sipeed Maixduino development board
Sipeed is the company that previously made the Lichee Pi, and their team has launched a series of K210 development boards to cover various needs. The Sipeed Maixduino is one of them.


The name of this board even has ‘duino’ in it. Could it support Arduino programming??
Yes, but so far only to a limited extent.
The interface pages on the official wiki are mostly blank for now, so anyone interested will have to wait a bit and shelve their bold ideas for the time being.
The Maixduino is clearly designed for Arduino compatibility: the board follows the standard UNO form factor, and compared with the two boards above it adds an ESP32 module (the ESP32 is itself a popular piece of hardware). It has all the functions of the official development board in a fairly compact body.
It is worth mentioning that the stamp-hole (castellated) core board on this board can be purchased separately as the M1 Module, which is introduced below.
4) Sipeed M1 development board kit


This board uses the M1 Module mentioned above in a core board plus baseboard design, which makes it much smaller while still bringing out all the IOs; think of it as a minimum-system breakout board. I quite like this design, and compared with Widora’s board, the black-and-gold color scheme is also better looking.
The board is not too expensive, and the biggest advantage of Sipeed’s K210 boards is how well the software resources are done, with a good degree of openness in the official wiki, tutorials, and source code. Taking hardware and software together, Sipeed’s development boards are the ones I recommend buying.
They also have other K210 board models, which I won’t go through one by one; if you are interested, you can search for them on Taobao.
Programming Environment:
K210 supports several programming environments, from the most basic cmake command line workflow, to an IDE, to Python scripting. Each is introduced below.
None of these is inherently better than the others: some people prefer the command line plus vim, some prefer a graphical IDE, and some don’t care about the build environment at all and just want to write Python because life is short.
Generally speaking, lower-level approaches such as C plus the official library give more freedom and full access to the chip’s peripherals, but they are harder to develop with and the process is more cumbersome. Higher-level approaches such as scripting are very convenient and even remove the need to download a program, but what you can do depends heavily on what the MicroPython API currently exposes, and many advanced system functions are unavailable.
1) Command line development environment
First, note that K210’s official SDK supports two development modes: FreeRTOS and Standalone (bare metal).
Which one to choose is a matter of personal preference. If you have used FreeRTOS before, you should be able to pick up the SDK interfaces quickly. Personally, I prefer bare metal on microcontrollers, because if I want to run an OS I would rather use a board that runs a complete Linux system, like the Linux-Card I made.
For command line development I recommend a Linux environment (using the command line under Windows feels strange). The setup steps are below.
1.1 Download SDK
Download the corresponding SDK from the official resource download page:
https://kendryte.com/downloads/


1.2 Install Toolchain

Install build-essential to obtain the make tool
$ sudo apt install build-essential

Install cmake

$ sudo apt install cmake
Download the Ubuntu version of the toolchain from the Kendryte website, place it in the /opt directory, and extract it.


$ sudo mv kendryte-toolchain-ubuntu-amd64-8.2.0.tar.gz /opt
$ cd /opt
$ sudo tar -zxvf kendryte-toolchain-ubuntu-amd64-8.2.0.tar.gz

Open the ~/.bashrc file and add the following line at the end, which appends the /opt/kendryte-toolchain/bin directory to the PATH environment variable:

export PATH=$PATH:/opt/kendryte-toolchain/bin

Reload the file so the change takes effect:

$ source ~/.bashrc
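Before moving on, it can save some head-scratching to confirm that the tools are actually visible on PATH. The following is a small sketch using only the Python standard library. The compiler name riscv64-unknown-elf-gcc is my assumption about how the toolchain binaries are named; adjust the list if your toolchain uses a different prefix.

```python
import shutil

# Assumed tool names: the Kendryte toolchain is a RISC-V GCC build, so the
# compiler is expected to be called riscv64-unknown-elf-gcc.
REQUIRED_TOOLS = ["cmake", "make", "riscv64-unknown-elf-gcc"]


def missing_tools(tools=REQUIRED_TOOLS, path=None):
    """Return the tools that cannot be found on PATH (or on `path` if given)."""
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All build tools found.")
```

If the compiler shows up as missing right after editing ~/.bashrc, the usual cause is a terminal that has not reloaded the file yet.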

1.3 Compile hello world project

Download kendryte-standalone-sdk from Kendryte’s GitHub:
$ git clone git@github.com:kendryte/kendryte-standalone-sdk.git
The hello world project is in the kendryte-standalone-sdk/src/hello_world directory.
From the SDK root, create a build directory and enter it:
$ mkdir build && cd build

Run cmake

$ cmake .. -DPROJ=hello_world -DTOOLCHAIN=/opt/kendryte-toolchain/bin
Compile
$ make
A .bin file will be generated in the build directory, and then you can burn this file into the chip.
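If you rebuild often, the steps above can be wrapped in a short script. This is a minimal sketch that shells out to the same cmake and make commands. The function names are my own, and the returned path assumes the output file is named after the project (for example hello_world.bin), which I have not verified for every SDK version.

```python
import subprocess
from pathlib import Path


def cmake_command(project, toolchain_bin="/opt/kendryte-toolchain/bin"):
    """Build the cmake invocation shown above for a project under src/."""
    return ["cmake", "..", "-DPROJ={}".format(project),
            "-DTOOLCHAIN={}".format(toolchain_bin)]


def build(sdk_root, project):
    """Configure and compile `project` inside <sdk_root>/build."""
    build_dir = Path(sdk_root) / "build"
    build_dir.mkdir(exist_ok=True)
    subprocess.run(cmake_command(project), cwd=build_dir, check=True)
    subprocess.run(["make"], cwd=build_dir, check=True)
    # Assumption: the firmware image is named after the project.
    return build_dir / "{}.bin".format(project)


# Example (requires the SDK and toolchain to be installed):
# firmware = build("kendryte-standalone-sdk", "hello_world")
```

Passing check=True makes the script stop at the first failing step instead of running make against a broken configuration.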
1.4 Burn firmware
K210 downloads programs over serial ISP (J-Link debugging is also supported, but I find it unnecessary, so I won’t cover it here).
IO_16 selects the boot mode: at power-up or reset, pulling it high boots from FLASH, and pulling it low enters ISP mode. After reset, IO_0, IO_1, IO_2, and IO_3 are the JTAG pins, and IO_4 and IO_5 are the ISP pins, which are also the UART0 pins.
All the K210 development boards mentioned above have a USB-TTL serial chip on board, so you can plug the board straight into the computer, select the serial port, and download the program, much like with an Arduino. The download itself requires a tool called K-Flash.
K-Flash can be downloaded from the official website mentioned above. Its interface is very simple: select the bin file, the board model, and the serial port, then click download.
1.5 Package Kfpkg firmware

The firmware package for K210 mainly has two formats: .bin and .kfpkg.

A .kfpkg can contain multiple .bin files or model files, so program firmware and network model files can be packaged together and burned in one go. It is burned the same way as a .bin file, using K-Flash. Here is how to create and use a kfpkg file:
Create your own .kfpkg file
A .bin file is the firmware itself. When passed to the burning software, it is written to address 0 of the flash by default, and the chip restarts and runs it once burning completes.
Sometimes, however, we need to burn other binary files to flash, such as models, file systems, or other custom data, and then we must specify the burning address. A bare .bin file cannot tell the burning tool where in flash the data should go; the .kfpkg format exists to solve this problem.
kfpkg consists of three parts:
  • flash-list.json text file: used to store the .bin file list and burning addresses, etc.

  • .bin firmware

  • Other files (binary files)

For example, to download a firmware named XXX.bin to address 0 and another file named YYY.bin to flash address 0xA00000, we write a flash-list.json file like the one below. The second address appears as 10485760, the decimal form of 0xA00000, because standard JSON has no hexadecimal number literals:
  {
    "version": "0.1.0",
    "files": [
      {
        "address": 0,
        "bin": "XXX.bin",
        "sha256Prefix": true
      },
      {
        "address": 10485760,
        "bin": "YYY.bin",
        "sha256Prefix": false
      }
    ]
  }

Note the sha256Prefix option: the firmware needs verification, so it is set to true, while other data (such as model data) does not, so it is set to false.

Finally, compress the three files (XXX.bin, YYY.bin, flash-list.json) into a zip file and change the extension to .kfpkg. The burning tool will then recognize the package and burn each file to flash at its specified address.
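The packaging steps are easy to script. Below is a minimal sketch using Python's standard zipfile and json modules; the function name make_kfpkg and the tuple layout of its argument are my own invention, and the manifest fields simply mirror the flash-list.json example above.

```python
import json
import zipfile
from pathlib import Path


def make_kfpkg(output, entries):
    """Pack files into a .kfpkg archive.

    `entries` is a list of (path, flash_address, sha256_prefix) tuples.
    Returns the manifest that was written as flash-list.json.
    """
    manifest = {"version": "0.1.0", "files": []}
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, address, sha256_prefix in entries:
            path = Path(path)
            manifest["files"].append(
                {"address": address, "bin": path.name, "sha256Prefix": sha256_prefix}
            )
            # Store each file at the archive root under its bare file name.
            archive.write(path, arcname=path.name)
        archive.writestr("flash-list.json", json.dumps(manifest, indent=2))
    return manifest


# Example, with the file names used in the text:
# make_kfpkg("demo.kfpkg", [("XXX.bin", 0, True), ("YYY.bin", 0xA00000, False)])
```

A side benefit of generating the manifest this way is that addresses can be written in hexadecimal in the script, and the json module emits them as plain decimal integers in the file.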
2) IDE development environment
If you want to develop on Windows, you can use the official IDE, which is also the development method I use most often.
The official IDE is built on Visual Studio Code and is very user-friendly. Most importantly, it has code auto-completion :D.


2.1 Download IDE

Download it from this address: kendryte-ide.s3-website.cn-northwest-1.amazonaws.com.cn
After installation, the IDE automatically goes online to update its components. Then click through the following steps to download the official examples:
The advantage of the IDE is that it can automate many dependency operations, making operations easier:
Then you can compile and download:


If there are compilation errors, first confirm that you have clicked to install all dependencies, and also remember to click the trash can icon on the left to clean up the project before trying to compile again.

The buttons at the bottom of the IDE correspond to the entries under the Kendryte option in the top menu bar.
3) MicroPython development environment
Both environments above follow a write, compile, download cycle. With this method you download the firmware only once, and can then interact with Python over the serial port or run scripts stored on an SD card at startup.
This script-based interactive firmware builds on an open-source project called MicroPython.
MicroPython is an interpreter based on Python 3 syntax. It covers most of the basic Python 3 syntax and mainly targets embedded chips with limited performance and memory (it first became popular on STM32). Note that MicroPython does not implement all of Python 3.


MaixPy is a project that ports MicroPython to K210. It supports the usual MCU operations and also integrates machine vision and microphone array modules for quickly developing intelligent applications.

The open-source address for the MaixPy project: https://github.com/sipeed/MaixPy
MicroPython makes programming the K210 simpler and quicker. For example, to find the devices on an I2C bus, the following code is enough:
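from machine import I2C

# Bus I2C0 at 100kHz, with SCL on pin 28 and SDA on pin 29
i2c = I2C(I2C.I2C0, freq=100000, scl=28, sda=29)
# scan() returns the list of addresses that responded
devices = i2c.scan()
print(devices)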
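The scan result is a list of plain integers, while datasheets usually give I2C addresses in hexadecimal. A tiny helper makes the output easier to match against a datasheet. This is just a sketch: the function name is mine, and the addresses in the usage line are made-up examples rather than real scan output.

```python
def format_addresses(devices):
    """Render the integer addresses from i2c.scan() as sorted hex strings."""
    return ["0x{:02x}".format(addr) for addr in sorted(devices)]


# On the board you would call: print(format_addresses(i2c.scan()))
print(format_addresses([104, 60]))  # hypothetical addresses
```

The helper uses only basic string formatting, so it runs the same way on a desktop Python as on the board.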

Similarly, to implement a breathing light, we only need the following code:

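from machine import Timer, PWM
# Assumption: recent MaixPy firmware provides board_info in the board module
# (older firmware exposed it through fpioa_manager instead)
from board import board_info
import time

# Timer 0, channel 0 in PWM mode drives the green LED pin
tim = Timer(Timer.TIMER0, Timer.CHANNEL0, mode=Timer.MODE_PWM)
ch = PWM(tim, freq=500000, duty=50, pin=board_info.LED_G)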
duty=0
dir = True
while True:
    if dir:
        duty += 10
    else:
        duty -= 10
    if duty>100:
        duty = 100
        dir = False
    elif duty<0:
        duty = 0
        dir = True
    time.sleep(0.05)
    ch.duty(duty)
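The ramp logic in that loop can also be pulled out into a generator, which keeps the hardware calls separate from the arithmetic and lets you check the sequence on a desktop Python first. This is a sketch of one way to structure it, not the only one; the name breathing_duty and its parameters are my own.

```python
def breathing_duty(step=10, low=0, high=100):
    """Yield duty-cycle values that ramp low -> high -> low, forever."""
    duty = low
    rising = True
    while True:
        yield duty
        if rising:
            duty = min(high, duty + step)
            rising = duty < high   # turn around once the top is reached
        else:
            duty = max(low, duty - step)
            rising = duty <= low   # turn around once the bottom is reached


# On the board, with `ch` and `time` set up as in the script above:
# for duty in breathing_duty():
#     ch.duty(duty)
#     time.sleep(0.05)
```

Because the clamping happens before the value is yielded, the duty cycle never leaves the 0 to 100 range, even if the step does not divide the range evenly.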

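Real-time camera preview on the LCD:

import sensor
import image
import lcd

lcd.init()
sensor.reset()
# 16-bit RGB565 color at QVGA (320x240) resolution
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.run(1)
# Grab frames in a loop and show each one on the LCD
while True:
    img = sensor.snapshot()
    lcd.display(img)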
In short, it is very simple and easy to use. Interested readers can visit the official wiki and tutorial website: maixpy.sipeed.com/zh/
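Going back to the camera preview script, it is worth a quick calculation to see why a QVGA preview fits comfortably in on-chip memory. The sketch below computes the size of one RGB565 frame and compares it with the 6MiB of general SRAM listed in the chip parameters. It ignores whatever the firmware itself uses, so treat it as a rough upper-level estimate rather than a memory budget.

```python
SRAM_GENERAL_BYTES = 6 * 1024 * 1024  # 6MiB general SRAM, from the parameter list


def frame_bytes(width, height, bits_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bits_per_pixel // 8


# QVGA is 320x240 pixels and RGB565 uses 16 bits per pixel.
qvga_rgb565 = frame_bytes(320, 240, 16)
print(qvga_rgb565, "bytes per frame")
print("{:.1f}% of general SRAM".format(100 * qvga_rgb565 / SRAM_GENERAL_BYTES))
```

One frame is 150KiB, a small slice of the general SRAM, which leaves room for several buffers before memory becomes a concern.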
Conclusion
This article covered the ABCs of the K210, from chip architecture, to choosing a development board, to setting up the software development environment. The KPU has many interesting applications, and I will introduce more of them in future articles, including how to use the various modules in the SDK and how to deploy your own AI models on the K210.
Meanwhile, I am also designing a new development board, and that project will be open-sourced as well. Stay tuned~
ZhiHui’s personal website: www.pengzhihui.xyz