Getting Started with AI on Arm Development Boards

Tengine
Tengine is a lightweight neural network inference engine developed by OPEN AI LAB, specifically optimized for Arm embedded platforms, providing excellent support for both Android and Linux systems.

Notably, Tengine does not depend on dedicated AI silicon: it can use modules with dedicated AI acceleration capabilities such as GPUs and NPUs for its computations, but it also runs on general-purpose CPUs. This lets many Arm platforms fully exploit the compute they already have through the Tengine framework and run a wide range of AI applications efficiently.


This article walks through setting up the Tengine AI inference framework on the RK3399 (Arm64) platform and running several image recognition applications.

The RK3399 platform used here is the Leez P710 development board, to which I have ported an Armbian-based Debian 10 system running mainline u-boot and a mainline Linux kernel. For the detailed process, see this article: Deploying the Latest Linux 5.4 and U-Boot v2020.01 on RK3399.

Compiling Tengine
OPEN AI LAB publishes an open-source version of Tengine on GitHub, along with detailed reference documentation, so you can download the source code directly and compile it by following the documentation.

The RK3399 is powerful enough that we can download and compile the code directly on the board, avoiding the hassle of cross-compilation.

1) Download the source code

git clone --recurse-submodules https://github.com/OAID/tengine/
Make sure to include the --recurse-submodules option when cloning, otherwise the download will be incomplete.
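If you are unsure whether the submodules were fetched, a quick check from inside the cloned directory (a plain git command, nothing Tengine-specific) is:

cd tengine
git submodule status

Each submodule should be listed with a commit hash; a leading "-" means it was not initialized, in which case re-clone with --recurse-submodules or run git submodule update --init.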
2) Install dependencies
apt install libprotobuf-dev protobuf-compiler libopencv-dev pkg-config
3) Modify the configuration file
In the default_config directory of the source code, configuration files for arm32, arm64, and x86 platforms are provided.
The RK3399 is Arm64, so the corresponding configuration file is: arm64_linux_native.config
The change to make is enabling the BUILD_SERIALIZER=y option in the configuration file; otherwise, at runtime you may hit the error: Shared library not found: libcaffe-serializer.so: cannot open shared object file: No such file or directory.
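One way to make the change without opening an editor is a sed one-liner. This is only a sketch: it assumes the BUILD_SERIALIZER option already appears in the file (possibly commented out or set to n); if it is absent, simply append the line by hand.

sed -i 's/^#\? *BUILD_SERIALIZER=.*/BUILD_SERIALIZER=y/' default_config/arm64_linux_native.config
grep BUILD_SERIALIZER default_config/arm64_linux_native.config

The grep should now print BUILD_SERIALIZER=y.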

4) Compile

Execute the following command in the root directory of the source code to compile:
./linux_build.sh default_config/arm64_linux_native.config
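When the script finishes, the build artifacts end up under the build directory. A quick sanity check is to list the benchmark binaries that the later steps rely on:

ls build/benchmark/bin/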


5) Download model files

When running these AI applications, you need to load the corresponding model files, which can be downloaded from the cloud storage provided by OPEN AI LAB:
https://pan.baidu.com/s/1Ar9334MPeIV1eq4pM1eI-Q, extraction code is hhgc.
After downloading, place these model files in the models folder under the root directory of the Tengine source code. The full set of model files is quite large, so I only uploaded the ones needed for the tests below.
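For example, with the source tree in /root/rockdev/tengine (the directory used later in this article), the downloaded files go straight into its models subdirectory, and you can confirm they are in place with:

ls /root/rockdev/tengine/models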

6) Run benchmark

After compilation, two benchmark executables are generated in the build/benchmark/bin/ directory; running them directly is a simple way to verify that the build succeeded.
./build/benchmark/bin/bench_sqz
./build/benchmark/bin/bench_mobilenet

Compile and Run Test Demo
The Tengine source tree also includes several nice image recognition demos, which are well suited for testing and for basic, hands-on learning about AI.

The source code for these demos is located in the examples directory. Before compiling, we need to edit the build script linux_build.sh in that directory and set the Tengine path to match the actual setup. For example, I downloaded and compiled the Tengine code in /root/rockdev/tengine, so the script should point there, as sketched below:
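The relevant part of examples/linux_build.sh is simply the variable that points at the Tengine source tree. A minimal sketch, assuming the script uses a TENGINE_DIR-style variable (the exact variable name may differ between Tengine versions, so check the script itself):

TENGINE_DIR=/root/rockdev/tengine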


Then execute the following commands in the examples directory:

mkdir build
cd build
../linux_build.sh
make

After compilation, the main demos include faster_rcnn, lighten_cnn, mobilenet_ssd, mtcnn, ssd, yolov2, and YuFaceDetectNet.
faster_rcnn
Faster R-CNN, proposed in 2015 by Ross Girshick and colleagues as the successor to R-CNN and Fast R-CNN, achieves better overall performance and faster detection.

The Tengine version of the demo performs recognition on the following image:

[Image: input photo used for the faster_rcnn demo]
Running the faster_rcnn executable will generate an image with annotations for the detected objects:
[Image: faster_rcnn detection result with annotated objects]
It can be seen that the dog, bicycle, and car have all been recognized.
YOLO v2
YOLO stands for You Only Look Once, an object detection approach published at CVPR in 2016.

YOLOv2 was published at CVPR in 2017, with the paper titled “YOLO9000: Better, Faster, Stronger”, which received the CVPR 2017 Best Paper Honorable Mention award.

Here we use this model to detect the same image as in the Faster R-CNN demo:

[Image: YOLOv2 detection result on the same photo]
Judging from this image, the accuracy is comparable to Faster R-CNN, but detection is nearly 6 times faster.
SSD
SSD stands for Single Shot MultiBox Detector, a one-stage generic object detection algorithm proposed by Wei Liu and colleagues in 2016.

Here, we use the following image for detection:

[Image: input photo used for the SSD demo]
The running result is as follows:
[Image: SSD detection result]
Unfortunately, the dog was misidentified.
mobilenet_ssd
MobileNet-SSD combines the MobileNet backbone with the SSD detector, making it a better fit for mobile devices.

Using the same image as SSD for detection:

[Image: MobileNet-SSD detection result on the same photo]
It can be seen that the dog was detected correctly this time, although a person in the distance was missed. The detection speed, however, is much faster than SSD's.
YuFaceDetectNet
This is the Tengine implementation of libfacedetection, open-sourced by Professor Shiqi Yu of Shenzhen University, which is claimed to be the fastest face detection library.
MTCNN
MTCNN (Multi-task Cascaded Convolutional Networks) is another face detection scheme, proposed in 2016.

Here we use the same image as YuFaceDetectNet for testing:

[Image: MTCNN face detection result]
The time difference from YuFaceDetectNet is not significant.
Conclusion
I am, in fact, a novice in AI and image recognition; the purpose of this article is to show how to deploy the open-source AI framework Tengine on an Arm development board you have at hand, and that even without a powerful NPU or GPU you can still get hands-on with AI.

If you are interested in AI or image recognition, this may be a great entry point.
