Edge Computer Vision
Why Choose Edge Computing?
Embedded devices are becoming increasingly intelligent, with many machine learning and computer vision tasks being pushed to edge devices. Running AI models on such devices, while challenging, offers numerous advantages:
- Reduced Latency: Processing data on-device eliminates the wait time for transmitting data to the cloud or a central processor.
- Enhanced Privacy Protection: Sensitive data remains on the device, which makes it easier to comply with strict privacy regulations.
- Cost Savings on Bandwidth: Edge processing reduces the need to send large amounts of data to centralized servers.
- Increased Reliability: Systems can operate independently without a network connection.
Why Are External AI Accelerators Needed?
Toradex offers a variety of System on Modules (SoMs), some of which integrate Neural Processing Units (NPUs) capable of handling different AI workloads. For example, the Verdin iMX8M Plus, Verdin iMX95, and Aquila AM69 are equipped with NPUs designed specifically for accelerating edge inference, making them suitable for numerous computer vision and machine learning applications.
While these modules provide robust AI solutions, external AI accelerators such as Hailo-8, EdgeX, MemryX, and Google Coral address two remaining challenges by offering modular, decoupled, and scalable edge AI inference. This brings greater flexibility and makes AI capabilities easier to upgrade over time.
1. Decoupling AI Processing from SoC Vendor Software
One major challenge in running machine learning at the edge is adapting models to specific hardware or runtime libraries. Whether it’s the NXP eiQ platform, TI Edge AI Studio, or ONNX export tools, each has its own AI toolkit and optimization strategies. External AI accelerators separate AI workloads from other hardware, providing a unified runtime environment across multiple hardware platforms.
Example: A computer vision solution developed on an x86 device using the Hailo-8 AI accelerator can be seamlessly migrated to an Aquila AM69 module equipped with Hailo-8 without needing to reconstruct the entire AI stack. This decoupling ensures that migration can be completed with minimal adjustments, significantly shortening time-to-market.
2. Modularity and Scalability
AI applications have dynamic characteristics, and performance requirements may change as workload complexity increases or new features are created. While built-in NPUs can provide solid solutions, they may sometimes struggle to adapt to new scenarios.
Introduction to Hailo
Hailo is an AI processor manufacturer whose products are designed to run advanced machine learning applications at the edge, applicable across various industries and fields such as smart cities, automotive, manufacturing, agriculture, and retail.
We tested the Hailo-8 M.2 module on several Toradex modules. The Hailo-8 M.2 module is an AI accelerator module with 26 TOPS of computing power and a PCIe Gen-3.0 4-lane M-key interface. This M.2 module can be inserted into various Toradex carrier boards for real-time deep neural network inference.
How Hailo Makes Full Use of the Toradex Ecosystem
Offloading Preprocessing and Postprocessing Tasks
Source: https://hailo.ai/blog/customer-case-study-developing-a-high-performance-application-on-an-embedded-edge-ai-device/
A typical computer vision workflow follows a linear pattern. From the moment the camera captures a frame until the application takes action, the image must go through every processing step in order. This means the slowest step sets the pace for the whole pipeline and becomes the bottleneck.
Typically, when comparing machine learning models or hardware, we focus heavily on inference speed, but inference is only one of those steps.
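To make the bottleneck point concrete, here is a minimal sketch that takes per-stage latencies and reports the slowest stage together with the frame rate it allows when stages run concurrently. The function name, stage names, and millisecond values are hypothetical placeholders, not measurements from the case study.

```shell
# Report the slowest pipeline stage and the throughput ceiling it implies.
# Arguments are name=latency_ms pairs with whole-number latencies.
find_bottleneck() {
    local worst_name="" worst_ms=0 pair name ms
    for pair in "$@"; do
        name="${pair%%=*}"   # text before the first '='
        ms="${pair#*=}"      # text after the first '='
        if [ "$ms" -gt "$worst_ms" ]; then
            worst_ms="$ms"
            worst_name="$name"
        fi
    done
    # Nothing to report if no stage with a positive latency was given.
    [ "$worst_ms" -gt 0 ] || return 1
    # With stages running concurrently, the slowest stage caps the frame rate.
    echo "$worst_name ${worst_ms}ms max_fps=$((1000 / worst_ms))"
}
```

For example, `find_bottleneck capture=8 preprocess=25 inference=12 postprocess=10` prints `preprocess 25ms max_fps=40`: speeding up inference alone would not raise the frame rate in that hypothetical case.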
Complete Software Stack
Hailo is a complete AI solution that supports most steps in common machine learning workflows.
- Performance Evaluation
  - TAPPAS is a code repository containing application examples.
  - Model Zoo not only provides benchmark results for some models but also includes pretrained models.
- Model Training
  - Some pretrained models come with a retraining environment.
- Compiler and Runtime Libraries
  - Hailo Dataflow Compiler
  - pyHailoRT and GStreamer plugins
From Toradex’s perspective, this workflow can be complemented by using the Torizon cloud platform.
- Performance Monitoring
  - Identify any issues in advance to ensure system reliability.
- OTA Updates
  - Easily update production devices.
Support for Toradex Module Hardware
Hardware
Supported Hardware Configurations
Series | Module | Carrier Board | Hailo |
Aquila | TI AM69 (1+2 x PCIe 3.0) | Clover (M.2 key B+M) | Hailo-8, Hailo-8L |
Aquila | NXP i.MX 95 (1 x PCIe 3.0) | Clover (M.2 key B+M) | Hailo-8, Hailo-8L |
Verdin | NXP i.MX 95 (1 x PCIe 3.0) | Mallow (M.2 key B) | Hailo-8, Hailo-8L |
Verdin | NXP i.MX 8M Plus (1 x PCIe 3.0) | Mallow (M.2 key B) | Hailo-8, Hailo-8L |
Verdin | NXP i.MX 8M Mini (1 x PCIe 2.0) | Mallow (M.2 key B) | Hailo-8, Hailo-8L |
Apalis | NXP i.MX8 (2 x PCIe 3.0) | Ixora (Mini PCIe) | Hailo-8R mPCIe |
Software
OS | Version | Other Resources |
Torizon OS | BSP 7 | meta-hailo layer (coming soon) |
Torizon OS | BSP 6 | runtime container (coming soon) |
Torizon OS Minimal | BSP 6 | meta-hailo kirkstone, OpenEmbedded layer for GStreamer 1.0 |
tdx-reference-multimedia | BSP 6 | meta-hailo kirkstone |
YOLOv5 Example
In this example, we will run a demo application from TAPPAS. When you finish, you should see detection output running at 60+ FPS (depending on your camera).
We will use:
- Camera
  - If using a USB camera, the frame rate may be very low due to the camera’s capture speed.
- Display
- Verdin i.MX8MP + Mallow Carrier Board
  - Verdin iMX8M Plus QuadLite 1GB IT (0065) is not compatible with Framos cameras.
- Hailo AI Accelerator
Steps:
- Build Torizon OS from source
  - Build the base Torizon OS
  - Add dependencies
- Hardware Setup
  - Connect the Hailo device
  - Connect the camera
- Install the new image
- Check all configurations
- Run the example
Build Torizon OS from Source
Build the Base Torizon OS Image
We will use the CROPS container to build the following image:
Torizon OS Distro | Machine | Torizon OS Image Target | Version |
torizon | verdin-imx8mp | torizon-minimal | 6.8.0 |
Create a working directory
cd ~
mkdir ~/yocto-workdir
Run the container (this will build the base image)
This will consume a lot of memory and take several hours to complete.
The second line of the command maps the host volume to the container’s workdir directory. Note that this folder ~/yocto-workdir was created in the previous step.
# BB_NUMBER_THREADS and PARALLEL_MAKE limit build parallelism; it is unverified
# whether the container honors them when passed as environment variables like this.
docker run --rm -it --name=crops \
    -v ~/yocto-workdir:/workdir \
    --workdir=/workdir \
    -e MACHINE="verdin-imx8mp" \
    -e IMAGE="torizon-minimal" \
    -e DISTRO="torizon" \
    -e BRANCH="refs/tags/6.8.0" \
    -e MANIFEST="torizoncore/default.xml" \
    -e ACCEPT_FSL_EULA="1" \
    -e BB_NUMBER_THREADS="2" \
    -e PARALLEL_MAKE="-j 2" \
    torizon/crops:kirkstone-6.x.y startup-tdx.sh
Add Dependencies to the Image
To add dependencies, first navigate to the ~/yocto-workdir/layers folder.
cd ./layers
We will add the following layers:
- meta-hailo
- meta-gstreamer1.0
- meta-toradex-framos
In the torizon/crops:kirkstone-6.x.y container, run the following bitbake-layers add-layer commands.
bitbake-layers add-layer meta-hailo/meta-hailo-accelerator
bitbake-layers add-layer meta-hailo/meta-hailo-libhailort
bitbake-layers add-layer meta-hailo/meta-hailo-tappas
bitbake-layers add-layer meta-hailo/meta-hailo-vpu
bitbake-layers add-layer meta-toradex-framos
bitbake-layers add-layer meta-gstreamer1.0
In the build-torizon/conf/local.conf file, add packages. Append the following content at the end.
IMAGE_INSTALL:append = " libhailort hailortcli pyhailort libgsthailo hailo-pci hailo-firmware"
IMAGE_INSTALL:append = " gstreamer1.0 gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad"
IMAGE_INSTALL:append = " v4l-utils"
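If you script this step, it helps to make the append idempotent so that re-running the script does not duplicate entries in local.conf. The helper below is a minimal sketch; the function name `append_once` is hypothetical, and the commented path is the build-torizon/conf/local.conf file mentioned above.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    local file="$1" line="$2"
    # -x matches whole lines, -F treats the line as a fixed string, -q is quiet.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (run from the build working directory):
#   append_once build-torizon/conf/local.conf 'IMAGE_INSTALL:append = " v4l-utils"'
```

Calling it twice with the same line leaves a single copy in the file, so the build configuration stays clean across repeated runs.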
Compile the image with the new layers.
bitbake torizon-minimal
You can find the installation image compatible with the Toradex Easy Installer at ~/yocto-workdir/build-torizon/deploy/images/verdin-imx8mp/torizon-minimal-verdin-imx8mp-Tezi_6.8.0-devel-<date>+build.0.tar.
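Because the file name contains a build date, a small helper can pick the most recent tarball after several builds. This is a sketch only: the function name `latest_tezi_image` is hypothetical, and it assumes the file name pattern shown in the path above.

```shell
# Print the most recently modified Tezi tarball in the given directory.
# Returns non-zero if no matching file exists.
latest_tezi_image() {
    local dir="$1" newest="" f
    for f in "$dir"/torizon-minimal-verdin-imx8mp-Tezi_*.tar; do
        [ -e "$f" ] || continue          # skip the unexpanded glob when nothing matches
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Example:
#   latest_tezi_image ~/yocto-workdir/build-torizon/deploy/images/verdin-imx8mp
```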
Hardware Setup
Connect the Hailo Device
Insert the Hailo device into the M.2 slot of the Mallow carrier board.
Connect the Camera
Connect the camera to the MIPI-CSI interface on the Mallow carrier board.
Install the New Torizon OS Image
Use the Toradex Easy Installer (Tezi) to flash the new image onto the device.
- Download Tezi
- Put the device into recovery mode
- Install the newly compiled image
Check Installation Status
Hailo Device
sudo su
hailortcli scan
hailortcli device-info
The output of these commands should show that the device is detected and that the driver is working correctly.
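If the commands above report no device, two lower-level checks can help narrow down the cause: whether the PCIe driver module is loaded and whether a device node exists. The sketch below rests on assumptions that are not stated in this article: that the module installed by the hailo-pci package is named `hailo_pci` and that it creates `/dev/hailo0`. Both function names are hypothetical.

```shell
# Return 0 if the given path is a character device node.
# The default node name /dev/hailo0 is an assumption, not taken from this article.
device_node_present() {
    local node="${1:-/dev/hailo0}"
    [ -c "$node" ]
}

# Return 0 if a kernel module with the given name is listed as loaded.
# The optional second argument only exists so the check can read a different file.
module_loaded() {
    local name="$1" modules_file="${2:-/proc/modules}"
    grep -q "^${name} " "$modules_file"
}

# Example (run on the target; the module name hailo_pci is an assumption):
#   module_loaded hailo_pci && device_node_present && echo "Hailo driver looks ready"
```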
Display
gst-launch-1.0 videotestsrc ! videoconvert ! autovideosink
You should see some colorful patterns on the screen.
Camera Device
This step may vary depending on the camera used.
v4l2-ctl -d2 -D
v4l2-ctl --list-formats-ext -d /dev/video2
For Framos cameras, the output is as follows.
root@verdin-imx8mp-15445736:~# v4l2-ctl --list-formats-ext -d /dev/video2
ioctl: VIDIOC_ENUM_FMT
    Type: Video Capture

    [0]: 'YUYV' (YUYV 4:2:2)
        Size: Stepwise 176x144 - 4096x3072 with step 16/8
    [1]: 'NV12' (Y/CbCr 4:2:0)
        Size: Stepwise 176x144 - 4096x3072 with step 16/8
    [2]: 'NV16' (Y/CbCr 4:2:2)
        Size: Stepwise 176x144 - 4096x3072 with step 16/8
    [3]: 'RG12' (12-bit Bayer RGRG/GBGB)
        Size: Stepwise 176x144 - 4096x3072 with step 16/8
The demo uses the YUYV format, so note the sizes your camera reports for it. To confirm that the camera delivers frames to the display, run:
gst-launch-1.0 -v v4l2src device=/dev/video2 ! video/x-raw ! videoconvert ! autovideosink
Run the Example
Some cameras only support specific resolutions and frame rates, so these values may need to be adjusted to match your camera. This can be done by modifying the framerate value in the PIPELINE variable.
sudo su
cd ~/apps/detection/
./detection.sh
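Before editing the framerate in PIPELINE, it can be useful to confirm that the camera accepts a given format, size, and rate in a plain capture pipeline. The helper below is a sketch that builds a GStreamer caps string; the function name `build_caps` and the 1280x720 at 30 FPS values are hypothetical, and the contents of PIPELINE in detection.sh are not shown in this article, so adapt the result to what you find there. Note that GStreamer calls the V4L2 YUYV format `YUY2`.

```shell
# Build a raw-video caps string for a GStreamer pipeline.
# Arguments: format, width, height, frames per second (whole number).
build_caps() {
    local format="$1" width="$2" height="$3" fps="$4"
    echo "video/x-raw,format=${format},width=${width},height=${height},framerate=${fps}/1"
}

# Example (run on the target; the size and rate are placeholders):
#   gst-launch-1.0 -v v4l2src device=/dev/video2 ! "$(build_caps YUY2 1280 720 30)" ! videoconvert ! autovideosink
```

If the pipeline fails to negotiate, the camera most likely does not offer that combination, and another size or rate from the v4l2-ctl output should be tried.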
Completion
Next Steps: Pairing the Device to Torizon Cloud
In future blog posts, we will cover the following topics:
- Monitoring the device using device-related metrics from Torizon Cloud.
- Retraining models using Hailo environment containers.
- Using Torizon remote updates to change the running model version.
Why Choose Toradex?
Toradex has over 21 years of excellence in the embedded industry, providing a rich combination of computer modules (SoMs) and carrier boards to help businesses build scalable, high-performance embedded applications.
Quality and Reliability
Toradex hardware is designed for durability, ensuring stable operation even in harsh industrial environments. Using high-quality components and undergoing rigorous testing, it minimizes downtime for critical applications.
Software Ecosystem
- Torizon OS: A user-friendly industrial Linux distribution based on Yocto.
- Torizon Cloud: Secure OTA updates, device monitoring, and remote access features.
- Torizon IDE: Development, debugging, and deployment through VS Code plugins.
Product Lifecycle
Toradex commits to a product supply period of over 10 years, so products remain available and supported throughout a long deployment.
Developer Resources
Simplifying development means accelerating deployment. Toradex provides a wealth of developer resources.
- Comprehensive documentation.
- Free support channels from the community and Toradex experts.
- Development tools such as TCB, Tezi, and Torizon containers.