1. Introduction
Object detection is one of the fundamental problems in computer vision. Its goal is to determine, over a large set of predefined categories, whether instances of those categories appear in a natural image and, if so, where: given an image, is a predefined category present, and what is its spatial location and extent? As a cornerstone of image understanding and computer vision, object detection is the basis for solving higher-level visual tasks such as segmentation, scene understanding, object tracking, image description, event detection, and activity recognition.
(from the paper “Deep Learning for Generic Object Detection: A Survey”)
Monocular distance measurement refers to estimating the distance between a target object and the camera using a single camera. From the previous article “Camera Calibration and Distortion Correction – Taking Raspberry Pi Wide Angle as an Example”, we already know that the image output by the camera is the projection of three-dimensional objects in the real world, after a certain rotation and translation, onto the camera's imaging plane. Because imaging is a 3D-to-2D process, distance cannot be read off directly. However, the distance between the target object and the camera's imaging plane can be estimated from the bounding box output by object detection together with the camera's intrinsic parameters and installation height.
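For reference, the projection just described is the standard pinhole camera model: a world point maps to a pixel via

$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \,[R \mid t] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} $$

where K is the camera's intrinsic matrix and [R | t] its rotation and translation. The depth factor s is lost in the projection, which is exactly why additional knowledge (here, the camera's installation height) is needed to recover distance.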
2. Object Detection on Raspberry Pi
Due to its limited processing power, the Raspberry Pi cannot run heavy object detection methods like YOLO at a high frame rate. By chance, I came across the MobileNet model based on the TensorFlow-Lite framework on the Qengineering website; for details, see the paper “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications”. After testing, it runs on both the Ubuntu and Buster versions of the Raspberry Pi system, at roughly 10 fps (32-bit OS) and 18 fps (64-bit OS) without overclocking. With object detection we can obtain the bounding box of the measured object, and based on the bounding box we can measure its distance with the camera.
To run the TensorFlow-Lite version of the MobileNet model, you need to install the image processing library (OpenCV) and TensorFlow-Lite (TensorFlow 2.4) on the Raspberry Pi. Next, taking the Ubuntu (64-bit) image for Raspberry Pi 4B that we provided earlier as an example, let's walk through setting up the environment.
1. Install OpenCV 4.5.2
Because compiling OpenCV requires a considerable amount of memory, you first need to set a larger swap space on your Raspberry Pi; for the method, see the document “Avoid Raspberry Pi Crash, Expand Swap Space.pdf”. The swap space plus the Raspberry Pi's physical memory should ideally total more than 4 GB.
https://github.com/COONEO/ROBOTMAKER_Tutorials/tree/main/Raspberry-Melodic--PDF/
Install the dependencies required for compilation:
sudo apt-get update
sudo apt-get install build-essential cmake git unzip pkg-config
sudo apt-get install libjpeg-dev libpng-dev
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev
sudo apt-get install libgtk2.0-dev libcanberra-gtk* libgtk-3-dev
sudo apt-get install libxvidcore-dev libx264-dev
sudo apt-get install python3-dev python3-numpy python3-pip
sudo apt-get install libtbb2 libtbb-dev libdc1394-22-dev
sudo apt-get install libv4l-dev v4l-utils
sudo apt-get install libopenblas-dev libatlas-base-dev libblas-dev
sudo apt-get install liblapack-dev gfortran libhdf5-dev
sudo apt-get install libprotobuf-dev libgoogle-glog-dev libgflags-dev
sudo apt-get install protobuf-compiler
sudo apt-get install qt5-default
Download the OpenCV and OpenCV_contrib source packages and compile them:
# Open a terminal and enter the following commands
cd ~
# Download files
wget -O opencv.zip https://github.com/opencv/opencv/archive/4.5.2.zip
wget -O opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/4.5.2.zip
# Unzip files
unzip opencv.zip
unzip opencv_contrib.zip
# Rename the unzipped folder
mv opencv-4.5.2 opencv
mv opencv_contrib-4.5.2 opencv_contrib
# Remove downloaded zip files
rm opencv.zip
rm opencv_contrib.zip
# Start compiling
cd ~/opencv
mkdir build
cd build
# This command takes some time and will download some libraries, please be patient
cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
-D ENABLE_NEON=ON \
-D WITH_OPENMP=ON \
-D WITH_OPENCL=OFF \
-D BUILD_TIFF=ON \
-D WITH_FFMPEG=ON \
-D WITH_TBB=ON \
-D BUILD_TBB=ON \
-D WITH_GSTREAMER=OFF \
-D BUILD_TESTS=OFF \
-D WITH_EIGEN=OFF \
-D WITH_V4L=ON \
-D WITH_LIBV4L=ON \
-D WITH_VTK=OFF \
-D WITH_QT=OFF \
-D OPENCV_ENABLE_NONFREE=ON \
-D INSTALL_C_EXAMPLES=OFF \
-D INSTALL_PYTHON_EXAMPLES=OFF \
-D BUILD_opencv_python3=OFF \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D BUILD_EXAMPLES=OFF ..
When the terminal shows “Configuring done” and “Generating done”, it means cmake succeeded, and you can start the actual compilation.
Continue entering the compilation command:
# Still in opencv/build/ directory
make -j4 # Raspberry Pi is quad-core, use all resources for compilation
Wait 1-2 hours, and after the compilation is complete, install the compiled OpenCV into the system directory:
# Still in opencv/build/ directory
sudo make install
sudo ldconfig
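To verify the installation, you can optionally compile a minimal test program (the file name check_opencv.cpp is just an example; the pkg-config build line relies on the OPENCV_GENERATE_PKGCONFIG=ON flag set above):
// check_opencv.cpp - a minimal sanity check that OpenCV is installed
// Build: g++ check_opencv.cpp -o check_opencv $(pkg-config --cflags --libs opencv4)
#include <opencv2/core.hpp>
#include <iostream>

int main() {
    std::cout << "OpenCV version: " << CV_VERSION << std::endl; // should print 4.5.2
    return 0;
}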
2. Install TensorFlow-Lite
Similarly, you need to download the corresponding version of the TensorFlow source code and compile it to generate the library files needed later. The steps are as follows:
# Install the tools required for compilation
sudo apt-get install cmake curl
# Download TensorFlow version (2.4.0)
wget -O tensorflow.zip https://github.com/tensorflow/tensorflow/archive/v2.4.0.zip
# Unzip and rename the file
unzip tensorflow.zip
mv tensorflow-2.4.0 tensorflow
# Enter the source directory, download the dependencies required for compilation
cd tensorflow
./tensorflow/lite/tools/make/download_dependencies.sh
# Run the script to generate the C++ files required for runtime
./tensorflow/lite/tools/make/build_rpi_lib.sh
Note that when running the dependency download script “download_dependencies.sh”, the script downloads the dependency files into the “../tensorflow/lite/tools/make/downloads/” folder.
The reason for mentioning this is that these downloads easily fail on the Raspberry Pi. To make things easier for everyone, I have prepared all of these files; the method to obtain them is at the end of the article. After obtaining them, simply paste the packages directly into the folder mentioned above.
Compile and install flatbuffers:
# Create a build folder and compile
cd ~/tensorflow/tensorflow/lite/tools/make/downloads/flatbuffers
mkdir build
cd build
cmake ..
make -j4
# Install
sudo make install
sudo ldconfig
After the above steps, the related files are located as follows:
- In the …/tensorflow/lite/ path, you will see the corresponding header files (.h);
- In the …/tensorflow/lite/tools/make/gen/linux_aarch64/lib/ path, you will see the corresponding libtensorflow-lite.a file;
- In /usr/local/lib, you will see the libflatbuffers.a file;
- In /usr/local/include, you will see the flatbuffers folder.
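As an optional check that the headers and the static library are usable, a tiny program like the following should compile and print the version (the include and library paths below assume the directory layout listed above; adjust ~/tensorflow to wherever you unpacked the source):
// tflite_check.cpp - verifies the TensorFlow-Lite build is linkable (example file name)
// Build (single command):
//   g++ tflite_check.cpp -I ~/tensorflow \
//       -I ~/tensorflow/tensorflow/lite/tools/make/downloads/flatbuffers/include \
//       ~/tensorflow/tensorflow/lite/tools/make/gen/linux_aarch64/lib/libtensorflow-lite.a \
//       -lpthread -ldl -o tflite_check
#include "tensorflow/lite/version.h"
#include <iostream>

int main() {
    std::cout << "TensorFlow Lite version: " << TFLITE_VERSION_STRING << std::endl; // expect 2.4.0
    return 0;
}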
3. Install Code::Blocks
To facilitate debugging, we will first use the IDE approach so that everyone can easily compile the program; later, we will rewrite it as a ROS package.
sudo apt-get install codeblocks
At this point, you have an environment that can achieve the following effect!!!
(from Qengineering)
3. Distance Measurement with Raspberry Pi Camera
1. Principle of Monocular Distance Measurement
Let me show you a picture first, brothers and sisters!!!
Seeing this, I believe everyone has already discovered the purpose of implementing object detection: once we have the object detection function, we know the pixel row (P_bottom) of the bottom edge of the target object on the imaging plane; by calibrating and correcting the camera, we can solve for the camera's focal length (f) and the y pixel coordinate of the optical center (P_center). Since the installation height of the camera is known, we can then calculate the distance (Dis) between the target object and the camera, as follows:
$$ \mathrm{Dis} = \frac{H \cdot f}{\left| P_{bottom} - P_{center} \right|} $$

where $H$ is the camera's installation height; this is exactly the form implemented in Calculate_object_distance() below.
Here, the focal length f comes from camera calibration: in the article “Camera Calibration and Distortion Correction – Taking Raspberry Pi Wide Angle as an Example”, we obtained the intrinsic parameter matrix

$$ K = \begin{bmatrix} f_x & 0 & C_x \\ 0 & f_y & C_y \\ 0 & 0 & 1 \end{bmatrix} $$

where the focal lengths are (f_x, f_y) and the optical-center pixel coordinates are (C_x, C_y). According to the schematic above, the calculation is carried out along the Y axis, so P_center = C_y; for the focal length, the code below uses the average of the two calibrated values:

$$ f = \frac{f_x + f_y}{2}, \qquad P_{center} = C_y $$
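As a quick sanity check with the calibration values used in the code below, f = (382.258 + 379.579) / 2 ≈ 380.92. Assuming an installation height of H = 0.20 m (an example value only; you must measure the mounting height of your own camera) and a detected bottom edge at P_bottom = 400 px:

$$ \mathrm{Dis} = \frac{0.20 \times 380.92}{\left| 400 - 228.14 \right|} \approx 0.44\ \mathrm{m} $$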
2. Implementing Monocular Distance Measurement
First, we have prepared a Code::Blocks project that runs monocular detection and distance measurement on the Raspberry Pi. Those who do not want to read the code walkthrough below can also study the project on their own; the code link is at the end of the article.
Flowchart of the program:
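(The flowchart image is not reproduced here. In outline, based on the code sections that follow, the program: initializes the camera intrinsic and distortion parameters → loads the tflite model and the label file → grabs a frame from the camera → undistorts it → runs object detection → computes the distance from each bounding box and draws the results.)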
3. Explanation of Important Code Sections
Initializing the camera intrinsic parameter matrix and distortion matrix:
Mat_<double> cameraMatrix(3,3);
Mat_<double> distCoeffs(1,5);
static double focal_camera; // camera focal length f, averaged from fx and fy below
// Initializing intrinsic parameter matrix parameters
cameraMatrix(0,0) = 382.25802374; // fx
cameraMatrix(0,1) = 0;
cameraMatrix(0,2) = 317.55530562; // Cx
cameraMatrix(1,0) = 0;
cameraMatrix(1,1) = 379.57912433; // fy
cameraMatrix(1,2) = 228.13804584; // Cy
cameraMatrix(2,0) = 0;
cameraMatrix(2,1) = 0;
cameraMatrix(2,2) = 1;
// Assigning value to focal length f
focal_camera = ( cameraMatrix(1,1) + cameraMatrix(0,0) ) / 2.0;
// Initializing distortion matrix parameters (OpenCV order: k1, k2, p1, p2, k3)
distCoeffs(0 , 0) = -0.32621363 ;
distCoeffs(0 , 1) = 0.14117533 ;
distCoeffs(0 , 2) = -0.00088709 ;
distCoeffs(0 , 3) = 0.00128622 ;
distCoeffs(0 , 4) = -0.03548023 ;
Loading tflite model and category file:
// Load model
std::unique_ptr<tflite::FlatBufferModel> model = tflite::FlatBufferModel::BuildFromFile("detect.tflite");
// Build the interpreter
std::unique_ptr<tflite::Interpreter> interpreter;
tflite::ops::builtin::BuiltinOpResolver resolver;
tflite::InterpreterBuilder(*model.get(), resolver)(&interpreter);
interpreter->AllocateTensors();
// Load the pre-trained model's category file
bool result = getFileContent("COCO_labels.txt");
if(!result){ cout << "loading labels failed"; exit(-1);}
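getFileContent() is a small helper from the project; a minimal sketch of what it needs to do (read one class name per line of COCO_labels.txt into the global Labels vector used when drawing results) could look like this:
#include <fstream>
#include <string>
#include <vector>

std::vector<std::string> Labels; // filled once at startup, indexed by detected class id

// Read one label per line; returns false if the file cannot be opened
bool getFileContent(const std::string &fileName) {
    std::ifstream in(fileName);
    if (!in.is_open()) return false;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) Labels.push_back(line);
    }
    return true;
}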
Getting image frames and correcting camera distortion through OpenCV:
VideoCapture cap(0);
cap >> frame; // frame is a Mat type object
if(frame.empty()){ cerr << " End of movie " << endl; break;}
// Calibration and image undistortion, resulting image is: undistort_frame (of Mat type)
undistort(frame, undistort_frame, cameraMatrix, distCoeffs);
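One optional optimization: undistort() recomputes its rectification maps on every call. Since the camera parameters never change at runtime, you can precompute the maps once with initUndistortRectifyMap() and then apply the cheaper remap() per frame. This is our suggestion, not part of the original project code:
// Once, before the capture loop (use your actual capture resolution):
Mat map1, map2;
initUndistortRectifyMap(cameraMatrix, distCoeffs, Mat(), cameraMatrix,
                        Size(640, 480), CV_16SC2, map1, map2);
// Then, for every frame:
remap(frame, undistort_frame, map1, map2, INTER_LINEAR);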
Object detection function:
void detect_from_video(Mat &src){
    Mat image;
    ...
    // Resize the image to the size used during model training
    cv::resize(src, image, Size(width, height));
    memcpy(interpreter->typed_input_tensor<uchar>(0), image.data, image.total() * image.elemSize());
    ...
    const float confidence_threshold = 0.48;
    // Probability filtering
    for(int i = 0; i < num_detections; i++){
        if(detection_scores[i] > confidence_threshold){
            int det_index = (int)detection_classes[i] + 1;
            float y1 = detection_locations[4*i    ] * cam_height;
            float x1 = detection_locations[4*i + 1] * cam_width;
            float y2 = detection_locations[4*i + 2] * cam_height;
            float x2 = detection_locations[4*i + 3] * cam_width;
            // Box the target object
            Rect rec((int)x1, (int)y1, (int)(x2 - x1), (int)(y2 - y1));
            rectangle(src, rec, Scalar(0, 0, 255), 1, 8, 0);
            // Draw a line from the bottom center of the image (640x480) to the target object
            line(src, Point(320, 480), Point(x1 + (x2 - x1)/2.0, y2), Scalar(0, 0, 255), 1);
            // Mark the predicted category name above the bounding box
            putText(src, format("%s", Labels[det_index].c_str()), Point(x1, y1 - 5),
                    FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0), 1, 8, 0);
            // Mark the calculated distance from the target object to the camera
            putText(src, format("%0.2f m", Calculate_object_distance(y2)),
                    Point(x1 + (x2 - x1)/2.0, y2 + 10),
                    FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0), 1, 8, 0);
        }
    }
}
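For completeness: the detection_locations, detection_classes, detection_scores, and num_detections values used above come from the model's output tensors. For the standard SSD-MobileNet detection post-process they are typically laid out as below; verify the indices against your own detect.tflite:
// Output tensor layout of the SSD detection post-process (typical ordering)
const float *detection_locations = interpreter->typed_output_tensor<float>(0); // boxes [y1,x1,y2,x2], normalized
const float *detection_classes   = interpreter->typed_output_tensor<float>(1); // class indices
const float *detection_scores    = interpreter->typed_output_tensor<float>(2); // confidences
const int    num_detections = (int)*interpreter->typed_output_tensor<float>(3); // valid detection count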
Monocular distance measurement function:
// Calculate the distance from the object to the camera (in meters)
double Calculate_object_distance(double Pix_Bottom){
    double Distance;
    // Corresponds to the distance measurement formula above
    Distance = (Camera_Pose_Height * focal_camera) / fabs(Pix_Bottom - cameraMatrix(1,2));
    return Distance;
}
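One caveat worth noting: as Pix_Bottom approaches the optical-center row Cy (that is, as the object's bottom edge nears the horizon), the denominator tends to zero and the estimate diverges. A defensive variant (our suggestion, not part of the original project) might reject such detections:
// Reject detections whose bottom edge is too close to the horizon row
double Calculate_object_distance_safe(double Pix_Bottom) {
    double denom = fabs(Pix_Bottom - cameraMatrix(1,2));
    if (denom < 1.0) return -1.0; // within ~1 px of the optical-center row: unreliable
    return (Camera_Pose_Height * focal_camera) / denom;
}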
Demonstration of Effects:
4. Outlook and Easter Eggs
Outlook:
In the upcoming articles, we will continue to share experiences and pitfalls encountered during the robot development process. Specifically involving: ROS, Gazebo, Nvidia Jetson, Raspberry Pi, Arduino, Ubuntu, Webots, multi-line laser radar, monocular cameras, etc.
We have prepared a complete program and examples for object detection and distance measurement for you. Welcome to join our Robomaker group chat; after joining, send “Object Detection and Distance Measurement” to obtain the prepared image. You can also download it from our repository:
https://github.com/COONEO/neor_mini.git # Melodic branch
If you find our articles helpful, please give our neor_mini repository a star.
Easter Egg:
We have established a new group chat for friends who love robot development, making it easier for everyone to learn, share, and exchange ideas about creating intelligent robots, and meet more like-minded partners. There will also be periodic community exclusive benefits! Follow our public account and send “Join Group” in the dialog box to get the method to join the group. Let’s make robot/unmanned driving development more efficient!
Creating content is not easy. If you like this article, please share it with your friends to spread the joy of creation and encourage us to create more robot development guides for everyone. Let's learn by doing together!
———
References:
- OpenCV Camera Correction and Distortion API Introduction
- Qengineering Official Website