Design of a Fatigue Monitoring System Based on Deep Learning

The deep-learning-based driver fatigue monitoring system detects the driver's behavioral characteristics with the YOLOv5 object detection algorithm and then applies the PERCLOS fatigue criterion to decide whether the driver is in a dangerous state of fatigue. The overall design is divided into hardware and software. The hardware uses a Jetson Nano as the central computing platform, an Intel RealSense D435i as the image sensor that captures the driver's facial data, and a buzzer for early warning. The software ports the improved YOLOv5 model to this platform, where it processes the captured images to determine whether the driver is fatigued. A graphical interface built with Qt Designer displays the monitored images and results in real time to reinforce the warning. The overall design of the fatigue driving monitoring system is shown in Figure 1.
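As a rough illustration of the PERCLOS step, the sketch below treats PERCLOS as the fraction of recent frames in which the detector reports closed eyes. The class name, the window length, and the 0.4 threshold are assumptions chosen for the example; the article does not state the values it uses.

```python
from collections import deque


class PerclosMonitor:
    """Sliding-window PERCLOS: fraction of recent frames with eyes closed."""

    def __init__(self, window_frames=90, threshold=0.4):
        # window_frames and threshold are illustrative assumptions,
        # not values taken from the article.
        self.window = deque(maxlen=window_frames)
        self.threshold = threshold

    def perclos(self):
        """Return the closed-eye fraction over the frames seen so far."""
        if not self.window:
            return 0.0
        return sum(self.window) / len(self.window)

    def update(self, eyes_closed):
        """Record one frame's eye state and return the current PERCLOS."""
        self.window.append(bool(eyes_closed))
        return self.perclos()

    def is_fatigued(self):
        """True only once the window is full and PERCLOS exceeds the threshold."""
        if len(self.window) < self.window.maxlen:
            return False
        return self.perclos() > self.threshold


# Per-frame eye states as a detector might report them (1 = closed).
monitor = PerclosMonitor(window_frames=10, threshold=0.4)
for closed in [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]:
    value = monitor.update(closed)
print(value, monitor.is_fatigued())
```

Waiting for a full window before raising an alarm avoids a false warning from a single blink in the first few frames.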

1. Hardware Design

The monitoring system first collects the driver's facial data through the camera and passes it to the intelligent computing platform. When the platform determines that the driver is fatigued, it sends a warning signal to the buzzer, which alerts the driver to the dangerous driving behavior. The hardware design process is shown in Figure 2.
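The camera-to-buzzer flow described above can be sketched as a simple loop. Everything here is a hypothetical stand-in: `detect_fatigue` represents the YOLOv5 and PERCLOS stage, and `set_buzzer` represents whatever drives the buzzer pin on the Jetson Nano. Neither name comes from the article.

```python
def run_monitor(frames, detect_fatigue, set_buzzer):
    """Feed each frame to the detector and switch the buzzer on state changes.

    frames:         iterable of images from the camera
    detect_fatigue: callable(frame) -> bool (hypothetical detector stage)
    set_buzzer:     callable(bool) (hypothetical buzzer driver)
    Returns the number of times the alarm was switched on.
    """
    alarm_on = False
    alarm_count = 0
    for frame in frames:
        fatigued = bool(detect_fatigue(frame))
        if fatigued != alarm_on:
            # Only touch the hardware when the state actually changes.
            set_buzzer(fatigued)
            alarm_on = fatigued
            if fatigued:
                alarm_count += 1
    if alarm_on:
        # Leave the buzzer silent when the video stream ends.
        set_buzzer(False)
    return alarm_count


# Dry run with fake frames and a list that records buzzer commands.
buzzer_log = []
count = run_monitor(
    ["awake", "awake", "drowsy", "drowsy", "awake"],
    detect_fatigue=lambda frame: frame == "drowsy",
    set_buzzer=buzzer_log.append,
)
print(count, buzzer_log)
```

Passing the detector and the buzzer driver in as callables keeps the loop testable on a desktop machine without a camera or GPIO hardware attached.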

2. System Development Platform

The driver fatigue monitoring platform needs to be installed in the center-console area to the front left of the driving position, or directly in front of the driver's seat as fitted by the vehicle manufacturer. Given these requirements for vehicle-mounted equipment, the platform designed in this paper must be small, cover a large detection range, and detect quickly in real time. Based on these requirements, this paper chose the NVIDIA Jetson Nano as the research platform. In Figure 3, (a) shows its front view and (b) shows the circuitry on its back.

The NVIDIA Jetson Nano is a low-cost, small, high-performance computing platform for deep learning. It supports a range of deep learning frameworks, including TensorFlow, PyTorch, Caffe, Caffe2, Keras, and MXNet. The JetPack 4.2 SDK provides the runtime environment for the Jetson Nano and supports libraries such as cuDNN 7.3 and TensorRT. Because it is compatible with the commonly used deep learning frameworks, models are easy to deploy on the Jetson Nano. Its CUDA architecture can be used for computer vision, accelerating the model and enabling real-time detection, inference, and classification.

The board has a rich set of peripherals. It uses a quad-core 64-bit ARM CPU and an integrated 128-core NVIDIA GPU, providing 472 GFLOPS of computing performance for embedded designers and researchers. It is equipped with 4 GB of LPDDR4 memory in an efficient, low-power package, with two power modes of 5 W and 10 W. Its interfaces include four high-speed USB 3.0 ports, a MIPI CSI-2 camera connector, HDMI 2.0, DisplayPort 1.3, an M.2 Key-E module slot, a MicroSD card slot, a 40-pin GPIO header, and Gigabit Ethernet. The parameter configuration of the Jetson Nano development kit is shown in Table 1.
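The 472 GFLOPS figure can be reproduced with a short calculation. The core count comes from the text; the 921.6 MHz GPU clock and the reading of the figure as half-precision (FP16) throughput are assumptions added here, since the article states neither.

```python
CUDA_CORES = 128         # from the article
GPU_CLOCK_HZ = 921.6e6   # assumption: maximum GPU clock of the Jetson Nano
FLOPS_PER_FMA = 2        # one fused multiply-add counts as two operations
FP16_PER_CORE = 2        # assumption: two half-precision values per core per cycle


def peak_gflops(cores, clock_hz, ops_per_cycle):
    """Theoretical peak throughput in GFLOPS."""
    return cores * clock_hz * ops_per_cycle / 1e9


fp16 = peak_gflops(CUDA_CORES, GPU_CLOCK_HZ, FLOPS_PER_FMA * FP16_PER_CORE)
fp32 = peak_gflops(CUDA_CORES, GPU_CLOCK_HZ, FLOPS_PER_FMA)
print(f"FP16 peak: {fp16:.0f} GFLOPS, FP32 peak: {fp32:.0f} GFLOPS")
```

Under these assumptions the headline number is a theoretical FP16 peak; a model running in single precision would see roughly half of it at best.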

3. Facial Image Acquisition Sensor

Because the fatigue monitoring platform places certain requirements on the pixel quality of the captured facial images and video streams, the Intel RealSense D435i camera was selected after research and comparison, as shown in Figure 4. The RealSense D435i offers significant advantages in high-speed data collection and information processing; in this system it is used under Ubuntu, with the processing algorithms written in Python. It is equipped with a 2-megapixel RGB camera and a 3D sensor, with the parameter configuration shown in Table 2. Its depth imagers use a global shutter, which can capture small changes in the state of the target, and the camera can operate both indoors and outdoors. The depth field of view is 85° × 58°.

The camera uses a central infrared dot-matrix projector to project a static infrared pattern that adds texture detail to low-texture scenes, helping the monitoring system meet its detection accuracy requirements under actual road conditions. The infrared projector meets Class 1 laser safety requirements under normal operating conditions. A RealSense D4 vision processor, built on a 28-nanometer process, handles the image data from the D435i stereo camera in real time and outputs full HD RGB data to the monitoring system. In addition, a discrete image signal processor scales and otherwise adjusts the images from the color sensor, compensating for inaccuracies in the lens and sensor to improve image quality; the processed color images are then passed to the D4 vision processor. To measure the distance to objects within the field of view (FOV), the camera projects a pattern of infrared (IR) dots and records it with the left and right infrared cameras.
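The distance measurement with the left and right infrared cameras rests on stereo triangulation: a point that appears shifted by d pixels between the two images lies at depth Z = f · B / d, where f is the focal length in pixels and B is the spacing between the cameras. The sketch below shows only this geometry, not the D4 processor's actual matching pipeline. The 85° horizontal field of view is taken from the text, while the 848-pixel image width and the 50 mm baseline are assumed values used for illustration.

```python
import math


def focal_length_px(image_width_px, horizontal_fov_deg):
    """Focal length in pixels for an ideal pinhole camera."""
    half_fov = math.radians(horizontal_fov_deg) / 2
    return (image_width_px / 2) / math.tan(half_fov)


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres from the pixel shift between left and right images."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


IMAGE_WIDTH_PX = 848   # assumption: one possible depth stream width
HFOV_DEG = 85.0        # horizontal field of view quoted in the article
BASELINE_M = 0.050     # assumption: nominal spacing of the two IR imagers

f_px = focal_length_px(IMAGE_WIDTH_PX, HFOV_DEG)
for disparity in (20, 40, 80):
    z = depth_from_disparity(disparity, f_px, BASELINE_M)
    print(f"disparity {disparity:3d} px -> depth {z:.2f} m")
```

Because depth is inversely proportional to disparity, nearby objects such as a driver's face produce large, easily measured shifts, which is why the projected dot pattern matters most on smooth, low-texture surfaces.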

The front of the RealSense D435i stereo camera has four round openings. From left to right, the first and third are the infrared sensors, the second is the infrared laser emitter, and the fourth is the color camera, as shown in Figure 5.

The official SDK includes a graphical tool, Intel RealSense Viewer. Once installed, it can be used to configure the camera, modify advanced controls, and adjust post-processing settings. It can also display the captured facial images, visualize point clouds and depth maps, and record and replay video.

WeChat public account: Artificial Intelligence Perception Information Processing Algorithm Research Institute

Zhihu homepage: https://www.zhihu.com/people/zhuimeng2080
