Author | Aimee.
Produced by | Yanzhi
Most current mass-produced autonomous driving systems are designed around many radars and few cameras, mainly because of the limited computing power of the autonomous driving chip. Relying on radar alone, however, often cannot meet performance requirements: certain target types in the road environment (e.g., vehicle classes, lane markings) and fine-grained environmental details (e.g., traffic signs, small in-lane objects) can only be detected effectively with cameras. In the next-generation autonomous driving system, the central domain controller is designed so that chip computing power, bandwidth, and the overall computation pipeline fully cover the sensing workload; the goal is full camera coverage that delivers comprehensive, unobstructed perception of the environment, and the corresponding camera mounting scheme must be designed just as thoroughly. This article details the camera sensor mounting design, the software information-processing scheme, and the performance-indicator design for the next-generation autonomous driving system, aiming to give readers a thorough explanation of the camera scheme.
Background Overview
From a hardware architecture perspective, most of the currently announced mass-produced autonomous driving system schemes adopt a 5R1V1D architecture (with the exception of Tesla's camera-centric deployments): forward detection relies on a single camera plus a millimeter-wave radar, mainly detecting vehicles and lane markings, while lateral detection relies on 4 millimeter-wave radars, mainly detecting vehicle targets. This mounting method is prone to failure in several extreme scenarios:
1. Forward Target Detection:
Because forward detection relies on a single camera with a fixed image output, resolution and field of view must be traded off against each other: when the image resolution is too low, the detection distance is insufficient and distant small targets cannot be detected early enough; conversely, when a high-resolution, narrow lens is used to extend range, the field of view (FOV) is too small to react in time to targets suddenly cutting into the lane.
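To make this trade-off concrete, the sketch below (illustrative values only, not figures from this article) estimates how many horizontal pixels a 1.8 m-wide vehicle occupies at a given distance for a wide-FOV and a narrow-FOV camera sharing the same sensor width, using the pinhole approximation focal_px = width_px / (2·tan(FOV/2)).

```python
import math

def pixels_on_target(fov_deg: float, width_px: int,
                     target_width_m: float, distance_m: float) -> float:
    """Approximate horizontal pixels covered by a target, pinhole model."""
    focal_px = width_px / (2.0 * math.tan(math.radians(fov_deg) / 2.0))
    return focal_px * target_width_m / distance_m

# Hypothetical 1920-px-wide sensor shared by both lenses.
for fov in (120.0, 30.0):
    for dist in (50.0, 150.0, 300.0):
        px = pixels_on_target(fov, 1920, 1.8, dist)
        print(f"FOV {fov:5.1f} deg, {dist:5.0f} m -> {px:6.1f} px on a 1.8 m car")
```

At the same sensor width, the 120° lens leaves only a handful of pixels on a car at 150 m, while the 30° lens keeps the target resolvable out to 300 m but sees only a narrow slice of the road.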
2. Lateral Target Detection:
In lateral detection, the current millimeter-wave radar technology is mainly applied in automatic lane changes, and the principle is quite simple, primarily relying on the reflection points of vehicle targets. Here we need to clarify that lateral radars can typically only detect vehicle targets in adjacent lanes, and have almost no capability to detect vehicle targets in the third lane, which poses significant potential dangers for autonomous driving. For example, when the vehicle changes from its current lane to the second lane, and a vehicle from the third lane simultaneously cuts into the second lane, if the vehicle in the current lane cannot effectively detect the incoming vehicle, the result could be a collision between the two vehicles.
Moreover, lateral radars cannot effectively detect lateral lane markings, cannot compensate for blind spots caused by the limitations of forward cameras, and cannot support lateral fusion of Freespace.
Camera Scheme Infrastructure
The following diagram illustrates the infrastructure of the next-generation autonomous driving system scheme. Among its key elements, the image processing SoC mainly handles the camera image sensor input through the ISP and performs target fusion; of course, raw data from radar or LiDAR can also be used as input for target detection. The logic control unit (MCU) processes the target fusion results from the SoC and feeds them into the motion planning and decision/control model, whose results are used to generate vehicle control commands governing longitudinal and lateral motion.
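As a rough sketch of the chain the diagram describes (the function names and bodies are placeholders of our own, not actual software interfaces), the SoC side turns raw sensor frames into fused targets and the MCU side turns fused targets into control commands:

```python
# Minimal dataflow sketch of the controller described above; only the
# ordering of stages follows the text, the bodies are stubs.

def isp_process(raw_frame):
    """SoC/ISP: convert raw image sensor output into a usable image."""
    return {"image": raw_frame}

def detect_and_fuse(image, radar_targets, lidar_targets):
    """SoC: camera target detection fused with radar/LiDAR inputs."""
    return {"fused_targets": [image, radar_targets, lidar_targets]}

def plan_and_decide(fused_targets):
    """MCU: motion planning and decision model on the fused target list."""
    return {"trajectory": fused_targets}

def generate_control(trajectory):
    """MCU: longitudinal and lateral control commands for the vehicle."""
    return {"accel": 0.0, "steer": 0.0}

def controller_step(raw_frame, radar_targets, lidar_targets):
    image = isp_process(raw_frame)["image"]
    fused = detect_and_fuse(image, radar_targets, lidar_targets)
    traj = plan_and_decide(fused["fused_targets"])
    return generate_control(traj["trajectory"])
```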
The following diagram shows a typical sensor architecture for a camera-only mounting scheme in the next-generation autonomous driving system; the corresponding sensor mounting performance is described below.
| Upgrade Scheme | Camera Scheme | Basic Performance Indicators | Function Description |
| --- | --- | --- | --- |
| Forward Detection Upgrade | 1*Camera (Wide View) | FOV: 120°; Detection Distance: 150 m | Near-field target detection in the forward lane and side lanes |
| Forward Detection Upgrade | 1*Camera (Narrow View) | FOV: 30°; Detection Distance: 300 m | Long-distance target detection in the forward lane |
| Lateral Detection Upgrade | 4*Camera (Side View) | FOV: 100°; Detection Distance: 80 m | Monitoring of vehicle targets and lane information in the side lanes |
| Rear Detection Upgrade | 1*Camera (Rear View) | FOV: 80°; Detection Distance: 100 m | Monitoring of vehicle targets and lane information to the rear |
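For reference, the mounting scheme in the table can be captured as a simple configuration structure (the field names are our own illustration, not a defined interface):

```python
# Camera mounting scheme from the table above, expressed as plain data.
CAMERA_SCHEME = [
    {"role": "forward_wide",   "count": 1, "fov_deg": 120, "range_m": 150,
     "function": "near-field targets in the ego and side lanes"},
    {"role": "forward_narrow", "count": 1, "fov_deg": 30,  "range_m": 300,
     "function": "long-distance targets in the ego lane"},
    {"role": "side_view",      "count": 4, "fov_deg": 100, "range_m": 80,
     "function": "vehicle targets and lane information in side lanes"},
    {"role": "rear_view",      "count": 1, "fov_deg": 80,  "range_m": 100,
     "function": "vehicle targets and lane information to the rear"},
]

total_cameras = sum(cam["count"] for cam in CAMERA_SCHEME)  # 7 cameras in total
```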
In addition to the above basic indicators, the image resolution and frame rate of the camera also need to be considered. Once the computing power and bandwidth of the autonomous driving central controller are sufficient, higher camera resolution is generally better. Considering the specific scenarios the system must handle while driving, forward detection benefits most from high resolution; the resolution currently being promoted for forward-facing cameras is 8 megapixels. Lateral cameras mainly detect targets in the side lanes, where the scenario library relevant to safe autonomous driving is smaller and the required detection distance is shorter than for forward detection, so the resolution requirement can be somewhat lower than for the forward-facing cameras; 2-megapixel cameras are recommended as sufficient. Of course, if cost and processing efficiency are not a concern, the lateral sensors can also follow the forward-facing camera scheme.
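A quick back-of-the-envelope check (with our own illustrative assumptions about bit depth and frame rate, not figures from the article) shows why resolution is bounded by controller bandwidth:

```python
def raw_stream_gbps(megapixels: float, bits_per_pixel: int, fps: int) -> float:
    """Uncompressed video bandwidth for one camera, in Gbit/s."""
    return megapixels * 1e6 * bits_per_pixel * fps / 1e9

# Assumed 12-bit raw output at 30 fps for every camera.
forward_8mp = raw_stream_gbps(8.0, 12, 30)           # one 8 MP forward camera
side_2mp    = 4 * raw_stream_gbps(2.0, 12, 30)       # four 2 MP side cameras

print(f"8 MP forward camera  : {forward_8mp:.2f} Gbit/s")
print(f"4 x 2 MP side cameras: {side_2mp:.2f} Gbit/s")
```

Under these assumptions a single 8 MP forward camera alone produces roughly 2.9 Gbit/s of raw data, which is why the central controller's interfaces and SoC throughput must be sized before resolution is pushed higher.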
Image Processing Software Architecture SoC
The image processing software model resides primarily in the SoC chip. Its software modules mainly include the image signal processor (ISP), the neural network unit (NPU), the central processing unit (CPU), the codec unit (CODEC), the MIPI interface output, and the interface output to the logic-operation MCU.
Among them, the two most computationally intensive units are the image signal processor (ISP) and the neural network unit (NPU).
The above processing generates basic image information, which is passed to the central processing unit (CPU) for overall information handling and dispatch. The decoded images are output over the MIPI interface to the in-vehicle entertainment system for display, while the environmental semantic parsing results are fed to the ADAS logic control unit (MCU). That module fuses them with other environmental data sources (such as millimeter-wave radar, LiDAR, surround-view cameras, ultrasonic radar, etc.) to generate fused environment data, which is ultimately used for vehicle function and application control.
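A minimal sketch of the kind of object-level fusion performed downstream, assuming each source delivers targets as (x, y) positions in the vehicle frame and that a simple nearest-neighbour gate is enough for illustration (the data layout and gating rule are our own simplification, not the actual fusion algorithm):

```python
import math

def fuse_targets(camera_targets, radar_targets, gate_m=2.0):
    """Associate camera and radar targets by distance; average matched pairs.

    Each target is a dict with 'x' and 'y' in metres, vehicle frame.
    Unmatched targets from either source are kept as-is.
    """
    fused, used = [], set()
    for cam in camera_targets:
        best, best_d = None, gate_m
        for i, rad in enumerate(radar_targets):
            d = math.hypot(cam["x"] - rad["x"], cam["y"] - rad["y"])
            if i not in used and d < best_d:
                best, best_d = i, d
        if best is not None:
            used.add(best)
            rad = radar_targets[best]
            fused.append({"x": (cam["x"] + rad["x"]) / 2,
                          "y": (cam["y"] + rad["y"]) / 2,
                          "sources": ["camera", "radar"]})
        else:
            fused.append({**cam, "sources": ["camera"]})
    fused += [{**rad, "sources": ["radar"]}
              for i, rad in enumerate(radar_targets) if i not in used]
    return fused
```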
Of course, if some of the intelligent driving assistance information also needs to be fed to other intelligent control units for fusion, the generated results must be sent to the gateway via Ethernet, which forwards them to units such as the intelligent parking system and the human-machine interaction display system.
Additionally, if we consider accident tracing during the autonomous driving process, it is common to input part of the information processed by the SoC into a storage card, or directly upload it to the cloud or backend monitoring via TBox.
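One common pattern for this kind of accident tracing is a rolling buffer that is frozen and exported when a trigger fires; the sketch below is illustrative only, with hypothetical storage-card and TBox-upload hooks left as placeholders:

```python
from collections import deque

class EventRecorder:
    """Keep the last N processed frames; dump them when a trigger fires."""

    def __init__(self, max_frames: int = 300):          # e.g. ~10 s at 30 fps
        self.buffer = deque(maxlen=max_frames)

    def push(self, frame_summary: dict) -> None:
        self.buffer.append(frame_summary)

    def on_trigger(self, reason: str) -> list:
        """Snapshot the buffer for writing to an SD card or TBox upload."""
        snapshot = list(self.buffer)
        # write_to_storage_card(snapshot)    # placeholder persistence hook
        # upload_via_tbox(snapshot, reason)  # placeholder cloud upload hook
        return snapshot
```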
Camera Hardware Module Design Requirements
Existing surround-view (panoramic) cameras cannot simply be reused for this purpose:

- The installation position of traditional surround cameras is mainly intended to provide an intuitive view of 5-10 meters in front of, behind, and to either side of the vehicle, while the left-front, right-front, left-rear, and right-rear regions that matter for L3 autonomous driving fall within the image-stitching zones, so perception accuracy is degraded by stitching;
- Surround cameras introduce significant distortion for both distant and close objects, making it difficult to accurately estimate an object's 3D pose, size, and motion trajectory from the original image, while in the rectified image perception accuracy drops because of the distortion correction;
- Surround cameras are generally designed to perceive the near-field road surface and are installed with a lower pitch angle.
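To illustrate the distortion issue in the list above, the sketch below uses OpenCV's fisheye model to rectify a surround-style image; the intrinsics K and distortion coefficients D are made-up values for demonstration, not real calibration data:

```python
import numpy as np
import cv2

# Made-up intrinsics/distortion for a 1280x960 fisheye surround camera.
K = np.array([[400.0, 0.0, 640.0],
              [0.0, 400.0, 480.0],
              [0.0, 0.0, 1.0]])
D = np.array([[0.1], [-0.05], [0.01], [0.0]])          # fisheye model: k1..k4

distorted = np.zeros((960, 1280, 3), dtype=np.uint8)   # placeholder frame

# Build rectification maps and undistort; pixels near the image border are
# stretched heavily, which is where the accuracy loss discussed above occurs.
map1, map2 = cv2.fisheye.initUndistortRectifyMap(
    K, D, np.eye(3), K, (1280, 960), cv2.CV_16SC2)
rectified = cv2.remap(distorted, map1, map2, interpolation=cv2.INTER_LINEAR)
```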
Therefore, the exploded view of the camera hardware module for L3+ level autonomous driving is as follows, with the corresponding performance design mainly considering the following factors: image transmission method, output resolution, frame rate, signal-to-noise ratio, dynamic range, exposure control, gain control, white balance, minimum illumination, lens material, lens optical distortion, module depth of field, module field of view, rated operating current, operating temperature, storage temperature, protection level, dimensions, weight, functional safety level, connector/wire harness impedance, etc.
The installation position and configuration of the camera largely determine the range and accuracy of visual perception. Based on the understanding of the current operational design domain (ODD), the configuration of side view cameras must not only meet the automotive standard signal-to-noise ratio and low-light performance but also meet the following parameter requirements:
| Dynamic Range | Horizontal Pixels | Vertical Pixels | Frame Rate |
| --- | --- | --- | --- |
| >110 dB | >1280 | >960 | >30 fps |
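These thresholds can be expressed as a simple check when evaluating candidate camera modules (the field names below are our own):

```python
SIDE_CAMERA_MINIMUMS = {"dynamic_range_db": 110, "h_pixels": 1280,
                        "v_pixels": 960, "frame_rate_fps": 30}

def meets_side_camera_requirements(spec: dict) -> bool:
    """True if every measured value exceeds the minimum from the table."""
    return all(spec[key] > minimum
               for key, minimum in SIDE_CAMERA_MINIMUMS.items())

# Example candidate module (hypothetical datasheet values).
candidate = {"dynamic_range_db": 120, "h_pixels": 1920,
             "v_pixels": 1080, "frame_rate_fps": 30}
print(meets_side_camera_requirements(candidate))   # False: 30 fps is not > 30
```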
Visual information passes from the optical lens through multiple processing stages to the final perception detection result, and is influenced by many linear and nonlinear factors along the way. There is currently no unified industry formula expressing the relationship between camera configuration and the final detection result. To conceptually illustrate the main considerations in camera configuration, the following graph shows the relationship between a camera's sensor pixel output in typical scenarios and the types and distances of recognized targets:
Side View Camera Installation Position and Perception Range (Bird’s Eye View)
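Lacking a closed-form industry formula, a rough pinhole-model estimate is often used to relate configuration to detection distance; the sketch below (with our own illustrative assumption that a target must span roughly 20 pixels to be classified reliably) inverts the pixels-on-target relation used earlier:

```python
import math

def max_detection_distance(fov_deg: float, width_px: int,
                           target_width_m: float, min_pixels: float) -> float:
    """Distance at which a target of the given width still spans min_pixels."""
    focal_px = width_px / (2.0 * math.tan(math.radians(fov_deg) / 2.0))
    return focal_px * target_width_m / min_pixels

# Assumed requirement: >= 20 px across a 1.8 m vehicle for reliable classification.
for fov, width in ((120.0, 3840), (30.0, 3840), (100.0, 1920)):
    d = max_detection_distance(fov, width, 1.8, 20.0)
    print(f"FOV {fov:5.1f} deg, {width} px wide -> classify a car out to ~{d:.0f} m")
```

Under these assumptions an 8 MP-class (3840 px wide) 120° camera classifies a car out to roughly 100 m, while a 2 MP-class (1920 px wide) 100° side camera reaches about 70-80 m, consistent with the order of magnitude of the distances in the mounting table.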
Furthermore, regarding camera placement, the main concerns are the waterproof and dustproof ratings and whether anything obstructs the detection range. The forward-facing camera is therefore generally mounted near the interior rearview mirror and secured with a bracket; overall, its layout is not complicated. Side view cameras, by contrast, generally need to be mounted on the sides of the vehicle; typical positions can reference the layout of surround-view cameras, placing them below the exterior rearview mirrors or even near the fenders. This imposes new requirements on the size and weight of the camera hardware module. A draft of a typical camera installation scheme recommends the following installation positions:
| Position | Horizontal View Angle | Vertical View Angle | Yaw | Pitch |
| --- | --- | --- | --- | --- |
| Left/Right B-Pillar | 100° | 90° | 60°/-60° | -2° |
| Left/Right Side Mirror | 100° | 90° | 120°/-120° | -2° |
Based on currently available information and experience from related development, crossing the camera angles (overlapping their fields of view) allows relatively narrow-angle cameras to effectively cover both the near and far field around the vehicle. The side-front and side-rear cameras serve slightly different functions: the side-front camera focuses on scene detection, so its installation position and pitch angle can be slightly higher and its FOV wider, while the side-rear camera mainly detects vehicles in the far and near rear adjacent lanes, for highway entry merging and lane changes, so its pitch angle can be lower, its yaw angle closer to parallel with the vehicle's centerline, and its FOV slightly narrower. Specific parameters may need to be adjusted during the design process.
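The angular coverage implied by the installation table can be checked with a simple interval calculation; the sketch below treats yaw as the camera's optical-axis azimuth in the vehicle frame and tests whether the B-pillar and mirror cameras on one side overlap (the yaw and FOV values come from the table above, while the overlap logic is our own simplification that ignores vertical FOV and range):

```python
def azimuth_interval(yaw_deg: float, hfov_deg: float):
    """Azimuth range covered by a camera whose optical axis points at yaw_deg."""
    return (yaw_deg - hfov_deg / 2.0, yaw_deg + hfov_deg / 2.0)

def overlaps(a, b) -> bool:
    return a[0] < b[1] and b[0] < a[1]

# Right-side cameras from the installation table: yaw -60 deg (B-pillar)
# and -120 deg (side mirror), both with a 100 deg horizontal FOV.
b_pillar = azimuth_interval(-60.0, 100.0)    # (-110, -10)
mirror   = azimuth_interval(-120.0, 100.0)   # (-170, -70)

print("B-pillar covers", b_pillar)
print("Mirror covers  ", mirror)
print("Overlap:", overlaps(b_pillar, mirror))   # True: 40 deg of shared coverage
```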
Conclusion
The requirements for sensors in the next-generation autonomous driving system are raised further, and this is reflected above all in the camera perception system. The foremost consideration is whether the computing power and bandwidth of the autonomous driving domain controller can fully support the camera sensors. Basic camera parameters such as detection range, angle, resolution, and frame rate are also critical. On the hardware side, the module design must focus on factors such as signal-to-noise ratio, heat generation, size, weight, waterproofing, and dustproofing, so that the layout can meet detection performance requirements. Finally, the new camera detection capability brings a significant leap in the quality of environmental detection results: for example, comprehensively improving the accuracy of the lane-marking coefficients C0, C1, C2, and C3 allows LKA, LCC, and similar functions to handle curves of larger curvature, and extending the recognition distance and angle for the adjacent lanes on both sides provides more precise input for lane-change path planning.
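For context on what C0-C3 refer to: lane markings are commonly represented as a cubic polynomial of longitudinal distance, where C0 is the lateral offset, C1 the heading term, and C2/C3 relate to curvature and its rate of change. A minimal evaluation sketch with purely illustrative coefficients:

```python
def lane_lateral_offset(x_m: float, c0: float, c1: float, c2: float, c3: float) -> float:
    """Lateral offset of the lane marking at longitudinal distance x (metres).

    Typical cubic lane model: y = C0 + C1*x + C2*x^2 + C3*x^3.
    """
    return c0 + c1 * x_m + c2 * x_m ** 2 + c3 * x_m ** 3

# Purely illustrative coefficients for a gentle curve.
c0, c1, c2, c3 = 1.6, 0.01, 2e-4, 1e-6
for x in (0, 30, 60, 90):
    print(f"x = {x:3d} m -> lane offset {lane_lateral_offset(x, c0, c1, c2, c3):.2f} m")

curvature_at_origin = 2 * c2        # ~1/radius for small heading angles
print(f"Curvature at x=0: {curvature_at_origin:.4f} 1/m (radius ~ {1/curvature_at_origin:.0f} m)")
```

Higher accuracy in these coefficients at longer range is what lets lane-keeping functions follow tighter curves, and better side-lane geometry feeds directly into lane-change path planning.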