Comprehensive Understanding of Cameras


This article is reprinted from | AI Algorithms and Image Processing

1. Camera Structure and Working Principle


The scene is captured through the lens, which projects an optical image onto the sensor. The sensor converts the optical image into an electrical signal, which is digitized by analog-to-digital conversion, processed by a DSP, and then sent on for further processing, ultimately producing an image that can be viewed on the phone screen.

The digital signal processing chip (DSP) mainly optimizes the digital image signal through a series of complex mathematical algorithms and transmits the processed signal to devices such as PCs via USB or other interfaces. DSP structure framework:

  1. ISP (image signal processor)

  2. JPEG encoder

  3. USB device controller

Camera sensors come in two common types:

one is the CCD sensor (Charge-Coupled Device);

the other is the CMOS sensor (Complementary Metal-Oxide-Semiconductor).

CCD's advantage lies in its good imaging quality, but it requires a complex manufacturing process, and it is expensive and power-hungry. At the same resolution, CMOS is cheaper than CCD, but its image quality is somewhat lower. CMOS image sensors consume less power, and as the technology advances their image quality keeps improving, so most mobile phone cameras now use CMOS sensors.


Simple structure of mobile phone cameras

The filter has two main functions:

  1. Filter out infrared light. Removing the infrared light that would interfere with visible light makes the image clearer.

  2. Adjust the incoming light. The photosensitive chip consists of photosensitive elements (cells). Direct light gives the best result, but to avoid interference between adjacent photosensitive elements the incoming light must be conditioned. The filter is therefore not glass but quartz, whose physical polarization characteristics pass direct light and deflect oblique light, preventing interference with neighboring photosensitive points.

2. Related Parameters and Terminology

1. Common Image Formats

1.1 RGB format:

Traditional red-green-blue formats, such as RGB565 and RGB888. RGB565 is a 16-bit format of 5-bit R + 6-bit G + 5-bit B; G gets the extra bit because the human eye is more sensitive to green.
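As a hedged illustration of the RGB565 layout just described, the C sketch below packs an 8-bit-per-channel RGB888 pixel into a 16-bit RGB565 word and unpacks it again; the function names are illustrative and not taken from any particular camera API.

#include <stdint.h>

/* Pack 8-bit R, G, B into a 16-bit RGB565 word: 5 bits R, 6 bits G, 5 bits B. */
static uint16_t rgb888_to_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Unpack RGB565 back to 8-bit channels, replicating the high bits into the
 * low bits so the full 0..255 range remains reachable. */
static void rgb565_to_rgb888(uint16_t p, uint8_t *r, uint8_t *g, uint8_t *b)
{
    uint8_t r5 = (p >> 11) & 0x1F, g6 = (p >> 5) & 0x3F, b5 = p & 0x1F;
    *r = (uint8_t)((r5 << 3) | (r5 >> 2));
    *g = (uint8_t)((g6 << 2) | (g6 >> 4));
    *b = (uint8_t)((b5 << 3) | (b5 >> 2));
}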

1.2 YUV format:

Luma (Y) + Chroma (UV) format. YUV refers to the pixel format where brightness and chrominance are expressed separately. This separation avoids mutual interference and allows for a reduced chrominance sampling rate without significantly affecting image quality. YUV is a general term; it can be categorized into many specific formats based on its arrangement.

Chrominance (UV) defines two aspects of color, hue and saturation, represented by Cb and Cr. Cr reflects the difference between the red component of the RGB input signal and the luminance of the RGB signal; Cb reflects the difference between the blue component of the RGB input signal and the luminance of the RGB signal.

The main sampling formats are YCbCr 4:2:0, YCbCr 4:2:2, YCbCr 4:1:1, and YCbCr 4:4:4.
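To make the Y/Cb/Cr relationship concrete, here is a hedged C sketch that converts one full-range (JPEG-style) BT.601 YCbCr pixel back to RGB; in 4:2:0 sampling one Cb/Cr pair would be shared by a 2x2 block of Y samples. The coefficients are the common full-range BT.601 values, and the function names are illustrative.

#include <stdint.h>

static uint8_t clamp_u8(int v) { return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v)); }

/* One-pixel conversion, full-range (JPEG/JFIF-style) BT.601 YCbCr -> RGB. */
static void ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr,
                         uint8_t *r, uint8_t *g, uint8_t *b)
{
    int d = cb - 128, e = cr - 128;            /* chroma is offset by 128 */
    *r = clamp_u8((int)(y + 1.402 * e));
    *g = clamp_u8((int)(y - 0.344136 * d - 0.714136 * e));
    *b = clamp_u8((int)(y + 1.772 * d));
}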

1.3 RAW data format:

RAW images are the original data captured by the CMOS or CCD image sensor that converts light source signals into digital signals. RAW files record the original information from the camera sensor and some metadata generated by the camera (such as ISO settings, shutter speed, aperture value, white balance, etc.). RAW is an unprocessed and uncompressed format, conceptualized as “raw image encoding data” or more vividly, “digital negatives.” Each pixel of the sensor corresponds to a color filter, distributed according to the Bayer pattern. Each pixel’s data is directly output, i.e., RAW RGB data.

Raw data (Raw RGB) is interpolated (demosaiced) to reconstruct full RGB.
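To make that interpolation step concrete, the following hedged C sketch collapses each 2x2 RGGB Bayer block of an 8-bit RAW frame into one RGB pixel (a half-resolution demosaic). A real ISP interpolates to full resolution with edge-aware filters; the 8-bit depth and RGGB order here are assumptions for illustration.

#include <stdint.h>
#include <stddef.h>

/* Very simplified demosaic: collapse each 2x2 RGGB Bayer block into one RGB
 * pixel (half resolution), averaging the two green samples. */
static void bayer_rggb_to_rgb_half(const uint8_t *raw, size_t w, size_t h,
                                   uint8_t *rgb /* (w/2)*(h/2)*3 bytes */)
{
    for (size_t y = 0; y + 1 < h; y += 2) {
        for (size_t x = 0; x + 1 < w; x += 2) {
            uint8_t r  = raw[y * w + x];             /* R at (even, even) */
            uint8_t g1 = raw[y * w + x + 1];         /* G at (even, odd)  */
            uint8_t g2 = raw[(y + 1) * w + x];       /* G at (odd, even)  */
            uint8_t b  = raw[(y + 1) * w + x + 1];   /* B at (odd, odd)   */
            uint8_t *out = rgb + ((y / 2) * (w / 2) + x / 2) * 3;
            out[0] = r;
            out[1] = (uint8_t)((g1 + g2) / 2);
            out[2] = b;
        }
    }
}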


Example of RAW format image

2. Related Technical Indicators

2.1 Image Resolution:

  SXGA (1280 x 1024) also known as 1.3 million pixels

  XGA (1024 x 768) also known as 800,000 pixels

  SVGA (800 x 600) also known as 500,000 pixels

  VGA (640 x 480) also known as 300,000 pixels (350,000 refers to 648 x 488)

  CIF (352 x 288) also known as 100,000 pixels

  SIF/QVGA (320 x 240)

  QCIF (176 x 144)

  QSIF/QQVGA (160 x 120)

2.2 Color Depth:

256-level grayscale: 256 shades of gray (including black and white).

15- or 16-bit color (high color): 32,768 or 65,536 colors.

24-bit color (true color): each primary color has 256 levels, combining to create 256*256*256 colors.

32-bit color: in addition to the 24-bit color, the additional 8 bits store graphical data for overlapping layers (alpha channel).
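As a small hedged illustration of the 24/32-bit layouts above, this C snippet splits a 32-bit ARGB pixel into its alpha and color channels; placing A in the top byte is one common convention, not the only one.

#include <stdint.h>

/* Split a 32-bit ARGB pixel (A in bits 31..24) into its four 8-bit channels. */
static void argb8888_split(uint32_t p, uint8_t *a, uint8_t *r, uint8_t *g, uint8_t *b)
{
    *a = (uint8_t)(p >> 24);
    *r = (uint8_t)(p >> 16);
    *g = (uint8_t)(p >> 8);
    *b = (uint8_t)p;
}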

2.3 Optical and Digital Zoom:

Optical zoom: the lens physically moves to magnify or shrink the subject, so pixel count and image quality are essentially maintained while distant objects can still be captured. Digital zoom: no real zoom takes place; a portion of the original image is simply cropped and enlarged. It looks bigger on the LCD screen, but the image quality does not fundamentally improve, and the effective pixel count is lower than the camera's maximum. In terms of image quality it is largely a gimmick, though it can be convenient.
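Because digital zoom is just crop-and-enlarge, the hedged C sketch below takes a centered crop of a packed RGB888 frame and scales it back to the original size with nearest-neighbor sampling; the function name and the nearest-neighbor choice are illustrative only.

#include <stdint.h>
#include <stddef.h>

/* "Digital zoom" by factor zoom (> 1): crop the center (w/zoom x h/zoom) region
 * and enlarge it back to w x h with nearest-neighbor sampling. No new detail is
 * created; the result just looks bigger. Buffers are packed RGB888. */
static void digital_zoom_rgb(const uint8_t *src, uint8_t *dst,
                             size_t w, size_t h, double zoom)
{
    size_t cw = (size_t)(w / zoom), ch = (size_t)(h / zoom);
    size_t x0 = (w - cw) / 2, y0 = (h - ch) / 2;
    for (size_t y = 0; y < h; y++) {
        for (size_t x = 0; x < w; x++) {
            size_t sx = x0 + x * cw / w;       /* map output pixel into the crop */
            size_t sy = y0 + y * ch / h;
            const uint8_t *s = src + (sy * w + sx) * 3;
            uint8_t *d = dst + (y * w + x) * 3;
            d[0] = s[0]; d[1] = s[1]; d[2] = s[2];
        }
    }
}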

2.4 Image Compression Methods:

JPEG/M-JPEG

H.261/H.263

MPEG

H.264

2.5 Image Noise:

  Refers to unwanted speckles in the image, which typically appear as fixed-color noise points.

2.6 Auto White Balance (AWB):

In simple terms: how faithfully the camera reproduces white objects as white. Related concept: color temperature.
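One classic and very simple AWB strategy is the gray-world assumption: on average a scene should be neutral, so the R and B channels are rescaled toward the green average. The C sketch below is illustrative only; the AWB in a real ISP uses far richer statistics and color-temperature models.

#include <stdint.h>
#include <stddef.h>

/* Gray-world white balance on a packed RGB888 buffer: scale R and B so their
 * averages match the average of G. Purely illustrative. */
static void gray_world_awb(uint8_t *rgb, size_t pixels)
{
    uint64_t sum_r = 0, sum_g = 0, sum_b = 0;
    for (size_t i = 0; i < pixels; i++) {
        sum_r += rgb[3 * i];
        sum_g += rgb[3 * i + 1];
        sum_b += rgb[3 * i + 2];
    }
    if (sum_r == 0 || sum_b == 0)
        return;                                /* avoid division by zero */
    double gain_r = (double)sum_g / (double)sum_r;
    double gain_b = (double)sum_g / (double)sum_b;
    for (size_t i = 0; i < pixels; i++) {
        int r = (int)(rgb[3 * i]     * gain_r + 0.5);
        int b = (int)(rgb[3 * i + 2] * gain_b + 0.5);
        rgb[3 * i]     = (uint8_t)(r > 255 ? 255 : r);
        rgb[3 * i + 2] = (uint8_t)(b > 255 ? 255 : b);
    }
}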

2.7 View Angle:

  The imaging principle is similar to that of the human eye; simply put, it is the range that can be imaged.

2.8 Auto Focus:

Auto focus can be divided into two main categories: one based on distance measurement between the lens and the subject, and the other based on focus detection on the focus screen (clarity algorithm).

Note: Zooming refers to bringing distant objects closer. Focusing is to make the image clear.
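For the second category (the clarity/contrast-detection approach), a common figure of merit is how much high-frequency detail the image contains. The hedged C sketch below scores sharpness as the variance of a 4-neighbor Laplacian over a grayscale frame; an AF loop would step the lens and keep the position that maximizes this score. The kernel and normalization are illustrative choices.

#include <stdint.h>
#include <stddef.h>

/* Sharpness score for contrast-detection autofocus: variance of a 4-neighbor
 * Laplacian over a grayscale image. Sharper focus -> stronger edges -> higher
 * score. */
static double focus_measure(const uint8_t *gray, size_t w, size_t h)
{
    double sum = 0.0, sum_sq = 0.0;
    size_t n = 0;
    for (size_t y = 1; y + 1 < h; y++) {
        for (size_t x = 1; x + 1 < w; x++) {
            int c = gray[y * w + x];
            int lap = 4 * c - gray[y * w + x - 1] - gray[y * w + x + 1]
                            - gray[(y - 1) * w + x] - gray[(y + 1) * w + x];
            sum += lap;
            sum_sq += (double)lap * lap;
            n++;
        }
    }
    if (n == 0)
        return 0.0;
    double mean = sum / n;
    return sum_sq / n - mean * mean;           /* variance of the Laplacian */
}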

2.9 Auto Exposure and Gamma:

Exposure is determined by the combination of aperture, shutter speed, and ISO. Gamma describes the nonlinear response curve of the human eye to brightness.
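In practice gamma is usually applied through a lookup table. The hedged C sketch below builds a 256-entry LUT for out = 255 * (in / 255)^(1 / gamma) with a typical encoding gamma of about 2.2 and applies it to an 8-bit channel; the exact curve used by a given sensor pipeline is tuned per device.

#include <stdint.h>
#include <stddef.h>
#include <math.h>

/* Build a 256-entry gamma LUT: out = 255 * (in / 255) ^ (1 / gamma).
 * gamma of about 2.2 lifts mid-tones relative to a linear response. */
static void build_gamma_lut(uint8_t lut[256], double gamma)
{
    for (int i = 0; i < 256; i++)
        lut[i] = (uint8_t)(255.0 * pow(i / 255.0, 1.0 / gamma) + 0.5);
}

/* Apply the LUT to one 8-bit channel of n samples. */
static void apply_lut(uint8_t *channel, size_t n, const uint8_t lut[256])
{
    for (size_t i = 0; i < n; i++)
        channel[i] = lut[channel[i]];
}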

3. Qualcomm’s CAMERA Hardware Architecture


CAMERA hardware architecture

VFE: video front end

VPE: video preprocessing

The camera module comes with an ISP (image signal processor), so the VFE and VPE functions related to image effect processing are disabled.

1. Functions of VFE:

1.1 Improve image quality through algorithms.

1.2 Provide high-resolution image AWB (auto white balance)/AE (auto exposure)/AF (auto focus) algorithm processing.

1.3 Image attenuation correction.

1.4 Noise filtering in low light.

1.5 Image color effect optimization.

1.6 Skin tone effect optimization.

1.7 Image shake calculation.

1.8 Brightness adaptation algorithm.

2. Functions of VPE:

2.1 Image stabilization.

2.2 Digital focus.

2.3 Image rotation.

2.4 Overlay.


4. Basic Architecture of the Android System Camera

1. Application Layer

The application layer of Camera on Android consists of a Camera application APK developed directly against the SDK API. The code is located under /android/packages/apps/Camera. It mainly calls the android.hardware.Camera class (in the framework) and implements the business logic and UI of the Camera application. An Android application that wants to use the android.hardware.Camera class must declare the Camera permission in its manifest file and add <uses-feature> elements to declare camera features such as auto focus. The specific implementation can be as follows:

<uses-permission android:name="android.permission.CAMERA" />

<uses-feature android:name="android.hardware.camera" />

<uses-feature android:name="android.hardware.camera.autofocus" />

2. Framework Layer

2.1 android.hardware.Camera: Code location /android/frameworks/base/core/java/android/hardware/Camera.java

This part is compiled into framework.jar. It is the Java interface Android provides for the application layer; the class is used to connect to or disconnect from a Camera service, set shooting parameters, start and stop preview, take pictures, and so on.

2.2 Some methods of the android.hardware.Camera class are declared native and call local code through JNI, while others are implemented in Java itself. The JNI part of Camera lives at /android/frameworks/base/core/jni/android_hardware_Camera.cpp and serves as the bridge from the Java code in Camera.java to the C++ code; it is compiled into libandroid_runtime.so. The libandroid_runtime.so library is shared and contains functionality other than Camera as well.

2.3 Client part of the Camera framework:

Code location: /android/frameworks/base/libs/camera/, which contains five files:

Camera.cpp

CameraParameters.cpp

ICamera.cpp

ICameraClient.cpp

ICameraService.cpp

Their header files are located in /android/frameworks/base/include/camera directory.

This part compiles to generate libcamera_client.so. In the Camera module, libcamera_client.so is at the core, serving as the Client part of the Camera framework, communicating with the other part, the server libcameraservice.so, via inter-process communication (i.e., Binder mechanism).

2.4 Service part of the Camera framework:

Code location: /android/frameworks/base/services/camera/libcameraservice.

This part compiles into the library libcameraservice.so. CameraService is the Camera service, the middle layer of the Camera framework, linking CameraHardwareInterface and Client parts, calling the actual Camera hardware interface to perform functions, i.e., the underlying HAL layer.


5. Basic Data Flow and Processing of Camera Preview, Photography, and Video Recording, and Driver Debugging


HAL layer related code:

frameworks/base/services/camera/libcameraservice/CameraService.cpp
vendor/qcom/android-open/libcamera2/QualcommCameraHardware.cpp
vendor/qcom/proprietary/mm-camera/apps/appslib/mm_camera_interface.c
vendor/qcom/proprietary/mm-camera/apps/appslib/camframe.c
vendor/qcom/proprietary/mm-camera/apps/appslib/snapshot.c
vendor/qcom/proprietary/mm-camera/apps/appslib/jpeg_encoder.c
vendor/qcom/proprietary/mm-camera/apps/appslib/cam_frame_q.c
vendor/qcom/proprietary/mm-camera/apps/appslib/cam_display.c
vendor/qcom/proprietary/mm-camera/targets/vfe31/8x60/
vendor/qcom/proprietary/mm-camera/targets/vfe31/common/vpe1/

QualcommCameraHardware.cpp is mainly divided into three parts: preview, snapshot, and video, each handled by its own pthread. The auto focus function also runs in its own pthread. The data frames obtained by the preview, snapshot, or video threads are returned via callbacks to the upper layer CameraService.cpp for storage or preview. Below is a rough structure of the HAL layer code calling flow.


  1. The entire module mainly circulates through three main threads: control, config, and frame.

    control is used to execute overall control and is the upper layer control interface.

config mainly performs configuration; this thread mainly handles the 3A (AE/AWB/AF) tasks and some effect-related settings;

the frame thread is mainly used for frame-queue looping and processing. All feedback of events or statuses is returned via callback functions to QualcommCameraHardware.cpp.

2. The driver part starts from the device driver file s5k8aa.c. After the platform device is created, its probe entry function calls the function that creates the camera device:

int msm_camera_drv_start(struct platform_device *dev,
                         int (*sensor_probe)(const struct msm_camera_sensor_info *,
                                             struct msm_sensor_ctrl *))

and passes in the device information structure and the camera-device entry point sensor_probe. The msm_camera_drv_start(xxx) function is implemented in msm_camera.c. It creates four device nodes for the upper layers to call:

/dev/msm_camera/frame%d

/dev/msm_camera/control%d

/dev/msm_camera/config%d

/dev/msm_camera/pic%d

It implements the control call interfaces through which the upper-layer library accesses the VFE module, VPE module, jpeg_encoder module, and camera sensor module. The initialization and IOCTL call interfaces for these devices are implemented in the corresponding file_operations functions.

This function also creates four work queues:

struct msm_device_queue event_q;

struct msm_device_queue frame_q;

struct msm_device_queue pict_q;

struct msm_device_queue vpe_q;

event_q is the queue of control signals passed in through /dev/msm_camera/control%d; it is used to transmit control commands from the upper layer to the config thread.

frame_q is used to manage image frames; during preview or recording, frames are passed through it to the DSP for processing.

pict_q contains photo frames for jpeg_encoder to perform image encoding.

vpe_q is the VPE control command queue.
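To give a feel for how such queues hand frames between a producer and a consumer (for example from the frame queue toward the preview/encode side), here is a hedged, user-space C illustration of the pattern using pthreads. It is an illustration of the producer/consumer idea only, not the actual msm_device_queue implementation in msm_camera.c.

#include <pthread.h>
#include <stddef.h>

/* Minimal thread-safe FIFO of frame pointers, illustrating the pattern behind
 * the per-purpose queues (event/frame/pict/vpe). Capacity is arbitrary. */
#define FQ_CAP 8

struct frame_queue {
    void *items[FQ_CAP];
    size_t head, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
};

static void fq_init(struct frame_queue *q)
{
    q->head = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Producer side (e.g. the driver/frame path): drop the oldest frame if full. */
static void fq_put(struct frame_queue *q, void *frame)
{
    pthread_mutex_lock(&q->lock);
    if (q->count == FQ_CAP) {                  /* overwrite oldest on overflow */
        q->head = (q->head + 1) % FQ_CAP;
        q->count--;
    }
    q->items[(q->head + q->count) % FQ_CAP] = frame;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side (e.g. the HAL frame thread): block until a frame arrives. */
static void *fq_get(struct frame_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    void *frame = q->items[q->head];
    q->head = (q->head + 1) % FQ_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return frame;
}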

s5k8aa.c is the driver for the corresponding camera sensor device. Its role is simple: it mainly implements the creation, initialization, and control of the sensor module through the following three functions (a hedged probe sketch follows the list):

s->s_init = ov2685_sensor_init;

s->s_release = ov2685_sensor_release;

s->s_config = ov2685_sensor_config;
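A hedged sketch of how the sensor_probe entry handed to msm_camera_drv_start might wire up these three callbacks is shown below. The surrounding types come from the MSM platform headers, the real vendor probe also verifies the sensor ID over IIC, and the body here is illustrative only.

/* Illustrative only: the sensor_probe callback passed to msm_camera_drv_start()
 * wiring up the three entry points listed above. struct msm_camera_sensor_info
 * and struct msm_sensor_ctrl come from the MSM platform headers; sensor ID
 * verification and error handling are omitted. */
static int ov2685_sensor_probe(const struct msm_camera_sensor_info *info,
                               struct msm_sensor_ctrl *s)
{
    /* (Assumed step) briefly power the module and read its chip ID over IIC
     * to confirm the sensor is present, then power it back down. */

    s->s_init    = ov2685_sensor_init;
    s->s_release = ov2685_sensor_release;
    s->s_config  = ov2685_sensor_config;
    return 0;
}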

ov2685_sensor_init function:

It mainly implements the camera's power-up, clock control (MCLK), and device initialization. Power-up involves DOVDD, DVDD, AVDD, reset, and PWDN, which must be sequenced in the order the device requires; the clock control sequence is generally included as well. Device initialization writes all of the sensor's registers, sending the initialization register addresses and values to the sensor over IIC. Once this completes, the camera module can work normally and transmit images to the CPU over the MIPI lines.
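The power-up / MCLK / register-initialization sequence just described might look roughly like the hedged kernel-style sketch below. The GPIO numbers, delays, and register values are placeholders, the regulator (DOVDD/DVDD/AVDD) handling is elided, and the real init table supplied by the sensor vendor typically runs to hundreds of entries; the exact sequence must follow the sensor datasheet.

#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/i2c.h>
#include <linux/gpio.h>
#include <linux/delay.h>

#define GPIO_CAM_PWDN  0          /* placeholder GPIO numbers */
#define GPIO_CAM_RESET 1

/* Placeholder init table: { 16-bit register address, 8-bit value }. */
struct sensor_reg { u16 addr; u8 val; };
static const struct sensor_reg init_regs[] = {
    { 0x0103, 0x01 },             /* e.g. software reset (placeholder values) */
    { 0x3000, 0x00 },
};

static int sensor_write_reg(struct i2c_client *client, u16 addr, u8 val)
{
    u8 buf[3] = { addr >> 8, addr & 0xff, val };
    /* 16-bit register address followed by the value, sent over IIC. */
    return i2c_master_send(client, (const char *)buf, 3) == 3 ? 0 : -EIO;
}

static int sensor_init_sketch(struct i2c_client *client)
{
    int i, rc;

    /* 1. Power up in the order the datasheet requires (DOVDD -> DVDD -> AVDD,
     *    regulator calls elided), then release PWDN and reset. */
    gpio_set_value(GPIO_CAM_PWDN, 0);
    gpio_set_value(GPIO_CAM_RESET, 0);
    msleep(5);
    gpio_set_value(GPIO_CAM_RESET, 1);
    msleep(10);

    /* 2. MCLK is assumed to already be enabled by the platform clock code. */

    /* 3. Push the initialization register table to the sensor over IIC. */
    for (i = 0; i < ARRAY_SIZE(init_regs); i++) {
        rc = sensor_write_reg(client, init_regs[i].addr, init_regs[i].val);
        if (rc)
            return rc;
    }
    return 0;
}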

ov2685_sensor_config function:

It mainly implements various configuration interfaces for the sensor, including frame rate configuration, white balance effect settings, exposure settings, special effect settings, etc. The corresponding interfaces send the configured register list to the sensor via IIC.

3. Several Problem Points in Camera Debugging:

3.1 Whether power-up is correct and whether there is clock waveform output. Check that the output voltage values and the power-on timing are correct and that MCLK meets the sensor's requirements. This can be measured with an oscilloscope and a multimeter: measure the voltages and verify the power-on timing and the MCLK frequency.

3.2 Whether IIC reads and writes work, i.e. debugging the I2C communication between the CPU and the ISP. Check that the IIC address is correct and that the protocol matches. This can also be checked with an oscilloscope by examining the levels and waveform logic of the IIC SDA and CLK lines.

3.3 Whether the sensor module works correctly after proper power-up and initialization. This mainly involves probing the data and clock pins of the MIPI lines with an oscilloscope to check whether the waveform carries data, whether it conforms to the standard, and whether the levels meet the requirements.

3.4 If all of the above are correct, the MIPI controller will receive interrupts and start processing the image signal. If errors occur, the error values in the interrupt status can be checked. Besides correct initialization on the CPU side, pay attention to whether the image format and size configured in the module match the default image format and size the CPU expects to receive. The format and size in the module can be checked through register values; the capture and preview image sizes must be set separately on the CPU side in the HAL.

After completing the above parts, the camera can preview correctly.
