How Image Sensors Drive the Development of Embedded Vision Technology

New imaging applications are thriving, from collaborative robots in Industry 4.0 to drones used for firefighting or agriculture, to biometric facial recognition and handheld medical devices in the home. A key factor in the emergence of these applications is that embedded vision is more prevalent than ever before. Embedded vision is not a new concept; it simply describes a system that includes a vision setup and that controls and processes data without an external computer. It has long been used in industrial quality control, with “smart cameras” as the most familiar example.

In recent years, affordable hardware components from the consumer market have significantly reduced bill of materials (BOM) cost and product size compared with earlier computer-based solutions. For example, small system integrators or OEMs can now buy single-board computers or system-on-modules such as NVIDIA Jetson in small quantities, while larger OEMs can directly source image signal processors such as Qualcomm Snapdragon. On the software side, off-the-shelf software libraries speed up the development of dedicated vision systems and make them easier to configure, even for small production runs.

The second change driving embedded vision systems is the emergence of machine learning, which allows neural networks to be trained in the lab and then uploaded directly to the processor, so that the system can automatically recognize features and make decisions in real time.

Providing solutions suited to embedded vision systems is crucial for imaging companies targeting these high-growth applications. Image sensors play an important role in large-scale adoption because they directly affect the performance and design of embedded vision systems. The main drivers can be summarized as decreasing Size, Weight, Power, and Cost, abbreviated as “SWaP-C.”

Reducing Costs is Crucial

What accelerates new embedded vision applications is a price that meets market demand, and the cost of the vision system is a major constraint in reaching that price.

Saving Optical Costs

The first way to reduce the cost of a vision module is to shrink it, for two reasons. First, the smaller the pixels of the image sensor, the smaller the die and the more chips can be manufactured from each wafer. Second, a smaller sensor can use smaller and lower-cost optical components. Both reduce the inherent cost. For example, Teledyne e2v’s Emerald 5M sensor reduces the pixel size to 2.8 µm, allowing M12 lenses to be used on a five-megapixel global shutter sensor. This leads to direct cost savings: entry-level M12 lenses cost about $10, while larger C-mount or F-mount lenses cost 10 to 20 times as much. Reducing size is therefore an effective way to lower the cost of embedded vision systems.
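The link between pixel size and chips per wafer can be illustrated with a rough estimate. The sketch below uses a common first-order formula for gross dies per wafer (wafer area divided by die area, minus an edge-loss term). The wafer diameter, the array resolution, the larger comparison pixel pitch, and the fixed periphery margin are all illustrative assumptions, not figures from any sensor datasheet, and yield is ignored.

```python
import math

def gross_dies_per_wafer(wafer_diameter_mm, die_width_mm, die_height_mm):
    """First-order estimate: wafer area / die area, minus an edge-loss term."""
    die_area = die_width_mm * die_height_mm
    estimate = (math.pi * (wafer_diameter_mm / 2) ** 2 / die_area
                - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area))
    return max(0, math.floor(estimate))

def die_size_mm(h_pixels, v_pixels, pitch_um, periphery_mm=1.0):
    """Pixel array size plus a fixed margin per side for readout and pads.

    The 1 mm margin is a hypothetical placeholder, not a real design rule.
    """
    width = h_pixels * pitch_um / 1000 + 2 * periphery_mm
    height = v_pixels * pitch_um / 1000 + 2 * periphery_mm
    return width, height

# Illustrative assumptions only: 200 mm wafer, generic ~5 MP array.
WAFER_MM = 200
H_PIXELS, V_PIXELS = 2560, 2048

for pitch_um in (5.5, 2.8):  # 5.5 um is a hypothetical larger pixel for comparison
    w, h = die_size_mm(H_PIXELS, V_PIXELS, pitch_um)
    dies = gross_dies_per_wafer(WAFER_MM, w, h)
    print(f"{pitch_um} um pixels: die {w:.2f} x {h:.2f} mm, ~{dies} gross dies")
```

Under these assumptions the smaller pixel yields several times more candidate dies per wafer, which is the effect the paragraph describes; real counts depend on the actual die layout, scribe lanes, and yield.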

For image sensor manufacturers, this reduction in optical cost has a further impact on design: generally speaking, the cheaper the optics, the less ideal the angle at which light reaches the sensor. Low-cost optics therefore require specially designed shifted microlenses above the pixels to compensate for distortion and to focus light arriving at wide angles.

Cost-Effective Sensor Interfaces

In addition to optical optimization, the choice of sensor interface also indirectly affects the cost of a vision system. The MIPI CSI-2 interface, originally developed by the MIPI Alliance for the mobile industry, is a suitable choice for achieving cost savings. It has been adopted by most ISPs and has begun to be used in the industrial market because it allows lightweight integration with low-cost system-on-chip (SoC) or system-on-module (SOM) products from companies such as NXP, NVIDIA, Qualcomm, Rockchip, Intel, and others. Designing a CMOS image sensor with a MIPI CSI-2 interface allows data from the image sensor to be transmitted directly to the host SoC or SOM of the embedded system without any adapter bridge, saving both cost and PCB space. This advantage is even more pronounced in multi-sensor embedded systems (such as 360-degree panoramic systems).

However, these benefits come with some limitations. The MIPI CSI-2 D-PHY standard widely used in the machine vision industry relies on cost-effective flat cables, which limit the connection distance to 20 centimeters. This may not be suitable for remote pan-tilt setups where the sensor is far from the host processor, as is often the case in traffic monitoring or surround-view applications. One way to extend the connection distance is to place additional repeater boards between the MIPI sensor board and the host processor, but this comes at the expense of miniaturization. Other solutions come not from the mobile industry but from the automotive industry: the FPD-Link III and MIPI CSI-2 A-PHY standards support coaxial cables or differential pairs, allowing connection distances of up to 15 meters.

Reducing Development Costs

When investing in a new product, rising development costs are often a challenge: non-recurring engineering (NRE) costs can run into the millions and put pressure on time-to-market. For embedded vision this pressure is even greater, because modularity (that is, whether the product can switch between multiple image sensors) is an important consideration for integrators. Fortunately, NRE costs can be contained by providing a degree of cross-compatibility among sensors: for example, a shared pixel architecture for consistent electro-optical performance, a common optical center so that a single front-end design can be shared, and compatible PCB assemblies (through size compatibility or pin compatibility). This accelerates evaluation, integration, and the supply chain, as shown in Figure 1.

Figure 1: Image sensor platforms can be designed to provide pin compatibility (left) or size compatibility (right) for proprietary PCB layout designs.

Today, with the widespread availability of so-called module and board-level solutions, embedded vision systems can be developed faster and more affordably. These turnkey products typically consist of a ready-to-integrate sensor board, sometimes with a preprocessing chip, a mechanical front face, and/or a lens mount. Their highly optimized footprints and standardized connectors allow them to connect directly to off-the-shelf processing boards such as NVIDIA Jetson or NXP i.MX, without the need to design or manufacture an intermediate adapter board.

By eliminating the need for PCB design and manufacturing, these module or board-level solutions not only simplify and accelerate hardware development but also significantly shorten software development time, since most of them are supplied with Video4Linux drivers. As a result, OEMs and vision system manufacturers can skip the weeks of development otherwise needed to make the image sensor communicate with the host processor, and focus instead on their differentiating software and overall system design. Optical modules, such as those provided by Teledyne e2v, go one step further toward a turnkey solution by integrating the lens within the module, offering a complete package from optics to sensor board to drivers.

Figure 2: The new module (right) allows direct connection to ready-made processing boards (left) via wiring without the need to design any other adapter boards.

Improving Energy Efficiency for Greater Autonomy

Devices powered by small batteries are the most obvious applications to benefit from embedded vision, since an external computer would rule out portability. To reduce system energy consumption, image sensors now include a range of features that help system designers save power.

From the sensor’s perspective, there are several ways to reduce the power consumption of a vision system without sacrificing frame rate. The simplest is to minimize the sensor’s dynamic operation at the system level by keeping it in standby or idle mode for as long as possible. Standby mode reduces the sensor’s power consumption to less than 10% of that in active mode by switching off the analog circuits. Idle mode halves power consumption and allows the sensor to resume capturing images within microseconds.
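To get a feel for what these modes are worth, the sketch below estimates the average sensor power for a duty-cycled capture. It uses the ratios quoted above (idle at 50% of active power, standby at 10% as an upper bound). The active power, capture time, and trigger period are hypothetical values chosen for illustration, and wake-up time and transition energy are ignored.

```python
def average_sensor_power_mw(active_mw, active_time_s, period_s, low_power_fraction):
    """Average power when the sensor is active for active_time_s out of every
    period_s and otherwise rests in a mode drawing low_power_fraction of
    the active power."""
    if not 0 < active_time_s <= period_s:
        raise ValueError("active_time_s must be in (0, period_s]")
    duty = active_time_s / period_s
    return active_mw * (duty + (1 - duty) * low_power_fraction)

# Hypothetical operating point: 200 mW active, one 10 ms capture per second.
ACTIVE_MW = 200.0
CAPTURE_S = 0.010
PERIOD_S = 1.0

always_on = average_sensor_power_mw(ACTIVE_MW, CAPTURE_S, PERIOD_S, 1.0)
idle = average_sensor_power_mw(ACTIVE_MW, CAPTURE_S, PERIOD_S, 0.5)      # idle: half power
standby = average_sensor_power_mw(ACTIVE_MW, CAPTURE_S, PERIOD_S, 0.1)   # standby: <10%

print(f"always on: {always_on:.1f} mW")
print(f"idle between captures: {idle:.1f} mW")
print(f"standby between captures: {standby:.1f} mW")
```

The lower the capture duty cycle, the closer the average power gets to the resting-mode figure, which is why staying in standby or idle for as long as possible pays off. In practice the choice also depends on how quickly the sensor must wake up.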

Another way to save energy is to design sensors on more advanced lithography nodes. The smaller the technology node, the lower the voltage required to switch the transistors, which reduces dynamic power consumption because it is proportional to the square of the voltage: P_dynamic ∝ C × V². Compared with pixels produced ten years ago on 180 nm technology, current 110 nm designs not only shrink the transistors but also lower the digital supply voltage from 1.8 V to 1.2 V. The next generation of sensors will use the 65 nm node, making embedded vision applications even more energy-efficient.
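As a quick worked example of the P_dynamic ∝ C × V² relation, the snippet below computes the ratio of dynamic power at 1.2 V to that at 1.8 V. It assumes the switched capacitance and switching activity stay the same, which is a simplification: in a smaller node the capacitance usually drops as well, so the real saving may be larger.

```python
def dynamic_power_ratio(v_new, v_old, c_new=1.0, c_old=1.0):
    """Ratio of dynamic power after/before, using P proportional to C * V^2.

    Capacitances default to equal values (an assumption), so only the
    voltage change is accounted for.
    """
    return (c_new * v_new ** 2) / (c_old * v_old ** 2)

ratio = dynamic_power_ratio(v_new=1.2, v_old=1.8)
print(f"dynamic power ratio: {ratio:.2f}")   # 0.44
print(f"saving from voltage alone: {(1 - ratio) * 100:.0f}%")
```

In other words, the voltage reduction from 1.8 V to 1.2 V alone cuts the dynamic power of the digital circuits to roughly 44% of its former value.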

Finally, choosing the right image sensor can, under certain conditions, reduce the energy consumed by LED lighting. Some systems must use active illumination, for example to generate 3D maps, to freeze motion, or simply to enhance contrast with sequential pulses at specific wavelengths. In these cases, lowering the image sensor’s noise in low-light conditions allows lower power consumption: with less sensor noise, engineers can choose to reduce the drive current or the number of LEDs integrated into the embedded vision system. In other cases, when image capture and the LED flash are triggered by an external event, choosing the appropriate sensor readout architecture can save significant energy. With a conventional rolling shutter sensor, the LEDs must stay fully on for the exposure of the entire frame, whereas a global shutter sensor allows the LEDs to be switched on for only part of the frame time. Therefore, replacing a rolling shutter sensor with a global shutter sensor that uses in-pixel correlated double sampling (CDS) can save illumination costs while keeping noise as low as that of the CCD sensors used in microscopes.
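The difference in LED on-time between the two shutter types can be sketched with a simple timing model. It assumes that in a rolling shutter each row starts its exposure one line time after the previous row, so the LED must stay on from the start of the first row to the end of the last one, while in a global shutter all rows expose simultaneously. The exposure time, row count, line time, and LED power are hypothetical values, not taken from any datasheet.

```python
def led_on_time_rolling_s(exposure_s, rows, line_time_s):
    """LED on-time so that every row is lit for its whole exposure.

    Row k starts exposing k * line_time_s after row 0, so the light must
    cover the first row's start through the last row's end.
    """
    return exposure_s + (rows - 1) * line_time_s

def led_on_time_global_s(exposure_s):
    """All rows expose at once, so the LED only needs to cover one exposure."""
    return exposure_s

def led_energy_mj(led_power_w, on_time_s):
    """Energy per flash in millijoules."""
    return led_power_w * on_time_s * 1000

# Hypothetical values: 1 ms exposure, 2048 rows, 10 us line time, 2 W LED.
EXPOSURE_S, ROWS, LINE_TIME_S, LED_POWER_W = 1e-3, 2048, 10e-6, 2.0

rolling = led_energy_mj(LED_POWER_W, led_on_time_rolling_s(EXPOSURE_S, ROWS, LINE_TIME_S))
global_ = led_energy_mj(LED_POWER_W, led_on_time_global_s(EXPOSURE_S))

print(f"rolling shutter: {rolling:.2f} mJ per frame")
print(f"global shutter:  {global_:.2f} mJ per frame")
```

With short exposures the readout skew dominates the rolling shutter on-time, so the saving from a global shutter grows as the exposure gets shorter relative to the frame readout time.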

On-Chip Functions Pave the Way for Vision System Programming

Taken to its extreme, the concept of embedded vision leads to a fully customized image sensor that integrates all processing functions (a system-on-chip) in a 3D-stacked design for optimized performance and power consumption. However, the cost of developing such a product is very high. Reaching this level of integration with fully custom sensors is not out of the question in the long run, but we are currently in a transitional phase in which certain functions are embedded directly into the sensor to reduce the computational load and shorten processing time.

For example, in barcode reading applications, Teledyne e2v has patented technology that embeds a proprietary barcode recognition algorithm into the sensor chip, which can locate the position of barcodes within each frame, allowing the image signal processor to focus only on these areas and improving data processing efficiency.

Figure 3: Teledyne e2v SNAPPY five-megapixel chip automatically identifies barcode positions.

Another function that reduces processing load and optimizes “good” data is Teledyne e2v’s patented fast exposure mode, which allows the sensor to automatically adjust exposure times to avoid saturation under varying lighting conditions. This function optimizes processing time as it adapts to fluctuations in illumination within a single frame, and this rapid response minimizes the number of “bad” images that the processor needs to handle.

These functions are often specific and require a good understanding of the customer’s application. As long as there is sufficient understanding of the application, various other on-chip functions can be designed to optimize embedded vision systems.

Reducing Weight and Size to Minimize Application Space

Another major requirement for embedded vision systems is that they fit into tight spaces or be lightweight, for example to extend the operating time of handheld devices. This is why most embedded vision systems currently use low-resolution sensors with a small optical format, ranging from 1 MP to 5 MP.

Reducing the pixel size is just the first step in reducing the footprint and weight of an image sensor. The current 65 nm process allows the global shutter pixel size to be reduced to 2.5 µm without compromising electro-optical performance. This manufacturing process enables full HD global shutter CMOS image sensors to meet the mobile market’s requirement of an optical format below 1/3 inch.
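The claim can be checked with a short calculation of the active-area diagonal. The sketch assumes a full HD array of 1920 × 1080 pixels and compares the result with the nominal 1/3-inch format, conventionally taken as 4.8 mm × 3.6 mm (6.0 mm diagonal). Optical format names are nominal, so the comparison is indicative rather than exact.

```python
import math

def sensor_diagonal_mm(h_pixels, v_pixels, pitch_um):
    """Diagonal of the active pixel array in millimeters."""
    return math.hypot(h_pixels, v_pixels) * pitch_um / 1000.0

# Nominal 1/3-inch optical format: 4.8 mm x 3.6 mm (assumed convention).
ONE_THIRD_INCH_DIAGONAL_MM = math.hypot(4.8, 3.6)

diagonal = sensor_diagonal_mm(1920, 1080, 2.5)
print(f"full HD at 2.5 um: {diagonal:.2f} mm diagonal")
print(f"nominal 1/3-inch diagonal: {ONE_THIRD_INCH_DIAGONAL_MM:.2f} mm")
print("fits within 1/3 inch:", diagonal < ONE_THIRD_INCH_DIAGONAL_MM)
```

At 2.5 µm the full HD array has a diagonal of about 5.5 mm, which fits inside the nominal 6.0 mm of a 1/3-inch format; a 3 µm pixel would already exceed it.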

Another major way to reduce sensor weight and footprint is to shrink the package. Wafer-level packaging has grown rapidly in the market over the past few years, especially in mobile, automotive, and medical applications. Compared with the traditional ceramic (CLGA) packages commonly used in the industrial market, wafer-level fan-out packages and chip-scale packages enable higher-density connections, making them excellent solutions for lightweight and compact image sensors in embedded systems. For Teledyne e2v’s 2MP sensor, combining wafer-level packaging with a smaller pixel size has shrunk the sensor footprint to a quarter of its former size in just five years.

Figure 4: The typical evolution of image sensor sizes since 2016, driven by packaging technology improvements and pixel size reductions.

Looking ahead, we anticipate new technologies that will further enable the smaller sensor sizes required for embedded vision systems.

3D stacking is an innovative technique for producing semiconductor devices. The principle is to manufacture the different circuit blocks on separate wafers and then stack and interconnect them using copper-to-copper bonding and through-silicon vias (TSVs). Because the dies overlap in several layers, 3D stacking allows a smaller footprint than a conventional sensor. In a 3D-stacked image sensor, the readout and processing blocks can be moved beneath the pixel array and row decoders. The sensor footprint shrinks accordingly, and more processing resources can be added to offload the image signal processor.

Figure 5: 3D chip stacking technology allows the pixel array, analog circuits, and digital circuits to overlap, and even allows additional application-specific processing layers to be added, while reducing sensor area.

However, 3D stacking still faces some challenges before it can be widely adopted in the image sensor market. First, it is an emerging technology; second, the additional process steps make it costly, with chip costs more than three times those of conventional technology. 3D stacking will therefore mainly be the choice for embedded vision systems that require high performance or a very small footprint.

In summary, embedded vision can be described as a “lightweight” vision technology that can be used by many types of companies, including OEMs, system integrators, and standard camera manufacturers. “Embedded” is a generic term that covers many different applications, so no single list of specifications can define it. There are nevertheless some general rules for optimizing embedded vision systems: the market drivers are usually not extreme speed or ultra-high sensitivity, but size, weight, power, and cost. The image sensor has a major influence on all four, so it must be selected carefully to optimize the overall performance of the embedded vision system.

Choosing the right image sensor gives embedded designers greater flexibility, not only saving bill of materials cost but also reducing the footprint of lighting and optical components. Even more important than the image sensor itself, the emergence of ready-to-use board-level solutions in the form of imaging modules paves the way for further optimization of size, weight, power, and cost. Combined with cost-effective, deep-learning-optimized image signal processors from the consumer market, they significantly reduce development cost and time without adding complexity.