Author: Marie-Charlotte Leclerc, Teledyne e2v
According to Mems Consulting, new imaging applications are booming, from collaborative robots in “Industry 4.0” to drones for firefighting or agriculture, biometric facial recognition, and handheld medical devices for home care. A key factor driving these new applications is that embedded vision is more prevalent than ever before. Embedded vision is not a new concept; it simply describes a system in which image acquisition, control, and processing happen on the device itself, without an external computer. It has long been used in industrial quality control, the most familiar example being “smart cameras”.
In recent years, economically viable hardware components from the consumer market have significantly reduced bill of materials (BOM) costs and product sizes compared with earlier computer-based solutions. For example, small system integrators and OEMs can now buy single-board computers or system-on-modules such as the NVIDIA Jetson in small quantities, while larger OEMs can source processors such as the Qualcomm Snapdragon or the Intel Movidius Myriad 2 directly. On the software side, off-the-shelf libraries shorten the development time of dedicated vision systems and ease configuration, even for small production runs.
The second change driving the development of embedded vision systems is the emergence of machine learning, which allows a neural network to be trained in the laboratory and then deployed directly onto the processor, so that it can automatically recognize features and make decisions in real time.
Providing solutions suitable for embedded vision systems is crucial for imaging companies targeting these high-growth applications. Image sensors play an important role in large-scale adoption, as they directly affect the performance and design of embedded vision systems, and the key drivers can be summarized by the acronym “SWaP-C”: decreasing Size, Weight, Power, and Cost.
1. Cost Reduction is Crucial
The main accelerator for new embedded vision applications is a price point that meets market expectations, and the cost of the vision system is a major constraint in reaching that price.
1.1 Saving on Optics Costs
The first way to reduce the cost of a vision module is to shrink the product size, for two reasons: first, the smaller the image sensor’s pixels, the more dies can be produced from a single wafer; second, a smaller sensor can use smaller, lower-cost optics. Both reduce the inherent cost. For example, Teledyne e2v’s Emerald 5M sensor reduces the pixel size to 2.8 µm, allowing S-mount (M12) lenses to be used on a five-megapixel global shutter sensor, which leads to direct cost savings: an entry-level M12 lens costs about $10, while larger C-mount or F-mount lenses cost 10 to 20 times more. Reducing size is therefore an effective way to lower the cost of an embedded vision system.
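As a rough illustration of the first point, the sketch below estimates how pixel pitch drives die area and gross dies per wafer. The 2560 × 2048 array, the 4.5 µm comparison pitch, and the 300 mm wafer are assumptions chosen for illustration; only the 2.8 µm pitch comes from the article.

```python
# Back-of-the-envelope sketch: how pixel pitch drives die area and dies per wafer.
# Assumptions (not from the article): a 5 MP array of 2560 x 2048 pixels, a 4.5 um
# "previous generation" pitch for comparison, and a 300 mm wafer; peripheral
# circuitry and wafer edge losses are ignored.
import math

WIDTH_PX, HEIGHT_PX = 2560, 2048      # assumed 5 MP array
WAFER_DIAMETER_MM = 300.0             # assumed wafer size

def active_area_mm2(pitch_um: float) -> float:
    """Pixel-array area in mm^2 for a given pixel pitch."""
    return (WIDTH_PX * pitch_um / 1000.0) * (HEIGHT_PX * pitch_um / 1000.0)

def gross_dies(pitch_um: float) -> float:
    """Rough gross dies per wafer: wafer area / die area, with no edge correction."""
    wafer_area = math.pi * (WAFER_DIAMETER_MM / 2.0) ** 2
    return wafer_area / active_area_mm2(pitch_um)

for pitch in (4.5, 2.8):              # 2.8 um is the Emerald 5M pitch cited above
    print(f"{pitch} um pitch: {active_area_mm2(pitch):.0f} mm^2, "
          f"~{gross_dies(pitch):.0f} gross dies per wafer")
# Shrinking from 4.5 um to 2.8 um yields roughly (4.5/2.8)^2 ~ 2.6x more dies.
```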
For image sensor manufacturers, this reduction in optics cost has a second impact on design: in general, the lower the cost of the optics, the less ideal the angle of incidence at the sensor. Low-cost optics therefore call for shifted microlenses to be designed above the pixels to compensate and to focus light arriving at wide angles.
1.2 Low-Cost Sensor Interfaces
In addition to optical optimization, the choice of sensor interface also indirectly affects the cost of the vision system. The MIPI CSI-2 interface, originally developed by the MIPI Alliance for the mobile industry, is the most suitable choice for saving cost. It has been widely adopted by most ISPs and is beginning to be adopted in the industrial market because it offers low-cost integration with the system-on-chip (SoC) and system-on-module (SoM) platforms from companies such as NXP, NVIDIA, Qualcomm, and Intel. Designing a CMOS image sensor with a MIPI CSI-2 interface allows the sensor’s data to be transmitted directly to the host SoC or SoM of the embedded system without any intermediate converter bridge, which saves cost and PCB space. The advantage is even more pronounced in multi-sensor embedded systems (such as 360-degree panoramic systems).
However, these benefits are limited by the MIPI interface’s connection length of about 20 cm, which may not be optimal in remote configurations where the sensor is far from the host processor. In such setups, camera-board solutions that trade some miniaturization for a longer-reach interface are a better choice. Off-the-shelf options can be integrated, such as camera boards from industrial camera manufacturers (Flir, AVT, Basler, and others), which are typically available with MIPI or USB3 interfaces, the latter supporting cable lengths of 3 to 5 meters or more.
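On typical embedded Linux platforms, both a MIPI CSI-2 sensor and a USB3 camera board are usually exposed as a V4L2 video device, so host-side capture code can stay largely interface-agnostic. The minimal sketch below assumes such a setup; the device index, resolution, and any vendor-specific pipeline (for example GStreamer on some SoMs) are illustrative assumptions.

```python
# Minimal capture sketch, assuming the sensor (MIPI CSI-2 or USB3 camera board) is
# already exposed by the platform as a V4L2 device such as /dev/video0.
# Device index and resolution are illustrative; vendor-specific capture pipelines
# are not shown.
import cv2

cap = cv2.VideoCapture(0)                   # /dev/video0 on most embedded Linux boards
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)     # assumed Full HD mode
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)

ok, frame = cap.read()                      # grab one frame from the sensor
if ok:
    print("Captured frame:", frame.shape)   # e.g. (1080, 1920, 3)
cap.release()
```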
1.3 Reducing Development Costs
When investing in a new product, rising development costs are often a challenge; non-recurring engineering can run into millions of dollars and puts pressure on time-to-market. For embedded vision this pressure is even greater, because modularity (i.e., the ability to switch the product between different image sensors) is an important consideration for integrators. Fortunately, development costs can be contained by offering a degree of cross-compatibility between sensors, for example by defining a family of components that share the same pixel architecture for stable optoelectronic performance, that share a single front-end mechanical design with a common optical center, and that simplify evaluation, integration, and the supply chain through compatible PCB assemblies.
To simplify camera board design (i.e., to allow one board to serve multiple sensors), there are two approaches to sensor packaging. Pin-to-pin compatibility is preferred by camera board designers because multiple sensors share the same circuitry and control signals, so the assembly can change sensors without any change to the PCB design. The alternative is footprint-compatible sensors, which let the same PCB outline accommodate several sensors, but the designer may then have to handle differences in interface and wiring for each sensor.
Figure 1 Image sensors can be designed to provide pin compatibility (left) or size compatibility (right) for proprietary PCB layout design
2. Energy Efficiency Provides Better Standalone Capability
Battery-powered compact devices are the most obvious beneficiaries of embedded vision, because a tethered external computer rules out any portable application. To reduce system energy consumption, image sensors now incorporate a number of features that allow system designers to save power.
From the sensor’s perspective, there are several ways to reduce the power consumption of an embedded vision system without compromising the acquisition frame rate. The simplest, at the system level, is to minimize the sensor’s dynamic operation by keeping it in standby or idle mode for as long as possible. Standby mode shuts down the sensor’s analog circuitry and reduces its power consumption to less than 10% of the operating mode. Idle mode roughly halves power consumption while allowing the sensor to restart image acquisition within microseconds.
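A quick duty-cycling estimate, using the figures quoted above (standby at under 10% of active power, idle at about half), shows how much these modes matter; the 150 mW active power and the 5% acquisition duty cycle below are illustrative assumptions, not datasheet values.

```python
# Duty-cycling sketch using the figures quoted above: standby at <10% of active
# power and idle at ~50%. The 150 mW active power and the 5% acquisition duty
# cycle are illustrative assumptions, not datasheet values.
P_ACTIVE_MW = 150.0                 # assumed active (streaming) power
P_IDLE_MW = 0.5 * P_ACTIVE_MW
P_STANDBY_MW = 0.1 * P_ACTIVE_MW
DUTY = 0.05                         # assumed: sensor streams 5% of the time

def average_power(p_rest_mw: float) -> float:
    """Average power when the sensor streams for DUTY and rests otherwise."""
    return DUTY * P_ACTIVE_MW + (1.0 - DUTY) * p_rest_mw

print(f"Always on : {P_ACTIVE_MW:.1f} mW")
print(f"Idle rest : {average_power(P_IDLE_MW):.1f} mW")     # microsecond restart
print(f"Standby   : {average_power(P_STANDBY_MW):.1f} mW")  # slower wake-up, lowest power
```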
Another way to build power savings into the sensor design is to use a more advanced lithography node. The smaller the technology node, the lower the voltage needed to switch the transistors, and since dynamic power consumption scales with the square of the supply voltage, this directly reduces power. For instance, compared with the 180 nm process used for pixels ten years ago, moving to 110 nm not only shrinks the transistors but also lowers the digital supply voltage from 1.9 V to 1.2 V. The next generation of sensors will use a 65 nm technology node, making embedded vision applications even more energy-efficient.
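As a first-order check of that scaling, take P_dyn ∝ C·V²·f with the 1.9 V and 1.2 V figures quoted above, assuming switched capacitance and clock frequency stay unchanged (a simplification):

```python
# First-order estimate of digital dynamic power scaling with supply voltage,
# using P_dyn ~ C * V^2 * f and the 1.9 V -> 1.2 V figures quoted above.
# Switched capacitance and clock frequency are assumed unchanged (a simplification).
V_OLD, V_NEW = 1.9, 1.2
scaling = (V_NEW / V_OLD) ** 2
print(f"Relative dynamic power: {scaling:.2f} "
      f"(about {100 * (1 - scaling):.0f}% lower, all else being equal)")
```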
Finally, choosing the right image sensor can reduce LED power consumption in certain conditions. Some systems must use active illumination, for example for 3D mapping, for freezing motion, or simply to enhance contrast by sequentially pulsing light at specific wavelengths. In these cases, lowering the image sensor’s noise floor in low-light conditions allows lower illumination power: with a less noisy sensor, engineers can reduce the LED drive current or the number of LEDs integrated into the embedded vision system. In other cases, where image capture and the LED flash are triggered by an external event, choosing the right sensor readout architecture can bring significant energy savings. With a traditional rolling shutter sensor, the LED must stay fully on while the whole frame is exposed, whereas a global shutter sensor allows the LED to be switched on for only part of the frame time. Therefore, replacing a rolling shutter sensor with a global shutter sensor that has in-pixel correlated double sampling (CDS) saves on illumination while keeping noise as low as the CCD sensors used in microscopy.
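The sketch below puts rough numbers on that readout-architecture argument; the 1 ms exposure and 10 ms frame readout are assumed values chosen only to show the relative effect.

```python
# Illustration of the LED on-time argument above. The 1 ms exposure and 10 ms
# frame readout are assumed values, not figures from the article.
EXPOSURE_MS = 1.0          # assumed common exposure window
READOUT_MS = 10.0          # assumed time to read out all rows

# Rolling shutter: rows are exposed at staggered times, so the LED must stay on
# while every row is being exposed (exposure + full readout in the worst case).
led_on_rolling = EXPOSURE_MS + READOUT_MS

# Global shutter: all pixels integrate simultaneously, so the LED only needs to
# cover the shared exposure window.
led_on_global = EXPOSURE_MS

print(f"Rolling shutter LED on-time: {led_on_rolling:.1f} ms per frame")
print(f"Global shutter  LED on-time: {led_on_global:.1f} ms per frame")
print(f"Illumination energy ratio  : ~{led_on_rolling / led_on_global:.0f}x")
```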
3. On-Chip Functionality Paves the Way for Application-Specific Vision Systems
Taken to its extreme, the embedded vision concept leads to fully custom image sensors that integrate all processing functions (a system-on-chip), 3D-stacked to optimize performance and power consumption. However, developing such a product is expensive, even if fully custom sensors reaching this level of integration are not out of the question in the long run. We are currently in a transitional phase, in which certain functions are embedded directly into the sensor to reduce the computational load and speed up processing.
For example, in barcode reading applications, Teledyne e2v has patented a technology that embeds a proprietary barcode-finding algorithm in the sensor chip, which locates the barcodes within each frame so that the image signal processor only has to work on those regions, improving data-processing efficiency.
Figure 2 Teledyne e2v Snappy five-megapixel chip automatically identifies barcode positions
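The host-side sketch below illustrates the idea: if the sensor reports where the barcodes sit, downstream code only crops and processes those regions. The frame size and ROI list here are hypothetical, and the actual interface for reading ROI metadata from the sensor is vendor-specific and not shown.

```python
# Host-side sketch of the ROI idea described above: crop only the regions the
# sensor has flagged as containing barcodes, then hand those crops to the decoder.
# The frame size and (x, y, w, h) ROI list are hypothetical values.
import numpy as np

frame = np.zeros((2048, 2560), dtype=np.uint8)          # assumed 5 MP monochrome frame
rois = [(1200, 400, 300, 180), (300, 1500, 280, 160)]   # hypothetical ROIs from the sensor

crops = [frame[y:y + h, x:x + w] for (x, y, w, h) in rois]

roi_pixels = sum(c.size for c in crops)
print(f"Pixels passed to the decoder: {roi_pixels} "
      f"({100 * roi_pixels / frame.size:.1f}% of the full frame)")
```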
Another feature that reduces processing load and optimizes “good” data is Teledyne e2v’s patented fast exposure mode, which allows the sensor to automatically adjust exposure time to avoid saturation under varying lighting conditions. This feature optimizes processing time as it adapts to fluctuations in illumination within a single frame, and this rapid response minimizes the number of “bad” images the processor needs to handle.
These features are often specific and require a good understanding of the customer’s applications. As long as there is sufficient understanding of the application, various other on-chip functionalities can be designed to optimize embedded vision systems.
4. Reducing Weight and Size to Fit Tight Application Spaces
Another major requirement of embedded vision systems is the ability to fit into tight spaces and be light enough for handheld devices, or to extend the runtime of battery-powered products. This is why most embedded vision systems today use low-resolution, small-optical-format sensors of 1 MP to 5 MP.
Reducing pixel size is only the first step in shrinking the footprint and weight of the image sensor. The current 65 nm process allows global shutter pixels to shrink to 2.5 µm without sacrificing optoelectronic performance. This manufacturing process enables Full HD global shutter CMOS image sensors to meet the mobile market’s requirement of optical formats smaller than 1/3 inch.
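A quick check of that format claim: a Full HD (1920 × 1080) array at a 2.5 µm pitch gives an active-area diagonal of about 5.5 mm, which fits within the roughly 6 mm image circle commonly associated with the 1/3-inch format (the 6 mm figure is a common approximation, not an exact standard).

```python
# Verify the optical-format claim above: Full HD array at a 2.5 um pitch versus
# a ~6 mm (1/3-inch) image circle. The 6 mm diagonal is an approximation.
import math

PITCH_UM = 2.5
w_mm = 1920 * PITCH_UM / 1000.0     # ~4.8 mm
h_mm = 1080 * PITCH_UM / 1000.0     # ~2.7 mm
diag_mm = math.hypot(w_mm, h_mm)    # ~5.5 mm

print(f"Active area: {w_mm:.2f} x {h_mm:.2f} mm, diagonal {diag_mm:.2f} mm")
print("Fits within a ~6 mm (1/3-inch) image circle:", diag_mm < 6.0)
```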
The other key technology for reducing sensor weight and footprint is shrinking the package. Chip-scale packaging has grown rapidly in the market over the past few years, most visibly in mobile, automotive, and medical applications. Compared with the ceramic land grid array (CLGA) packages traditionally used in the industrial market, chip-scale fan-out packaging achieves higher connection density and is an excellent answer to the weight and size constraints of image sensors in embedded systems. For example, the chip-scale package of Teledyne e2v’s Emerald 2M image sensor is only half the height of the ceramic package and 30% smaller in footprint.
Figure 3 Comparison of the same chip using CLGA packaging (left) and wafer-level fan-out organic packaging (right), which can reduce footprint, thickness, and cost
Looking ahead, we expect new technologies to further achieve smaller sensor sizes required for embedded vision systems.
3D stacking is an innovative semiconductor manufacturing technology in which the different circuit blocks are fabricated on separate wafers and then stacked and interconnected using copper-to-copper bonding and through-silicon vias (TSVs). Because the dies overlap in multiple layers, a 3D-stacked device achieves a smaller footprint than a traditional sensor. In a 3D-stacked sensor, the readout and processing dies can be placed beneath the pixel array and row decoder, so the sensor footprint is reduced, and additional processing resources can be added to offload the image signal processor.
Figure 4 3D chip stacking makes it possible to combine the pixel array, analog and digital circuits, and even additional application-specific processing dies, reducing the sensor area
However, for 3D stacking technology to gain widespread adoption in the image sensor market, several challenges remain. Firstly, it is an emerging technology, and secondly, it is more expensive due to the additional processing steps, making chip costs more than three times higher than those made using traditional technologies. Therefore, 3D stacking will mainly be a choice for high-performance or very small footprint embedded vision systems.
In summary, embedded vision can be thought of as a “lightweight” vision technology accessible to many types of companies, including OEMs, system integrators, and standard camera manufacturers. “Embedded” is a generic label that covers a wide range of applications, so no single list of characteristics fits them all. There are, however, several rules that apply when optimizing an embedded vision system: in general, the market drivers are not ultra-high speed or ultra-high sensitivity but size, weight, power, and cost. The image sensor has a major influence on all of these factors, so it must be chosen carefully to optimize the overall performance of the embedded vision system. The right image sensor gives embedded designers more flexibility, saves bill-of-materials cost, and reduces the footprint of the illumination and the optics. It also lets designers draw on the wide range of economical image signal processors from the consumer market, with optimized deep-learning capabilities, without facing additional complexity.