Sony holds a dominant position in the image sensor industry and continues to accelerate its development of new sensor technologies. The company recently announced sensors with built-in artificial intelligence, the world's first image sensors with integrated AI processing. The sensor has a 12.3-megapixel, 1/2.3-inch back-illuminated design and supports 4K video recording at 60 frames per second. It uses a stacked structure in which the pixel chip is stacked with a logic chip, so that captured image information is processed by AI before it is output. The sensor is intended for industries such as retail and manufacturing, for tasks such as foot-traffic monitoring and inventory detection. Once the technology matures, it may also find applications in the imaging industry.
Tokyo, Japan—Sony Corporation (hereinafter referred to as Sony) today announced the release of two intelligent visual sensors equipped with AI processing capabilities. These image sensors come with built-in AI processing, enabling high-speed edge AI processing while extracting only the necessary data, thereby reducing data transmission latency when using cloud services, ensuring privacy, and lowering power consumption and communication costs.
The newly released image sensors significantly broaden the research and development space for AI cameras, with the potential for widespread application in the retail and industrial equipment sectors, and will help build optimized systems connected to the cloud.
Intelligent Visual Sensors: Left: IMX500 Right: IMX501
The spread of the Internet of Things (IoT) has connected all types of devices to the cloud, and it is now common for the information obtained from these devices to be processed by AI in the cloud. However, the growing volume of information handled in the cloud raises several problems: increased data transmission latency that hinders real-time processing; privacy concerns from storing personally identifiable data in the cloud; and rising power consumption and communication costs associated with cloud services.
The new sensors feature a stacked structure composed of a pixel chip and a logic chip, with the logic chip providing AI image analysis and processing. The signals captured by the pixel chip are processed by AI on the sensor itself, which eliminates the need for high-performance processors or external memory and makes edge AI systems easier to develop. The sensor outputs metadata (semantic information related to the image data) instead of image information, thereby reducing data volume and addressing privacy concerns. In addition, the AI capability supports a wide range of applications, such as real-time object tracking enabled by high-speed AI processing. Users can also rewrite the embedded memory to select different AI models based on their needs or on where the system is used.
Main Features
· Image Sensor with AI Processing Capabilities
The pixel chip is back-illuminated with an effective pixel count of approximately 12.3 million, capable of capturing images with a wide field of view. In addition to traditional image sensor operation circuits, its logic chip is equipped with Sony’s proprietary DSP (Digital Signal Processor) specifically for AI signal processing and AI model storage. This configuration eliminates the need for high-performance processors or external storage, making it an ideal choice for edge AI systems.
· Metadata Output
The signals collected by the pixel chip pass through the ISP (Image Signal Processor) and then undergo AI computation on the logic chip, and the extracted information is output as metadata, reducing the amount of data that has to be handled. Because no image information is output, security risks are lower and privacy is better protected. Besides the images recorded by a conventional image sensor, users can select other data output formats according to their needs and applications, including ISP-format output images (YUV/RGB) and ROI (Region of Interest) images extracted from a specific area.
Selectable data output formats to meet various needs
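As a rough illustration of why metadata output reduces data volume, the following Python sketch estimates per-frame payload sizes for the three output options described above. The `OutputFormat` enum, the `payload_bytes` helper, the 3-bytes-per-pixel figure, the 640×480 ROI, and the 64-byte metadata record are all assumptions made for illustration; none of them comes from Sony's specifications.

```python
from enum import Enum


class OutputFormat(Enum):
    FULL_IMAGE = "full_image"  # ISP-format image (YUV/RGB)
    ROI_IMAGE = "roi_image"    # image cropped to a region of interest
    METADATA = "metadata"      # inference results only, no pixels


EFFECTIVE_PIXELS = 12_300_000  # approximate effective pixel count from the text
BYTES_PER_PIXEL = 3            # assumption: uncompressed 8-bit RGB


def payload_bytes(fmt: OutputFormat, roi_pixels: int = 0,
                  metadata_bytes: int = 0) -> int:
    """Estimate the bytes sent per frame for the chosen output format."""
    if fmt is OutputFormat.FULL_IMAGE:
        return EFFECTIVE_PIXELS * BYTES_PER_PIXEL
    if fmt is OutputFormat.ROI_IMAGE:
        return roi_pixels * BYTES_PER_PIXEL
    return metadata_bytes


full = payload_bytes(OutputFormat.FULL_IMAGE)
roi = payload_bytes(OutputFormat.ROI_IMAGE, roi_pixels=640 * 480)  # assumed ROI size
meta = payload_bytes(OutputFormat.METADATA, metadata_bytes=64)     # assumed record size

print(f"full image: {full:,} bytes")
print(f"ROI image:  {roi:,} bytes")
print(f"metadata:   {meta:,} bytes")
```

Under these assumptions the metadata record is several orders of magnitude smaller than an uncompressed full frame, which is the effect the text describes; real figures depend on the actual pixel format, compression, and model output.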
· High-Speed AI Processing
When video is recorded with a traditional image sensor, every frame must be sent out for AI processing, so the volume of transmitted data is large and real-time results are difficult to achieve. Sony's new sensors perform ISP processing and high-speed AI computation (3.1 milliseconds for MobileNet V1*1) on the logic chip, completing the entire process within a single video frame. This design enables high-precision, real-time object tracking while recording video.
*1 MobileNet V1: An image analysis AI model for object recognition on mobile devices.
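The "within a single video frame" claim can be checked with simple arithmetic: the 3.1-millisecond inference time has to be shorter than the frame period. The sketch below compares the two at 30 and 60 frames per second. The helper names are hypothetical, and the calculation covers only the quoted MobileNet V1 time, not ISP processing or readout, whose durations the text does not give.

```python
INFERENCE_MS = 3.1  # MobileNet V1 processing time quoted in the text


def frame_period_ms(fps: float) -> float:
    """Duration of one video frame in milliseconds."""
    return 1000.0 / fps


def fits_in_frame(fps: float, inference_ms: float = INFERENCE_MS) -> bool:
    """True if inference finishes before the next frame begins."""
    return inference_ms < frame_period_ms(fps)


for fps in (30, 60):
    period = frame_period_ms(fps)
    share = INFERENCE_MS / period
    print(f"{fps} fps: frame period {period:.2f} ms, "
          f"inference uses {share:.1%} of it, fits: {fits_in_frame(fps)}")
```

At 60 frames per second the frame period is about 16.67 ms, so a 3.1 ms inference leaves most of the frame time for the remaining steps.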
· Freely Selectable AI Models
Users can write their chosen AI models into the embedded memory, then rewrite and update them according to their needs or where the system is used. For example, when several cameras equipped with the new sensor are installed in a retail location, a single type of camera can serve different locations, circumstances, times, or purposes: installed at the entrance, it can count visitors; installed on a store shelf, it can detect stock shortages; installed on the ceiling, it can produce heat maps of store traffic (showing where many people gather). The AI model in a given camera can also be rewritten, for instance from one used for heat map detection to one used for recognizing consumer behavior.
Case study of camera usage in a shopping mall
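The retail example above amounts to a mapping from camera placement to AI model, with the option of replacing the model later. A minimal Python sketch of that idea follows. The `Camera` class, `provision` function, and all model names are hypothetical, and `load_model` merely stands in for rewriting the sensor's embedded memory; the actual tooling is not described in the text.

```python
from dataclasses import dataclass

# Hypothetical model names keyed by camera placement, following the retail example.
MODEL_FOR_PLACEMENT = {
    "entrance": "visitor_counting",
    "shelf": "stock_shortage_detection",
    "ceiling": "traffic_heat_map",
}


@dataclass
class Camera:
    placement: str
    model: str = ""

    def load_model(self, model: str) -> None:
        # Stand-in for rewriting the AI model held in embedded memory.
        self.model = model


def provision(placement: str) -> Camera:
    """Create a camera and load the model that matches its placement."""
    camera = Camera(placement)
    camera.load_model(MODEL_FOR_PLACEMENT[placement])
    return camera


cameras = [provision(p) for p in ("entrance", "shelf", "ceiling")]

# Later, repurpose the ceiling camera without changing the hardware.
ceiling = cameras[2]
ceiling.load_model("consumer_behavior_recognition")

for cam in cameras:
    print(f"{cam.placement}: {cam.model}")
```

The same `Camera` type is used for every placement; only the loaded model differs, which mirrors the single-camera-type point in the text.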
Main Parameters