Author: Anand V Kulkarni, Technical Director, Atria Logic Pvt Ltd, Bangalore (India)
Atria Logic has ported its H.264 codec IP (the AL-H264E-4KI422-HW encoder and the AL-H264D-4KI422-HW decoder) to the Xilinx Zynq Z-7045 SoC, enabling UHD 4K@60fps video streaming, as shown in the figure below:
Block diagram of the H.264 codec IP developed by Atria Logic
The AL-H264E-4KI422-HW is a hardware-based, feature-rich, low-latency, high-quality H.264 (AVC) encoder IP core for UHD Hi422 intra coding. It pairs with the AL-H264D-4KI422-HW low-latency decoder.
The IP core offers the following features:
- Fully modular design that supports user customization and extension
- Intra-only H.264 Hi422 Level 5.1 encoding and decoding
- Integrated HDMI 2.0 receive and transmit subsystems
- 8-bit and 10-bit encoding and decoding
- Support for RGB and YUV 4:2:2/4:4:4
- Low latency of about 0.3 milliseconds
- Variable Bit Rate (VBR) and Constant Bit Rate (CBR) modes
- Video quality of 0.99 SSIM, or 50 dB PSNR or higher
- Video processing subsystems for pre- and post-processing, including color space conversion, video scaling, and chroma subsampling
- Gigabit Ethernet data stream output
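To make the PSNR figure in the feature list concrete, here is a minimal sketch of how PSNR is conventionally computed for 10-bit samples. It is not Atria Logic's measurement procedure; the function name `psnr` and the sample values are hypothetical and chosen only for illustration.

```python
import math

def psnr(reference, decoded, bit_depth=10):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    peak = (1 << bit_depth) - 1  # 1023 for 10-bit video
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 10-bit luma samples; the decoded values differ by at most 2 codes.
ref = [64, 300, 512, 700, 940, 512, 128, 820]
dec = [65, 299, 512, 702, 939, 512, 127, 821]
print(f"PSNR = {psnr(ref, dec):.1f} dB")
```

Because the peak value grows with bit depth, the same absolute error yields a higher PSNR at 10 bits than at 8 bits, so a PSNR figure is only meaningful alongside the bit depth it was measured at.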
To evaluate our UHD encoder and decoder IP cores against the 4K@60fps performance requirement, we needed a flexible, powerful platform. We selected the Xilinx ZC706 evaluation kit, which is based on the Zynq Z-7045 SoC, for the following reasons:
- An off-the-shelf FMC expansion board provides the 4K HDMI video interface: the TB-FMC-HDMI 4K 2.0 daughter card
- The Zynq Z-7045 SoC has enough programmable logic to hold the encoder and decoder IP while meeting the stringent timing needed to reach the performance target
- The Zynq SoC's processing system integrates a dual-core ARM Cortex-A9 MPCore processor, which lets us adapt the driver and application software and add custom features such as an application GUI
The H.264 encoder supports the H.264 Hi422 (High 4:2:2) profile at Level 5.1 (3840x2160p30) with intra-only encoding. Support for 10-bit video means there is no grayscale or color degradation in the form of banding. Support for YUV 4:2:2 video gives better color separation, which is especially noticeable for reds, and makes the image clearer. This level of video quality is crucial for medical imaging applications.
Block diagram of the Atria Logic UHD H.264 encoder IP
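A quick calculation shows what 10-bit 4:2:2 costs in raw bandwidth. The sketch below counts active pixels only and uses 60 frames per second as an input parameter; the function and table names are hypothetical, introduced here for illustration.

```python
from fractions import Fraction

# Average samples per pixel for common chroma formats (one luma plus two chroma planes).
SAMPLES_PER_PIXEL = {
    "4:4:4": Fraction(3),
    "4:2:2": Fraction(2),     # chroma halved horizontally
    "4:2:0": Fraction(3, 2),  # chroma halved horizontally and vertically
}

def raw_bitrate_bps(width, height, fps, bit_depth, chroma):
    """Uncompressed active-video bit rate in bits per second."""
    return int(width * height * fps * bit_depth * SAMPLES_PER_PIXEL[chroma])

uhd_422_10 = raw_bitrate_bps(3840, 2160, 60, 10, "4:2:2")
uhd_420_8 = raw_bitrate_bps(3840, 2160, 60, 8, "4:2:0")
print(f"10-bit 4:2:2: {uhd_422_10 / 1e9:.2f} Gbit/s")
print(f" 8-bit 4:2:0: {uhd_420_8 / 1e9:.2f} Gbit/s")
```

Under these assumptions the 10-bit 4:2:2 stream carries roughly two thirds more raw data than an 8-bit 4:2:0 stream of the same resolution and frame rate, and is far larger than a gigabit Ethernet link can carry uncompressed.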
Intra-only coding lets the H.264 encoder achieve frame-rate latency, and the macroblock pipeline architecture reduces latency further, to about 0.3 milliseconds. The pipeline processes eight pixels per clock, which enables real-time 4K@60fps video encoding.
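The eight-pixels-per-clock figure can be related to a minimum clock frequency with simple arithmetic. The sketch below is an estimate, not a statement of the core's actual clock: the 4400x2250 total raster is the standard CTA-861 timing for 3840x2160p60, and whether the pipeline has to keep pace with blanking intervals is an assumption.

```python
def required_clock_mhz(width, height, fps, pixels_per_clock):
    """Minimum clock in MHz needed to consume every pixel of a raster in real time."""
    return width * height * fps / pixels_per_clock / 1e6

# Active pixels only (3840x2160 at 60 fps, 8 pixels per clock).
active = required_clock_mhz(3840, 2160, 60, 8)

# Full raster including blanking (assumption: the pipeline runs at the video line rate).
total = required_clock_mhz(4400, 2250, 60, 8)

print(f"active-only: {active:.3f} MHz")
print(f"with blanking: {total:.3f} MHz")
```

Either way, processing eight pixels per clock brings the required frequency well below the 594 MHz pixel clock of a 4K@60fps HDMI signal, which is what makes the timing achievable in programmable logic.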
This H.264 encoder developed by Atria Logic uses only 78% of the programmable logic and DSP resources of the Zynq Z-7045 SoC, and 55% of the available RAM, leaving enough space for other necessary circuits.
The H.264 decoder supports the H.264 Hi422 (High 4:2:2) profile at Level 5.1 (3840x2160p30) with intra-only decoding. Like the encoder, it supports 10-bit video, so there is no grayscale or color degradation in the form of banding. The decoder also supports the YUV 4:2:2 video format and uses a pipelined architecture to achieve frame-rate latency.
Block diagram of the Atria Logic UHD H.264 decoder IP
Low latency is crucial for any closed-loop human/machine application. When the AL-H264E-4KI422-HW encoder and the AL-H264D-4KI422-HW low-latency decoder are connected over an IP network, the glass-to-glass latency is approximately 0.6 milliseconds (excluding transmission time). This is about the latency time of two frames.
The implementation of the Atria Logic H.264 decoder occupies only 68% of the programmable logic resources, 35% of the DSP resources, and 45% of the RAM of the Zynq Z-7045 SoC, leaving enough space for other necessary circuits.
The HDMI subsystem consists of two main modules, the Xilinx LogiCORE HDMI TX and RX subsystems, as shown in the figure below:
The HDMI transceiver (GTX) module handles the serial HDMI TX and RX links, converting between the serial data streams and the on-chip parallel data streams. It uses the Zynq SoC's high-speed GT transceivers as the HDMI PHY (physical layer) interface.
The TX subsystem consists of the transmitter module, an AXI video bridge, a video timing controller, and an optional HDCP module. The AXI video stream carries two or four pixels per clock into the HDMI TX subsystem and supports 8, 10, and 12 bits per component. This stream follows the video protocol defined in the AXI Reference Guide (UG761). The video bridge converts the incoming AXI-Stream video into native video, and the video timing controller generates the native video timing. The audio AXI-Stream interface carries multi-channel uncompressed audio into the HDMI TX subsystem. The ARM Cortex-A9 processor in the Zynq Z-7045 SoC controls the transmitter module through the CPU interface.
The HDMI RX subsystem has three AXI interfaces. The video bridge converts the captured native video into an AXI-Stream and outputs it through the AXI video interface, following the protocol defined in the AXI Reference Guide (UG761). The video timing controller measures the video timing, and received audio is sent out through the AXI-Stream audio interface. The CPU interface carries control and status data to and from the peripherals.
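The pixels-per-clock and bit-depth options above determine how wide the video stream's data bus is and how fast it must run. The following sketch assumes three color components per pixel and relies on the AXI4-Stream rule that TDATA is a whole number of bytes wide; the function names are hypothetical, and the exact widths used by a particular core configuration should be taken from its product guide.

```python
def axis_video_tdata_bits(pixels_per_clock, bits_per_component, components=3):
    """Video payload width rounded up to a whole number of bytes (AXI4-Stream TDATA rule)."""
    raw = pixels_per_clock * bits_per_component * components
    return (raw + 7) // 8 * 8

def stream_clock_mhz(pixel_clock_mhz, pixels_per_clock):
    """Stream clock needed to sustain a given pixel clock at N pixels per clock."""
    return pixel_clock_mhz / pixels_per_clock

for ppc in (2, 4):
    for bpc in (8, 10, 12):
        print(f"{ppc} ppc, {bpc} bpc -> TDATA {axis_video_tdata_bits(ppc, bpc)} bits")

# 594 MHz is the CTA-861 pixel clock for 3840x2160p60.
print(stream_clock_mhz(594.0, 2), "MHz at 2 pixels per clock")
print(stream_clock_mhz(594.0, 4), "MHz at 4 pixels per clock")
```

Moving from two to four pixels per clock doubles the bus width but halves the clock the programmable logic has to meet, which is the usual trade-off when carrying 4K@60fps video on chip.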
The HDCP module is optional and not included in the standard IP core configuration.