Understanding eMMC Interface for High-Speed Circuits

eMMC is a common hardware interface in high-speed circuit design, typically used to store the operating system or critical product information. This article analyzes the eMMC protocol and combines it with some unusual bugs encountered in real projects, with the aim of helping readers master circuit design and problem analysis for the eMMC interface.

In simple terms, eMMC = NAND Flash + Controller + Standard Package. eMMC uses a 10-pin signal interface and packages standard NAND Flash and an MMC controller in a single BGA chip. Currently, single-chip capacities can reach 64GB, with a maximum data transfer rate of 200MB/s in HS200 mode (400MB/s in HS400 mode). In terms of bus architecture, aside from the power and reset pins, the main signals used for data transmission are Clock, CMD, and the 8-bit Data lines.

eMMC Interface Signals

Clock: The synchronous clock signal for the bus.
Data Strobe: A clock signal output by the device, with the same frequency as the CLK signal, used to synchronize the data output from the device.
CMD: Carries the host's commands and the device's responses. Every operation must start with a command, which can only be sent from host to device and is transmitted serially. On receiving a command, the device replies with a response on the same CMD line; a response can only be sent from device to host and is also transmitted serially.
Data[7:0]: The 8-bit bus used for transmitting data.
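
As a rough illustration of how a command travels serially on the CMD line, the sketch below packs a 48-bit command token: a start bit, a transmission bit (1 for host to device), a 6-bit command index, a 32-bit argument, a CRC7, and an end bit. The function names `crc7` and `build_cmd_frame` are illustrative only and are not part of any driver API.

```python
def crc7(data: bytes) -> int:
    """CRC7 with generator polynomial x^7 + x^3 + 1, computed MSB first."""
    crc = 0
    for byte in data:
        for shift in range(7, -1, -1):
            in_bit = (byte >> shift) & 1
            top_bit = (crc >> 6) & 1
            crc = (crc << 1) & 0x7F
            if in_bit ^ top_bit:
                crc ^= 0x09
    return crc


def build_cmd_frame(index: int, argument: int) -> bytes:
    """Pack a 48-bit host command token (6 bytes, transmitted MSB first)."""
    if not 0 <= index <= 63:
        raise ValueError("command index must fit in 6 bits")
    if not 0 <= argument <= 0xFFFFFFFF:
        raise ValueError("argument must fit in 32 bits")
    # Start bit 0, transmission bit 1 (host to device), then the 6-bit index.
    head = bytes([0x40 | index]) + argument.to_bytes(4, "big")
    # CRC7 fills the upper 7 bits of the last byte; the end bit is always 1.
    return head + bytes([(crc7(head) << 1) | 1])


cmd0 = build_cmd_frame(0, 0x00000000)
print(cmd0.hex())  # prints 400000000095
```

The device's response uses the same line with the transmission bit set to 0, which is why host and device can never drive CMD at the same time.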

eMMC Mode

Introduction to eMMC Process

For common high-speed hardware interfaces such as DDR, USB, and PCIe, the general process is to first perform initialization negotiation, adapt to the corresponding transmission rate after negotiation, and finally conduct high-speed signal transmission. The eMMC interface is no exception; the eMMC protocol defines and constrains the entire signal transmission process. Next, we will briefly describe the entire eMMC initialization process.

Throughout initialization, the eMMC interface uses only the CLK, CMD, and RST signals. During initialization the CLK frequency is 400kHz; once the initialization negotiation is complete, CLK switches from 400kHz to 52MHz (DDR mode) or 200MHz (HS200 or HS400 mode).

Boot Mode

In boot mode, the device starts sending boot data to the host (usually the CPU) when, before CMD1 is issued, the host either holds the CMD line low or sends CMD0 with the argument 0xFFFFFFFA, or when a valid hardware reset is applied. The host then reads the data from the slave (the eMMC device). The data can be read from the boot area or the user area, depending on the register settings. The boot operation ends once all enabled boot data has been sent to the host. After the boot operation, the slave is ready for CMD1, and the host needs to send CMD1 to start the normal eMMC initialization routine.

The host can end boot mode by sending CMD0 (reset). If the host sends CMD0 (reset) in the middle of a data transfer, the slave must terminate the data transfer or the acknowledge pattern transmission within NST clock cycles (one data cycle plus one end bit cycle). The state diagram for eMMC boot mode is shown below and can be consulted during problem analysis.

Note:

From the protocol, it can be seen that either the host sending CMD0 to the slave or the host applying a hardware reset to the slave will cause the device to enter boot mode. Once boot mode is complete, the device proceeds to the next mode.
To enable hardware reset, the RST_n_ENABLE bits in the Extended CSD register must be configured; this only needs to be done once to take effect.
Since most CPUs or SoCs have dedicated eMMC interfaces, the host usually starts boot mode by sending CMD0, so there is little demand for starting boot mode through hardware reset. In addition, during the device's internal initialization routine after power-up, the device may not detect the RST_n signal, because it may not yet have loaded the RST_n_ENABLE bits from the Extended CSD register into the controller.
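
The CMD0 arguments involved in entering and leaving boot mode can be kept as named constants, which makes bus traces easier to read during debugging. The sketch below is a minimal lookup based on my reading of the eMMC specification; the helper name `describe_cmd0` is hypothetical, and arguments outside the three listed values are simply reported as unrecognized here.

```python
GO_IDLE_STATE = 0x00000000      # software reset to the idle state
GO_PRE_IDLE_STATE = 0xF0F0F0F0  # software reset to the pre-idle state
BOOT_INITIATION = 0xFFFFFFFA    # start the alternative boot operation

_CMD0_MEANING = {
    GO_IDLE_STATE: "GO_IDLE_STATE: reset, ends boot mode",
    GO_PRE_IDLE_STATE: "GO_PRE_IDLE_STATE: reset to pre-idle",
    BOOT_INITIATION: "BOOT_INITIATION: enter boot mode",
}


def describe_cmd0(argument: int) -> str:
    """Return a readable label for a CMD0 argument seen on the bus."""
    return _CMD0_MEANING.get(argument, "unrecognized CMD0 argument")


for arg in (0xFFFFFFFA, 0x00000000):
    print(f"0x{arg:08X} -> {describe_cmd0(arg)}")
```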

Device Identification Mode

After the eMMC device completes the boot mode, it will enter the device identification mode, where the main tasks include device reset, verifying the working voltage range, and the device identification process.

In device identification mode, the host will send CMD1 to the eMMC device for verification. Only when the eMMC device informs the host that it is ready (mainly reflected in the OCR register) will the next CMD command be sent; otherwise, the host will keep waiting for the eMMC device’s response. The following diagram explains the bit positions corresponding to the OCR register.

The busy bit in the CMD1 response lets the device tell the host that it is still working on its power-on/reset routine (for example, loading register information from the storage area). In this case, the host must repeat CMD1 until the busy bit is cleared. If the host does not receive a response from the eMMC device within 1 second, it reports an initialization failure.
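
The polling behavior described above can be sketched as a loop with a 1 second deadline. This is only a model, not driver code: `send_cmd1` is a hypothetical callable standing in for whatever the host controller driver provides, and the OCR constants reflect my understanding of the register layout (bit 31 reads 1 once the device has finished powering up, bits 30:29 select the access mode, and the remaining bits form the voltage window).

```python
import time

OCR_READY_BIT = 1 << 31          # reads 1 once the power-up routine is done
OCR_SECTOR_MODE = 0x2 << 29      # access mode field: sector addressing
OCR_VOLTAGE_WINDOW = 0x00FF8080  # 2.7-3.6 V range plus 1.70-1.95 V range


def wait_until_ready(send_cmd1, timeout_s=1.0, poll_interval_s=0.01,
                     clock=time.monotonic, sleep=time.sleep):
    """Repeat CMD1 until the device reports ready or the timeout expires."""
    request = OCR_SECTOR_MODE | OCR_VOLTAGE_WINDOW
    deadline = clock() + timeout_s
    while True:
        ocr = send_cmd1(request)  # hypothetical: returns the OCR, or None
        if ocr is not None and ocr & OCR_READY_BIT:
            return ocr
        if clock() >= deadline:
            raise TimeoutError("eMMC initialization failed: device not ready")
        sleep(poll_interval_s)


# Simulated device that is busy for two polls and then reports ready.
responses = iter([0x00FF8080, 0x00FF8080, 0xC0FF8080])
ocr = wait_until_ready(lambda arg: next(responses), poll_interval_s=0.0)
print(hex(ocr))
```

Passing `clock` and `sleep` as parameters is a design choice that keeps the loop testable without real delays.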

Data Transfer Mode

This stage mainly completes eMMC rate adaptation. For example, the host sends SEND_CSD (CMD9) to obtain device-specific data (the CSD register), such as block length, device storage capacity, and maximum clock rate.
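
To make the CSD contents more concrete, the sketch below extracts two fields from a 128-bit CSD value held as a Python integer. The bit positions (TRAN_SPEED at [103:96], READ_BL_LEN at [83:80]) and the multiplier table follow my reading of the JEDEC CSD layout and should be checked against the device datasheet; the function names are illustrative. Capacity is left out because devices larger than 2GB report it through the Extended CSD rather than the CSD.

```python
# Frequency unit selected by TRAN_SPEED bits [2:0]; values 4..7 are reserved.
TRAN_SPEED_UNIT_HZ = (100_000, 1_000_000, 10_000_000, 100_000_000)
# Multiplier selected by TRAN_SPEED bits [6:3], stored times ten.
TRAN_SPEED_MULT_X10 = (0, 10, 12, 13, 15, 20, 26, 30,
                       35, 40, 45, 52, 55, 60, 70, 80)


def csd_field(csd: int, msb: int, lsb: int) -> int:
    """Extract bits [msb:lsb] from the 128-bit CSD value."""
    width = msb - lsb + 1
    return (csd >> lsb) & ((1 << width) - 1)


def decode_csd(csd: int) -> dict:
    """Decode maximum clock rate and read block length (assumed layout)."""
    tran_speed = csd_field(csd, 103, 96)
    unit_index = tran_speed & 0x7
    if unit_index >= len(TRAN_SPEED_UNIT_HZ):
        raise ValueError("reserved TRAN_SPEED frequency unit")
    mult_x10 = TRAN_SPEED_MULT_X10[(tran_speed >> 3) & 0xF]
    return {
        "max_clock_hz": TRAN_SPEED_UNIT_HZ[unit_index] * mult_x10 // 10,
        "read_block_len": 1 << csd_field(csd, 83, 80),
    }


# Example value: TRAN_SPEED = 0x32 and READ_BL_LEN = 9, other fields zero.
example_csd = (0x32 << 96) | (9 << 80)
print(decode_csd(example_csd))
```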

The protocol defines several common transfer modes: “high speed” runs between 26MHz and 52MHz, “HS200” runs above 52MHz up to a maximum of 200MHz, and “HS400” also runs at up to 200MHz.

Note:

HS400 differs from HS200 in that HS400 transfers data on both clock edges (dual data rate) and supports only the 8-bit bus width.
When using HS200 or HS400 mode, the VCCQ voltage must be set to 1.8V or 1.2V for the mode to take effect.
Additionally, in HS200 and HS400 modes the I/O driver strength is configurable; if it is not configured, the default value 0x0 applies.
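
The relationship between clock rate, bus width, and edge mode in the modes above can be checked with simple arithmetic. The sketch below computes the theoretical peak bus throughput only; it ignores command overhead and NAND speed, and the function name is illustrative.

```python
def peak_throughput_mb_s(clock_mhz: float, bus_width_bits: int,
                         dual_data_rate: bool) -> float:
    """Theoretical peak bus throughput in MB/s."""
    transfers_per_clock = 2 if dual_data_rate else 1
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


modes = {
    "High speed (52MHz, single edge)": (52, 8, False),
    "DDR (52MHz, dual edge)": (52, 8, True),
    "HS200 (200MHz, single edge)": (200, 8, False),
    "HS400 (200MHz, dual edge)": (200, 8, True),
}
for name, (clock, width, ddr) in modes.items():
    print(f"{name}: {peak_throughput_mb_s(clock, width, ddr):.0f} MB/s")
```

HS200 and HS400 share the same 200MHz clock; HS400 doubles the throughput purely by using both clock edges.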

eMMC Pin Configuration

Power Voltage

In eMMC, VCC supplies the memory device, and VCCQ supplies the controller and the eMMC interface. Either VCC or VCCQ can be used as the memory interface voltage. The internal regulator is optional (by default no handling is needed) and is required only when the internal core logic voltage is regulated from VCCQ.

Data and Clock Pins

During initialization the CMD line operates in open-drain mode, and the data lines also need a defined level when they are not driven, so external pull-ups must be provided for these pins in the schematic design. Additionally, since the data lines are bidirectional during high-speed transmission, signal monotonicity should be considered, and series resistor footprints should be reserved at both the host controller end and the eMMC device end for later debugging.

Wiring Requirements

The PCB trace impedance is controlled at 50Ω.
The source end should be series-terminated with a resistor of 22 to 33Ω; the final value should be confirmed through simulation and debugging.
The data trace length should not exceed 5000mil, and the traces should be length-matched, especially in HS200 or HS400 mode.
Keep a complete reference ground plane; traces must not cross plane splits.
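
To see why length matching matters more in HS200 and HS400, it helps to compare trace delay with the data unit interval. The sketch below is a back-of-the-envelope estimate only: the propagation delay of 170 ps per inch is an assumed value for FR-4 and must be replaced with the figure from the actual stack-up, and all function names are illustrative.

```python
ASSUMED_DELAY_PS_PER_INCH = 170.0  # assumption; depends on the PCB stack-up


def flight_time_ps(length_mil: float,
                   delay_ps_per_inch: float = ASSUMED_DELAY_PS_PER_INCH) -> float:
    """Propagation delay of a trace of the given length (1 inch = 1000 mil)."""
    return length_mil / 1000.0 * delay_ps_per_inch


def unit_interval_ps(clock_mhz: float, dual_data_rate: bool) -> float:
    """Duration of one data bit on a data line."""
    period_ps = 1_000_000.0 / clock_mhz
    return period_ps / 2 if dual_data_rate else period_ps


def skew_fraction(mismatch_mil: float, clock_mhz: float,
                  dual_data_rate: bool) -> float:
    """Share of the unit interval consumed by a length mismatch."""
    return flight_time_ps(mismatch_mil) / unit_interval_ps(clock_mhz, dual_data_rate)


print(f"5000 mil trace delay: {flight_time_ps(5000):.0f} ps")
print(f"HS200 unit interval:  {unit_interval_ps(200, False):.0f} ps")
print(f"HS400 unit interval:  {unit_interval_ps(200, True):.0f} ps")
print(f"100 mil mismatch in HS400: {skew_fraction(100, 200, True):.2%} of the UI")
```

Under these assumptions the same mismatch costs twice as much of the timing budget in HS400 as in HS200, because the unit interval is halved.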

Typical Problem Analysis

File System Corruption

eMMC chips are mostly used to store file systems or to back up data, so designers expect them to be highly stable and not to lose data. How can the stability of the data stored in the eMMC chip be ensured, especially when a sudden hardware power failure occurs while the eMMC chip and the CPU are still performing read and write operations? What happens to the data that is in flight at the moment of power loss?

There are two common solutions to this problem: the first is a software approach, configuring fsck repair; the second is a hardware approach, configuring reliable writing.

Brief Introduction to fsck Repair Principle

fsck (File System Check) is a tool for checking and repairing file system errors. It is widely used in Linux and other Unix-like operating systems to detect and repair logical and physical errors so that the file system stays consistent and intact.
How fsck works in general: it mainly checks the consistency of superblocks, inodes, and data blocks. The file system writes data in a sequence of steps; if a power failure occurs partway through, the file system may become inconsistent. fsck uses the existing file system data to recover the parts lost during the power failure.
Overall, the goal of fsck is to ensure the integrity and availability of the file system by detecting and repairing errors within it. It does so by analyzing the structure of the file system, repairing damaged data structures, recovering lost data, and completing unfinished operations. This keeps the file system operating normally and provides reliable data storage and access.
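
One of the consistency checks described above can be illustrated with a toy model: compare the blocks that inodes claim to own against the allocation bitmap. This is a simplified sketch of the idea, not how any real fsck is implemented; the data structures and the function name `check_block_bitmap` are made up for the example.

```python
def check_block_bitmap(inodes: dict, bitmap: set) -> dict:
    """Cross-check inode block lists against the set of blocks marked in use.

    inodes maps an inode number to the list of block numbers it references;
    bitmap is the set of block numbers the allocation bitmap marks as used.
    """
    referenced = set()
    shared = set()
    for blocks in inodes.values():
        for block in blocks:
            if block in referenced:
                shared.add(block)  # claimed by more than one reference
            referenced.add(block)
    return {
        "leaked": sorted(bitmap - referenced),    # marked used, owned by nobody
        "unmarked": sorted(referenced - bitmap),  # in use but marked free
        "shared": sorted(shared),
    }


# A power failure left block 12 unmarked and block 13 orphaned.
report = check_block_bitmap({1: [10, 11], 2: [11, 12]}, {10, 11, 13})
print(report)
```

A repair step would then free the leaked blocks, mark the unmarked ones, and resolve the shared ones, which is the kind of work fsck performs on the real on-disk structures.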

What is Reliable Writing?

The principle of reliable writing is that the firmware guarantees that the new and old data both exist at the same time. If the write succeeds, the new data takes effect; if it fails, the old data is retained.
Reliable writing exists because of the paired page issue in NAND Flash: if a power failure occurs during a write operation, it is very likely to corrupt previously written data.
The core of reliable writing is that until new data has been successfully written to a logical address, the old data at that logical address remains unchanged. This makes data recovery possible after a power failure.
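
The "old data stays valid until the new data is committed" idea can be modeled in a few lines. The class below is a toy model for illustration, not actual eMMC firmware: the names `ReliableStore` and `PowerLoss` and the `fail_before_commit` switch are invented to simulate a power failure at the critical moment.

```python
class PowerLoss(Exception):
    """Simulated power failure in the middle of a write."""


class ReliableStore:
    def __init__(self):
        self.pages = {}      # physical page number -> data
        self.mapping = {}    # logical address -> physical page number
        self._next_free = 0

    def read(self, logical: int) -> bytes:
        return self.pages[self.mapping[logical]]

    def write(self, logical: int, data: bytes,
              fail_before_commit: bool = False) -> None:
        # Step 1: program the new data into a fresh physical page.
        new_page = self._next_free
        self._next_free += 1
        self.pages[new_page] = data
        if fail_before_commit:
            raise PowerLoss("power lost before the mapping was updated")
        # Step 2: commit by pointing the logical address at the new page.
        self.mapping[logical] = new_page


store = ReliableStore()
store.write(0, b"old")
try:
    store.write(0, b"new", fail_before_commit=True)
except PowerLoss:
    pass
print(store.read(0))  # the old data is still intact
```

Because the mapping update is the only step that changes what a reader sees, an interrupted write leaves the logical address pointing at the old, complete data.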

Brief Explanation of the Paired Page Issue

This refers to the situation where a cell's low page (LSB) has already been programmed and a power failure occurs while the high page (MSB) is being programmed: the data already written successfully to the LSB page is corrupted as well (the following figure, cited from an IEEE journal, is an explanation of the paired page issue by experts from Samsung).

It should be noted that for flash memory chips like NAND Flash, the minimum I/O unit for read and write operations is a page, usually 4KB or 8KB, while the erase operation is performed on a block consisting of several pages. In addition, a page cannot be overwritten until its block has been erased; this characteristic is known as the erase-before-write constraint. Therefore, flash memory does not allow in-place updates and requires a logical-to-physical address mapping scheme.
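
The erase-before-write constraint and the resulting out-of-place update can be shown with a small model. This is a deliberately simplified sketch: the `NandBlock` class and the `mapping` dictionary are illustrative stand-ins, not a real flash translation layer.

```python
class NandBlock:
    """Toy NAND block: a page is programmed once, then only a block erase frees it."""

    def __init__(self, page_count: int = 4):
        self.pages = [None] * page_count   # None means the page is erased

    def program(self, page: int, data: bytes) -> None:
        if self.pages[page] is not None:
            raise RuntimeError("page already programmed; erase the block first")
        self.pages[page] = data

    def erase(self) -> None:
        # Erase works on the whole block, never on a single page.
        self.pages = [None] * len(self.pages)


block = NandBlock()
mapping = {}                     # logical page -> physical page

block.program(0, b"v1")
mapping[7] = 0                   # logical page 7 lives in physical page 0

# Updating logical page 7: the old page cannot be overwritten in place,
# so the new version goes to a free page and the mapping is redirected.
block.program(1, b"v2")
mapping[7] = 1
print(block.pages[mapping[7]])
```

Physical page 0 still holds the stale copy until the block is erased, which is why the controller needs garbage collection in addition to address mapping.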

NAND flash can be divided into two types: Single-Level Cell (SLC) and Multi-Level Cell (MLC). SLC flash stores one bit per cell, while MLC flash stores two or more bits. In 2-bit MLC flash, the cells on a wordline hold two paired pages, which are programmed in two separate passes: the Least Significant Bit (LSB) page and the Most Significant Bit (MSB) page.

Note:

Because of the paired page issue in NAND Flash, a sudden power failure during a write may corrupt previously written data. Therefore, before each write, the previous data is first saved to a backup block; if the paired page issue occurs, the previous data can be restored from the backup block.
Because the minimum unit for read and write operations in NAND Flash is a page, this behavior is consistent with the theory: a power failure while the high page (MSB) is being programmed also corrupts the data already written to the paired low page (LSB).

Explanation of Reliable Writing in the Protocol

Most common eMMC chips, such as those from Samsung and Micron, have the reliable writing feature enabled by default. However, the domestic eMMC chip I used did not have it enabled by default, so designers need to evaluate this during chip selection. The following figure shows the description of the configuration register for the reliable writing feature in the eMMC protocol specification.

Estimated Lifespan

Due to the underlying structure of eMMC, the number of erase/write cycles is limited. Flash memory is generally divided into three types: TLC, MLC, and SLC. TLC (Triple-Level Cell) stores 3 bits per cell and is characterized by slow speed and short lifespan, with about 1000 erase/write cycles. MLC (Multi-Level Cell) stores 2 bits per cell and has average speed and lifespan, with about 1000-3000 erase/write cycles. SLC (Single-Level Cell) stores 1 bit per cell and has good speed and lifespan, with about 100,000 erase/write cycles.

Currently, most commonly used eMMC chips are TLC or MLC. Lifespan estimation for eMMC chips assumes that wear leveling algorithms spread usage evenly across blocks, so that the same physical pages and blocks are not used over and over.
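
Under the assumption of ideal wear leveling, a rough lifespan estimate follows from capacity, rated erase/write cycles, and the daily write volume. The sketch below is a first-order estimate only: the write amplification factor of 2.0 is a placeholder assumption that depends heavily on the workload and the controller, and the function name is illustrative.

```python
def estimated_lifetime_years(capacity_gb: float, pe_cycles: int,
                             daily_writes_gb: float,
                             write_amplification: float = 2.0) -> float:
    """First-order lifetime estimate, assuming perfectly even wear."""
    # Total host data the device can absorb before the cells wear out.
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / daily_writes_gb / 365


# Example: 8GB MLC device rated for 3000 cycles, host writes 10GB per day.
years = estimated_lifetime_years(8, 3000, 10)
print(f"{years:.1f} years")
```

Doubling the daily write volume or the write amplification halves the estimate, which is why logging-heavy products wear out eMMC much faster than the nominal figures suggest.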

Additional Explanation

What happens when the eMMC chip reaches the end of its lifespan?

First situation: when the eMMC register ecsd[268] shows that the nominal lifespan has been reached, data can still be written, but data stability is at risk. Over time, occasional read/write timeouts may also occur (the eMMC protocol specifies read/write time limits), although the device can still boot normally after a power cycle.

Second situation: as time goes on, data may fail to be written, and read data may become uncorrectable (for example, a large number of bit flips that the controller's internal firmware algorithm cannot correct).
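
The lifespan byte at ecsd[268] mentioned above can be turned into a readable status. The sketch below follows my understanding of the encoding in the eMMC specification (each step from 0x01 to 0x0A covers 10% of the estimated life, and 0x0B means the estimate has been exceeded); the function name is hypothetical, and how the byte is read from the device is outside the scope of the example.

```python
def describe_life_time_est(value: int) -> str:
    """Decode the device life time estimation byte read from ecsd[268]."""
    if value == 0x00:
        return "not defined"
    if 0x01 <= value <= 0x0A:
        upper = value * 10
        return f"{upper - 10}%-{upper}% of estimated device life used"
    if value == 0x0B:
        return "exceeded maximum estimated device life"
    return "reserved"


for raw in (0x01, 0x05, 0x0A, 0x0B):
    print(f"0x{raw:02X}: {describe_life_time_est(raw)}")
```

A monitoring routine could log this value periodically and raise a warning well before 0x0B is reached, since the first situation described above begins once the nominal lifespan is used up.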

Why do some manufacturers not configure reliable writing by default?

When reliable writing is configured, it means that each time data is written, the current data will be stored in a backup block first. However, due to the poor read/write performance of some manufacturers’ backup blocks, it may affect the overall read/write speed. When reliable writing is enabled, the read/write speed may decrease by about 20% compared to when it is not enabled.
