The Evolution of the SPI Protocol: From Single Lane to Full Interchange

Unveiling the Speed Leap of Communication Interfaces in Embedded Development

Chapter 1: Country Road – Standard SPI

Imagine a quiet country road with a single lane in each direction.

This is Standard SPI, which forms the foundation of all advanced modes. It primarily relies on four lines for communication:

• SCLK (Clock Line): Like a metronome, generated by the master device to synchronize all data.

• CS (Chip Select Line): Like a doorbell, the master device “calls” the slave device by pulling it low.

• MOSI (Master Out Slave In): The channel through which the master device sends data to the slave device.

• MISO (Master In Slave Out): The channel through which the slave device sends data to the master device.

How does it work?

On each clock pulse, one bit moves on MOSI and one bit moves on MISO at the same time, so the link is full duplex. It’s like a road where, in each time unit, one car is allowed to go from A to B while another goes from B to A.

Features and limitations: The protocol is simple and reliable, but throughput is capped at one bit per clock in each direction. When a large amount of data needs to be transmitted (such as reading from a memory chip), it becomes a “congested road.”
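To make the simultaneous exchange concrete, here is a minimal simulation sketch. It is not driver code: the function name `spi_transfer_byte` is made up for illustration, and it assumes MSB-first bit order, which is common but not universal.

```python
def spi_transfer_byte(master_out: int, slave_out: int) -> tuple[int, int]:
    """Simulate one full-duplex byte exchange, MSB first (an assumed bit order).

    Returns (byte received by the master, byte received by the slave).
    """
    master_in = 0
    slave_in = 0
    for bit in range(7, -1, -1):          # one loop iteration = one SCLK pulse
        mosi = (master_out >> bit) & 1    # master drives MOSI
        miso = (slave_out >> bit) & 1     # slave drives MISO during the same pulse
        slave_in = (slave_in << 1) | mosi
        master_in = (master_in << 1) | miso
    return master_in, slave_in


master_rx, slave_rx = spi_transfer_byte(0xA5, 0x3C)
print(hex(master_rx), hex(slave_rx))  # 0x3c 0xa5
```

After eight clock pulses the two sides have swapped one byte each, which is exactly the "one car each way per time unit" picture.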

Chapter 2: Tidal Lane – Dual SPI

As data volume increased, engineers came up with a brilliant idea: why not turn both existing unidirectional data lines into bidirectional lanes?

Thus, Dual SPI was born! This is a classic case of “building a stage inside a snail shell”: achieving a lot within very tight constraints.

• Core Change: MOSI and MISO transform into universal bidirectional data lines IO0 and IO1.

• Working Method: During a data read, IO0 and IO1 together carry 2 bits per clock cycle, doubling the data-phase throughput at the same clock frequency!

Each clock cycle, IO0 and IO1 output 2 bits in parallel.

The brilliance lies in not adding any physical pins: changing the protocol rules alone doubles the read throughput. The trade-off is that the data phase becomes half duplex, since both lines now carry data in the same direction. It’s like a “tidal lane” in traffic management, flexibly allocating road resources to cope with peak traffic.
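The following sketch shows how one byte could be spread across the data lines. The helper `split_across_lanes` is hypothetical, and it assumes the common convention that the higher-numbered line carries the more significant bit; check the device data sheet for the real ordering.

```python
def split_across_lanes(byte: int, lanes: int) -> list[list[int]]:
    """Return the line levels for one byte, most significant bits first.

    Each inner list holds the levels of IO0..IO(lanes-1) during one clock.
    Assumes the higher-numbered line carries the more significant bit.
    """
    if lanes not in (1, 2, 4):
        raise ValueError("lanes must be 1, 2, or 4")
    clocks = []
    for shift in range(8 - lanes, -1, -lanes):
        group = (byte >> shift) & ((1 << lanes) - 1)   # bits sent in this clock
        clocks.append([(group >> lane) & 1 for lane in range(lanes)])
    return clocks


# 0xB4 = 0b10110100 sent over two lines: [IO0, IO1] per clock
print(split_across_lanes(0xB4, 2))  # [[0, 1], [1, 1], [1, 0], [0, 0]]
```

With two lines the byte takes 4 clocks instead of 8, which is where the doubling comes from.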

Chapter 3: Four-Lane Highway – Quad SPI

While dual lanes are good, applications that need to load large amounts of code or data almost instantly (such as firmware executed directly from SPI Flash) need a wider road.

Thus Quad SPI, the “four-lane highway,” was born.

• Core Change: The number of data lines increases from 2 to 4 (IO0, IO1, IO2, IO3).

• Working Method: Each clock cycle can transmit 4 bits of data! At the same clock frequency, the data phase is 4 times as fast as standard SPI.

Where do the pins come from? At this point, some pins originally used for other auxiliary functions on the chip will be repurposed. For example, the HOLD# (Hold) and WP# (Write Protect) pins on SPI Flash will elegantly transform into data lines IO3 and IO2 in Quad SPI mode. This also answers a previous question about the FSPIHD pin on the ESP32: in Quad mode, it is used as IO3 (DATA3).

Four lines in parallel, throughput surges.
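A quick back-of-the-envelope calculation shows what the extra lines buy. The 1 MiB image size and 40 MHz clock below are assumed values chosen only for illustration, and the estimate covers the data phase alone (no command, address, or dummy cycles).

```python
def data_phase_seconds(num_bytes: int, clock_hz: float, lanes: int) -> float:
    """Time spent clocking out the data only, ignoring command/address overhead."""
    clocks = num_bytes * 8 / lanes
    return clocks / clock_hz


IMAGE_BYTES = 1024 * 1024   # assumed: 1 MiB firmware image
CLOCK_HZ = 40_000_000       # assumed: 40 MHz SCLK

for name, lanes in (("Standard", 1), ("Dual", 2), ("Quad", 4)):
    ms = data_phase_seconds(IMAGE_BYTES, CLOCK_HZ, lanes) * 1000
    print(f"{name:8s} {ms:6.1f} ms")
```

Under these assumptions the read takes roughly 210 ms on one line, 105 ms on two, and 52 ms on four.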

Application Scenario: Most external SPI Flash chips in modern embedded devices support Quad SPI, so that the CPU can fetch instructions and data quickly.

Chapter 4: Full Interchange – QPI (Quad Peripheral Interface)

Quad SPI is already fast, but attentive engineers discovered a bottleneck: although data travels on “four lanes,” the command and address at the start of each transaction may still go out on a “single lane”!

This is like a sports car driving on a highway but waiting in line at the toll booth.

QPI was born to solve this problem; it is the ultimate form of Quad SPI.

• Core Difference:

◦   Quad SPI: Typically “four lines for the data phase,” while the command is still sent on a single line, and so is the address for many read commands (some Quad I/O reads also send the address on four lines).

◦   QPI: Is “four lines for all phases,” where all communication including commands, addresses, and data is completed through 4 data lines.

Workflow Comparison:

Quad SPI: [Single Line Command] -> [Single Line Address] -> … -> [Four Line Data] -> …

QPI: [Four Line Command] -> [Four Line Address] -> … -> [Four Line Data] -> …

QPI mode removes the overhead of single-line command and address phases, achieving a free-flowing “multi-level interchange” with peak efficiency. The host typically has to send a dedicated switching command to move the device from Quad SPI into QPI mode.
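The difference is easiest to see by counting clocks for one read transaction. The sketch below assumes an 8-bit command, a 24-bit address, and 8 dummy clocks; the dummy-clock count in particular varies by device and mode, so treat these numbers as placeholders. The function `read_clocks` is a hypothetical helper, not part of any library.

```python
def read_clocks(data_bytes: int, cmd_lanes: int, addr_lanes: int, data_lanes: int,
                addr_bits: int = 24, dummy_clocks: int = 8) -> int:
    """Count SCLK cycles for one read: command + address + dummy + data.

    addr_bits and dummy_clocks are assumed defaults; real devices differ.
    """
    cmd = 8 // cmd_lanes
    addr = addr_bits // addr_lanes
    data = data_bytes * 8 // data_lanes
    return cmd + addr + dummy_clocks + data


# Short 4-byte read: Quad SPI (1-1-4 lanes) vs QPI (4-4-4 lanes)
print(read_clocks(4, 1, 1, 4), read_clocks(4, 4, 4, 4))      # 48 24
# Long 256-byte read: the fixed overhead matters much less
print(read_clocks(256, 1, 1, 4), read_clocks(256, 4, 4, 4))  # 552 528
```

Under these assumptions QPI halves the cost of a short read but saves only a few percent on a long one, so the benefit is greatest for workloads with many small, scattered accesses.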

Summary and Comparison

To better understand this evolution, let’s summarize it in a table:

Mode	Data Lines	Bits per Clock	Core Idea	Analogy
SPI	2 (MOSI, MISO)	1 bit per direction	Basic, Full Duplex	Two-Way Single Lane Road
Dual SPI	2 (IO0, IO1)	2 bits	Pin Multiplexing, Efficiency Doubled	Tidal Dual Lane
Quad SPI	4 (IO0-IO3)	4 bits (data phase)	Resource Increase, Performance Leap	Four-Lane Highway
QPI	4 (IO0-IO3)	4 bits (all phases)	Full Process Optimization, Ultimate Efficiency	Full Interchange Without Traffic Lights

Practical Advice for Developers

• Compatibility: Advanced modes are usually backward compatible. Devices typically power up in standard SPI mode, and the host must send a specific command sequence to switch modes.

• Hardware Confirmation: During design, consult the data sheets of both the master MCU and the slave devices to confirm that they support Dual/Quad SPI and QPI.

• Pin Planning: When using Quad SPI, pay attention to pin multiplexing (such as HOLD# and WP#) and plan for it in the hardware design to avoid conflicts. On the ESP32, the FSPI pins are connected to the internal Flash by default, so remap them with caution.

• Protocol Understanding: Understanding the differences between these modes will help you choose the most cost-effective storage chips or peripherals for your project.
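As an illustration of the mode-switch sequence mentioned under Compatibility, here is a rough sketch run against a fake device. The opcodes and the Quad Enable bit position follow what some Winbond W25Q-series data sheets describe, which is an assumption here; other vendors and even other W25Q variants differ. `FakeFlash` and `enter_qpi` are hypothetical names, not a real driver API.

```python
# Opcodes assumed from some Winbond W25Q-series parts; verify against your data sheet.
WRITE_ENABLE = 0x06
READ_STATUS_2 = 0x35
WRITE_STATUS_2 = 0x31
ENTER_QPI = 0x38
QE_BIT = 0x02  # Quad Enable, assumed to be bit 1 of Status Register-2


class FakeFlash:
    """Hypothetical stand-in for a real SPI driver; it only tracks state."""

    def __init__(self):
        self.status2 = 0x00
        self.write_enabled = False
        self.qpi = False

    def command(self, opcode, payload=b""):
        if opcode == WRITE_ENABLE:
            self.write_enabled = True
        elif opcode == READ_STATUS_2:
            return bytes([self.status2])
        elif opcode == WRITE_STATUS_2 and self.write_enabled:
            self.status2 = payload[0]
            self.write_enabled = False
        elif opcode == ENTER_QPI and self.status2 & QE_BIT:
            self.qpi = True   # the fake only accepts this once QE is set
        return b""


def enter_qpi(flash):
    """Set QE if needed, then request QPI mode. Returns True on success."""
    status2 = flash.command(READ_STATUS_2)[0]
    if not status2 & QE_BIT:
        flash.command(WRITE_ENABLE)
        flash.command(WRITE_STATUS_2, bytes([status2 | QE_BIT]))
        # A real driver would poll the busy flag here before continuing.
    flash.command(ENTER_QPI)
    return flash.qpi          # mock-only attribute, used to check the result


flash = FakeFlash()
print(enter_qpi(flash))  # True
```

The point of the sketch is the ordering: the device starts in standard SPI, Quad Enable is set first, and only then is the switch to QPI requested.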

The evolution from SPI to QPI is a microcosm of embedded systems’ continuous pursuit of higher performance under resource constraints. I hope this article helps you better understand these concepts and achieve outstanding designs in your next project!