Virtualization Features of PCIe Network Cards: ARI and ACS

1 ARI (Alternative Routing-ID Interpretation)

1.1 Technical Background

To understand ARI, it helps to first look at the limitation of the traditional PCIe routing ID. Every PCIe request (such as a Memory Read or Memory Write) carries a 16-bit Requester ID made up of three fields:

Bus Number (8 bits), Device Number (5 bits), Function Number (3 bits)

In traditional mode, a PCIe device (one Device Number) can have a maximum of 8 functions because the Function Number is only 3 bits (2^3 = 8). This is known as the “one device, eight functions” limitation.
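The field layout above can be illustrated with a short Python sketch. The helper names `pack_bdf` and `unpack_bdf` are made up for this example; only the 8/5/3-bit split comes from the text.

```python
from typing import Tuple


def pack_bdf(bus: int, device: int, function: int) -> int:
    """Pack Bus/Device/Function into a 16-bit traditional Requester ID."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("bus is 8 bits, device is 5 bits, function is 3 bits")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(requester_id: int) -> Tuple[int, int, int]:
    """Split a 16-bit Requester ID into (bus, device, function)."""
    bus = (requester_id >> 8) & 0xFF
    device = (requester_id >> 3) & 0x1F
    function = requester_id & 0x7
    return bus, device, function


# Bus 0x3b, device 0, function 7 is the highest function a single
# traditional device can expose.
rid = pack_bdf(0x3B, 0, 7)
print(hex(rid), unpack_bdf(rid))
```

Asking for function 8 on the same device raises `ValueError` here, which is the "one device, eight functions" limit in miniature.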

As technology has advanced, highly integrated multifunction devices have emerged (for example, a network card controller may integrate multiple network ports, management functions, shared memory, etc.), and 8 functions are no longer sufficient. ARI was designed to break through this “8-function” limitation.

1.2 Core Mechanism of ARI

ARI addresses this issue in a very clever and backward-compatible way:

It reinterprets the meaning of the “Device Number” field in the Requester ID and Completer ID.

Traditional Mode:

Device Number: 5 bits (can represent devices 0-31)

Function Number: 3 bits (can represent functions 0-7)

ARI Mode:

It merges the 5-bit Device Number and the 3-bit Function Number into a single 8-bit Function Number field, which can represent functions 0-255. In ARI mode the concept of a Device Number is dropped, and the Device Number is implicitly treated as 0. In simple terms, ARI increases the maximum number of functions for a device from 8 (2^3) to 256 (2^8).
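To make the reinterpretation concrete, the following Python sketch decodes the same 16-bit routing ID in both modes. The function names are illustrative only. The point is that the bits do not change, only how the low 8 bits are read.

```python
from typing import Tuple


def unpack_traditional(routing_id: int) -> Tuple[int, int, int]:
    """Traditional view: (bus, 5-bit device, 3-bit function)."""
    return (routing_id >> 8) & 0xFF, (routing_id >> 3) & 0x1F, routing_id & 0x7


def unpack_ari(routing_id: int) -> Tuple[int, int]:
    """ARI view: (bus, 8-bit function); the device number is implicitly 0."""
    return (routing_id >> 8) & 0xFF, routing_id & 0xFF


rid = 0x3B12
print(unpack_traditional(rid))  # bus 0x3b, device 2, function 2
print(unpack_ari(rid))          # bus 0x3b, function 0x12
```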

1.3 How ARI Works and Configuration

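1) Capability Declaration: A device that supports ARI advertises this through the ARI Extended Capability structure in its PCIe configuration space, indicating “I support ARI”. A Root Port or Switch Downstream Port advertises its side of the feature through the ARI Forwarding Supported bit in its Device Capabilities 2 register.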

2) Enabling and Control: How ARI is enabled depends on where the component sits in the topology.

For endpoint devices: ARI only takes effect if the Root Port or Switch Downstream Port directly above the device supports ARI Forwarding and has it enabled (the ARI Forwarding Enable bit in the Device Control 2 register). Once ARI Forwarding is enabled, that port will “understand” and allow ARI-formatted IDs for the device below it.

For Switches: ARI Forwarding is configured independently on each Downstream Port of the Switch.

A Switch that supports ARI can therefore have traditional devices below some Downstream Ports and ARI devices below others, and route traffic for both correctly.

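3) ID Interpretation at the Port

ARI does not change the 16 bits carried in a packet, only how the low 8 bits are interpreted, so a Switch does not need to translate IDs between ARI and traditional devices. For example, Function Number 0x12 of an ARI device occupies exactly the same bits that a traditional device would read as Device 2, Function 2. The practical difference is at the port directly above the device: without ARI Forwarding, a Downstream Port only delivers configuration requests addressed to Device 0 to the device on its link, which limits that device to 8 functions. With ARI Forwarding enabled, this restriction is lifted and all 256 Function Numbers become reachable.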
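The capability declaration and enable steps above can be checked by parsing a raw copy of a function's configuration space. The sketch below is a minimal Python illustration under these assumptions: the register offsets and IDs used (ARI Extended Capability ID 0x000E, Device Capabilities 2 at offset 0x24 and Device Control 2 at offset 0x28 within the PCI Express capability, ARI bit 5 in both) reflect my reading of the PCIe specification and the Linux `pci_regs.h` header, and the helper names are invented for this example. The configuration space at the bottom is synthetic, not data from real hardware.

```python
import struct
from typing import Optional, Tuple

PCI_CAPABILITY_LIST = 0x34    # pointer to the first standard capability
PCI_CAP_ID_EXP = 0x10         # PCI Express capability ID
PCI_EXP_DEVCAP2 = 0x24        # Device Capabilities 2, relative to the PCIe capability
PCI_EXP_DEVCTL2 = 0x28        # Device Control 2, relative to the PCIe capability
ARI_BIT = 1 << 5              # ARI Forwarding Supported / ARI Forwarding Enable
PCI_EXT_CAP_START = 0x100     # extended capabilities start here
PCI_EXT_CAP_ID_ARI = 0x000E   # ARI Extended Capability ID


def find_capability(config: bytes, cap_id: int) -> Optional[int]:
    """Walk the standard capability list (status-bit check omitted for brevity)."""
    offset = config[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()
    while offset and offset not in seen:
        seen.add(offset)
        if config[offset] == cap_id:
            return offset
        offset = config[offset + 1] & 0xFC
    return None


def find_ext_capability(config: bytes, cap_id: int) -> Optional[int]:
    """Walk the extended capability list; return the offset of cap_id or None."""
    offset = PCI_EXT_CAP_START
    seen = set()
    while offset >= PCI_EXT_CAP_START and offset not in seen and offset + 4 <= len(config):
        seen.add(offset)
        (header,) = struct.unpack_from("<I", config, offset)
        if header in (0, 0xFFFFFFFF):
            return None
        if header & 0xFFFF == cap_id:
            return offset
        offset = (header >> 20) & 0xFFC   # next-capability pointer, bits 31:20
    return None


def device_has_ari(config: bytes) -> bool:
    """True if the function exposes the ARI Extended Capability."""
    return find_ext_capability(config, PCI_EXT_CAP_ID_ARI) is not None


def port_ari_forwarding(config: bytes) -> Tuple[bool, bool]:
    """Return (ARI Forwarding Supported, ARI Forwarding Enable) for a port."""
    pcie = find_capability(config, PCI_CAP_ID_EXP)
    if pcie is None:
        return False, False
    (devcap2,) = struct.unpack_from("<I", config, pcie + PCI_EXP_DEVCAP2)
    (devctl2,) = struct.unpack_from("<H", config, pcie + PCI_EXP_DEVCTL2)
    return bool(devcap2 & ARI_BIT), bool(devctl2 & ARI_BIT)


# Synthetic 4 KiB configuration space, for illustration only. A real system
# would have the port bits and the ARI capability in different functions.
cfg = bytearray(4096)
cfg[PCI_CAPABILITY_LIST] = 0x40
cfg[0x40] = PCI_CAP_ID_EXP                                  # next pointer stays 0
struct.pack_into("<I", cfg, 0x40 + PCI_EXP_DEVCAP2, ARI_BIT)
struct.pack_into("<H", cfg, 0x40 + PCI_EXP_DEVCTL2, ARI_BIT)
struct.pack_into("<I", cfg, 0x100, 0x0001 | (1 << 16) | (0x150 << 20))  # AER, next 0x150
struct.pack_into("<I", cfg, 0x150, PCI_EXT_CAP_ID_ARI | (1 << 16))      # ARI, end of list
print(device_has_ari(cfg), port_ari_forwarding(cfg))
```

On Linux, a comparable byte buffer can usually be obtained by reading the `config` file under `/sys/bus/pci/devices/` for the function of interest, although reading the full extended space typically requires root privileges.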

1.4 Benefits and Advantages of ARI

Breaking the function count limitation: This is the most direct benefit. It allows a physical device to integrate up to 256 independent PCIe functions without using multiple Device Numbers and additional bridging chips.

Simplifying design and reducing costs: In the era before ARI, to achieve more than 8 functions, designers had to use a virtual PCI-to-PCI bridge to create new Device Numbers, which increased design complexity and costs. ARI eliminates this need.

Improving bus utilization: Because additional functions no longer have to sit behind extra bridges, each of which consumes a Bus Number, the allocation of Bus Numbers becomes more efficient. In a complex system, Bus Numbers are also a limited resource (though more abundant than Device Numbers).

Better suited for highly integrated SoCs and multifunction devices: Modern SoCs and smart network cards, accelerator cards, etc., often integrate dozens or even hundreds of functional modules. ARI provides ideal hardware support for this architecture.

1.5 Application Scenarios

A modern SmartNIC may include the following functions:

1) Two 100G Ethernet ports (2 functions)

2) RDMA acceleration engine (1 function)

3) Encryption/Decryption engine (1 function)

4) Data compression engine (1 function)

5) Virtualization offload (multiple virtual functions, e.g., 32 VFs)

6) Management controller (1 function)

……

This list easily exceeds 8 functions. Without ARI, this card might need to be presented as multiple PCIe devices, making management very complex. With ARI enabled, all these functions can be easily managed under the same PCIe device using Function Numbers 0-255, simplifying driver loading and resource allocation.

1.6 Relationship with SR-IOV and MR-IOV

ARI is particularly important when working in conjunction with SR-IOV and MR-IOV technologies:

SR-IOV: Allows a physical function to create multiple virtual functions. A physical device may support dozens or even hundreds of VFs.

MR-IOV: Allows multiple root nodes to share I/O devices, requiring a greater number of functions.

ARI provides the necessary “address space” for these virtualization technologies, allowing a physical device to accommodate a large number of PFs and VFs without encountering the 8-function bottleneck.
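A small Python sketch shows why VFs run into the 8-function limit so quickly. It uses the SR-IOV rule that the routing ID of VF n (counting from 1) is the PF routing ID plus First VF Offset plus (n - 1) times VF Stride. The concrete values below (PF at bus 0x3b function 0, offset 8, stride 1, 32 VFs) are hypothetical and chosen only for illustration.

```python
def vf_routing_id(pf_rid: int, first_vf_offset: int, vf_stride: int, n: int) -> int:
    """Routing ID of VF number n (1-based) belonging to the given PF."""
    if n < 1:
        raise ValueError("VF numbers start at 1")
    return (pf_rid + first_vf_offset + (n - 1) * vf_stride) & 0xFFFF


PF_RID = 0x3B00          # hypothetical PF: bus 0x3b, function 0
FIRST_VF_OFFSET = 8      # hypothetical value from the SR-IOV capability
VF_STRIDE = 1            # hypothetical value from the SR-IOV capability

for n in (1, 2, 32):
    rid = vf_routing_id(PF_RID, FIRST_VF_OFFSET, VF_STRIDE, n)
    # With ARI, the low 8 bits are the Function Number on the same bus.
    print(f"VF{n}: bus {rid >> 8:#04x}, ARI function {rid & 0xFF}")
```

With these numbers the 32 VFs occupy Function Numbers 8 through 39 on the PF's own bus, which is only addressable when the low 8 bits are interpreted as an ARI Function Number.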

1.7 Conclusion

ARI (Alternative Routing-ID Interpretation) is a seemingly minor but profoundly impactful functional extension in the PCIe protocol. By reinterpreting the format of routing IDs, it elegantly breaks through the hardware limitation of a maximum of 8 functions per device, raising the limit to 256. This feature is crucial for modern highly integrated multifunction devices (especially smart network cards, accelerators, and complex SoCs), simplifying hardware design, reducing costs, and providing a solid foundation for advanced virtualization features like SR-IOV. It is an indispensable component in building high-density, high-performance computing systems.

2 ACS (Access Control Services)

2.1 Technical Background

In traditional PCIe topologies, an implicit assumption is that all endpoint devices located at the same level of the PCIe hierarchy (for example, downstream of the same Switch) trust each other and are managed by a single piece of system software (such as an operating system). In this model, a request initiated by one device (such as a DMA write) can directly reach the memory-mapped resources of another device without passing through the Root Complex, which is known as peer-to-peer (P2P) transfer.

However, in virtualization and multi-tenant environments, this assumption no longer holds. The following issues arise:

1) Security: Can a network card assigned to virtual machine A maliciously tamper with the video memory assigned to virtual machine B through P2P transfer?

2) Reliability: Can a malfunctioning or defective device cause another completely unrelated device to crash through erroneous P2P requests?

3) Isolation: In I/O virtualization, how can we ensure that devices between different virtual machines are completely isolated and cannot interfere with each other?

ACS was introduced to address these issues. It provides a set of hardware mechanisms to implement fine-grained access control within the PCIe topology, ensuring that P2P communication can only occur when explicitly permitted, thereby achieving mandatory isolation.

2.2 How ACS Works

The core idea of ACS is to set “checkpoints” along the path of P2P requests. These checkpoints are typically located at:

1) Downstream Ports of PCIe Switches.

2) Root Ports of the Root Complex.

3) Functions of multi-function devices, including SR-IOV devices.

When a P2P request (for example, a Memory Write from Endpoint A to Endpoint B) attempts to pass through a port that supports ACS, the ACS hardware logic of that port will perform a series of checks, and only requests that pass all checks will be forwarded; otherwise, they will be blocked or redirected.

The key checks performed by ACS include:

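Request Direction Check:

1) Upstream Forwarding: Controls whether a port passes on toward the Root Complex those requests and completions that a component lower in the hierarchy has already redirected upstream.

2) P2P Request Redirect: Controls whether requests travelling from one downstream port toward another are sent upstream to the Root Complex for validation instead of being routed directly to the peer. This is the most critical control for achieving device isolation.

3) P2P Completion Redirect: Applies the same redirection to the completions that answer P2P requests.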

Transfer Type Check:

ACS can independently control different types of requests:

1) P2P Egress Control: Controls whether P2P requests (such as Memory Writes) are allowed to be forwarded to a given peer port. Requests that are not permitted are blocked.

2) Direct Translated P2P: Controls whether P2P memory requests that carry addresses already translated by the Address Translation Service (ATS) may be routed directly to the peer.

3) Translation Blocking: Controls whether upstream memory requests that claim to carry an already translated address are blocked, for use when the device below is not trusted to issue translated requests.

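Source ID Verification (Source Validation):

Determines whether to allow the request based on the Requester ID (i.e., the ID of the device initiating the request). The port checks that the Bus Number in the Requester ID falls within the bus range assigned below that port, which prevents a device from impersonating another requester.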

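Address-Based Checks:

The base ACS controls decide according to where a request is heading rather than against an address allow-list configured on the port. When address-level permission checks are required, P2P requests are redirected upstream so that the Root Complex (typically through its IOMMU) can validate the requested address.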
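The checkpoint idea can be sketched as a small decision function. The following Python model is a deliberate simplification and not the specification's algorithm: the class and function names are invented, completions and Translation Blocking are left out, and the precedence between the controls (Direct Translated P2P overriding the others, and redirect taking effect when both redirect and egress control apply) reflects my reading of the ACS interaction rules and should be verified against the PCIe Base Specification before being relied upon.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Action(Enum):
    FORWARD_DIRECT = "forward directly to the peer"
    REDIRECT_UPSTREAM = "redirect upstream to the Root Complex"
    BLOCK = "block as an ACS violation"


@dataclass(frozen=True)
class DownstreamPort:
    secondary_bus: int                 # first bus number below this port
    subordinate_bus: int               # last bus number below this port
    source_validation: bool = False
    p2p_request_redirect: bool = False
    p2p_egress_control: bool = False
    direct_translated_p2p: bool = False
    # Peer port numbers whose bit is set in the Egress Control Vector.
    egress_blocked_ports: frozenset = frozenset()


def route_p2p_request(port: DownstreamPort, requester_bus: int,
                      target_port: int, translated: bool = False) -> Action:
    """Decide what happens to a P2P memory request arriving from below `port`."""
    # Source Validation: the requester must live in the bus range below the port.
    if port.source_validation and not (
            port.secondary_bus <= requester_bus <= port.subordinate_bus):
        return Action.BLOCK
    # Direct Translated P2P: ATS-translated requests may bypass the other controls.
    if translated and port.direct_translated_p2p:
        return Action.FORWARD_DIRECT
    # P2P Request Redirect: hand the request to the Root Complex for validation.
    if port.p2p_request_redirect:
        return Action.REDIRECT_UPSTREAM
    # P2P Egress Control: block traffic toward peers selected in the vector.
    if port.p2p_egress_control and target_port in port.egress_blocked_ports:
        return Action.BLOCK
    return Action.FORWARD_DIRECT


nic_port = DownstreamPort(secondary_bus=4, subordinate_bus=4,
                          source_validation=True, p2p_egress_control=True,
                          egress_blocked_ports=frozenset({2}))
print(route_p2p_request(nic_port, requester_bus=4, target_port=2))
print(route_p2p_request(replace(nic_port, p2p_request_redirect=True),
                        requester_bus=4, target_port=2))
```

The model keeps each control as a separate boolean on purpose, so that the effect of enabling one control at a time is easy to see.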

2.3 ACS Register Structure and Configuration

The ACS capability is implemented through the ACS Extended Capability structure in the PCIe configuration space. Key registers include:

ACS Capability Register: Indicates which ACS features (e.g., source verification, P2P blocking, etc.) are supported by the port.

ACS Control Register: Software configures this register to enable or disable specific ACS check policies. For example:

Setting P2P Request Redirect Enable to 1 causes P2P requests to be redirected upstream to the Root Complex, where they can be validated (typically by the IOMMU), instead of being routed directly to the peer; setting P2P Egress Control Enable to 1 blocks P2P requests toward the ports selected in the Egress Control Vector.

Egress Control Vector Register (if supported): Provides finer control, allowing P2P policies to be set individually for each downstream port.
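As a rough illustration of how software might interpret these registers, the Python sketch below decodes a 16-bit ACS Capability or ACS Control value into feature names. The bit positions are assumptions based on my understanding of the ACS Extended Capability layout (they match the `PCI_ACS_*` constants in the Linux `pci_regs.h` header), and the helper names are invented for this example.

```python
from typing import List

# Bit positions shared by the ACS Capability and ACS Control registers
# (assumed layout; verify against the PCIe specification).
ACS_BITS = {
    0x0001: "Source Validation",
    0x0002: "Translation Blocking",
    0x0004: "P2P Request Redirect",
    0x0008: "P2P Completion Redirect",
    0x0010: "Upstream Forwarding",
    0x0020: "P2P Egress Control",
    0x0040: "Direct Translated P2P",
}


def decode_acs(value: int) -> List[str]:
    """Names of the ACS features whose bits are set in `value`."""
    return [name for bit, name in ACS_BITS.items() if value & bit]


def missing_features(capability: int, wanted: int) -> List[str]:
    """Features requested in `wanted` that the port does not advertise."""
    return decode_acs(wanted & ~capability)


control = 0x001D   # hypothetical ACS Control register value
print(decode_acs(control))
print(missing_features(capability=0x001F, wanted=0x003D))
```

A value such as 0x001D (Source Validation, P2P Request Redirect, P2P Completion Redirect and Upstream Forwarding) corresponds, as far as I know, to the set of controls that Linux expects on a port before it treats the devices below it as isolated from each other.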

2.4 Role and Significance of ACS

Enhancing virtualization security: In SR-IOV environments, ACS is crucial. It ensures that:

A VF of a physical function cannot interfere with other VFs on the same physical device. Unauthorized direct communication between VFs assigned to different virtual machines is prevented. This is the cornerstone of secure, multi-tenant I/O virtualization.

Improving system reliability: By isolating problematic devices, it prevents them from corrupting data of other critical devices (such as storage controllers, system memory) through P2P transfers, thereby enhancing the robustness of the entire system.

Achieving mandatory isolation: Even if a device driver or guest operating system is flawed or malicious, it cannot bypass the isolation that ACS enforces in hardware, because the ACS controls are programmed by privileged system software (the host operating system or hypervisor) and are out of the guest's reach. This offers a deeper level of defense.

Flexible topology strategies: System software can set different ACS policies on different Switch ports as needed. For example, in a multi-root server, completely independent access rules can be configured for different partitions.

2.5 Practical Application Scenarios

Scenario:

A server in a cloud computing data center, configured with a smart network card that supports SR-IOV and ACS, running multiple tenant virtual machines.

Hardware Environment:

A smart network card that has created multiple VFs;

A GPU card that also supports SR-IOV;

A PCIe Switch that supports ACS.

Configuration:

VM1 is assigned network card VF1 and GPU VF1.

VM2 is assigned network card VF2 and GPU VF2.

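ACS Policy:

On the downstream ports of the Switch connecting the network card and GPU, enable P2P Egress Control so that direct P2P traffic between the two is blocked by default.

However, configure an exception for the network card VF1 and GPU VF1 of VM1 to allow P2P communication (for example, for GPUDirect RDMA). Similarly, configure an exception for the network card VF2 and GPU VF2 of VM2. Because ACS controls apply per port or per function rather than per VF pair, such exceptions typically rely on the IOMMU as well: P2P traffic is either redirected through the Root Complex for validation, or allowed to flow directly only when it carries ATS-translated addresses (Direct Translated P2P).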

Effect:

The network card of VM1 can directly and efficiently write data to the GPU memory of VM1 without loss of performance. At the same time, the network card of VM1 absolutely cannot access the GPU memory of VM2, and vice versa. Even if the drivers in VM1 are controlled by malware, they cannot bypass this hardware isolation. This achieves “high performance while ensuring strict security isolation.”

2.6 Relation to Related Technologies

SR-IOV: ACS and SR-IOV are a “golden pair.” SR-IOV provides virtualization capabilities, while ACS provides essential security isolation for these virtual functions.

ATS/PRI: In systems involving IOMMU, ACS works in conjunction with Address Translation Services to ensure that P2P requests that have undergone address translation are also subject to access control.

IOMMU: ACS performs initial, path-based filtering within the PCIe topology, while the IOMMU, located in the Root Complex, performs address translation and address-based permission checks on the requests that reach it. Together, they form a layered defense system.

2.7 Conclusion

ACS (Access Control Services) is a critical security and isolation feature in the PCIe protocol. It implements fine-grained access control policies through hardware mechanisms at PCIe Switch and Root Complex ports, effectively partitioning the shared PCIe architecture into multiple protected isolation domains. This has evolved from an “advanced feature” to an essential foundational requirement for modern virtualized cloud environments, multi-root systems, and any scenario requiring strong device isolation, making it one of the core technologies for building trusted and reliable data center infrastructure.
