Exploring and Reflecting on the Attack Surface of QEMU Virtualization Security

QEMU and KVM, as typical representatives of virtualization technology, are widely used in cloud computing systems across various vendors. As software with over a decade of history, QEMU has been plagued by security issues. With the continuous development of cloud computing based on QEMU/KVM virtualization software, its security problems have garnered significant attention in recent years.

  • Brief Introduction to Virtualization Concepts

The main idea of virtualization is to abstract complex, hard-to-use underlying resources into simple, easy-to-use ones through layering, and to offer them to the upper layers. In a sense, the history of computing is itself a history of virtualization. A simple example: high-level languages like Python and Java are detached from machine code and define their own instructions, which are interpreted and executed by virtual machines on each platform, achieving cross-platform compatibility. Another example is the TCP/IP protocol stack model: the network card transmits binary data, but after abstraction through the network and transport layers, applications do not deal directly with raw packets. They only need to care about the interface at the top of the protocol stack and need not worry about other programs using the same network card. They hand the data and the destination address to the protocol stack, which automatically handles details such as IP routing and fragmentation.

Each process in a computer is also an abstraction of the computer: each process believes it exclusively owns the resources of the entire system, but this is an illusion created by the operating system. We can think of a process as a kind of virtual machine, in which the operating system lets the process use the physical resources of the whole computer by virtualizing the underlying resources.

Emulators are another form of virtual machine; for example, an emulator can translate the instructions of an ARM program into equivalent x86 instructions that run directly on the hardware CPU. Typical emulators include QEMU and Bochs.


High-level language virtual machines go a step further than emulators. For languages such as Python and Java, the compiler converts source code into bytecode, which is a custom instruction set, and any platform that wants to run these programs must install the corresponding virtual machine. Common examples include Java's JVM and the Python virtual machine. The instruction sets of these virtual machines are public, so anyone can write decompilation tools based on them. If the public instructions are modified, the standard virtual machine can no longer decode them, and decoding must be based on the custom instruction set. This is the principle behind virtual machine protection in software protection.

Processes, emulators, and high-level language virtual machines provide execution environments for instructions, while system virtual machines provide a complete system environment by virtualizing the CPU, memory, peripherals, and so on, and can achieve complete isolation. The software that manages the global physical resources is called a Virtual Machine Monitor (VMM). It allocates hardware resources among virtual machines using time-division or space-division multiplexing. Typical virtualization solutions include VMware, QEMU, VirtualBox, and Hyper-V.
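
The bytecode idea described for high-level language virtual machines can be illustrated with a very small stack machine. This is only a sketch under stated assumptions: the opcode values, the name toy_vm_run, and the 16-slot stack are all invented for illustration and do not correspond to the JVM or the Python virtual machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "public" instruction set of a toy stack machine. */
enum { OP_HALT = 0, OP_PUSH = 1, OP_ADD = 2, OP_MUL = 3 };

/*
 * Interpret `len` bytes of bytecode.
 * Returns 0 and stores the top of the stack in *result on OP_HALT,
 * or -1 if the program is malformed or uses an unknown opcode.
 */
static int toy_vm_run(const uint8_t *code, size_t len, int32_t *result)
{
    int32_t stack[16];
    size_t sp = 0, pc = 0;

    while (pc < len) {
        uint8_t op = code[pc++];

        switch (op) {
        case OP_PUSH:
            /* One operand byte follows the opcode. */
            if (pc >= len || sp >= 16) {
                return -1;
            }
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
        case OP_MUL: {
            if (sp < 2) {
                return -1;
            }
            int32_t b = stack[--sp];
            int32_t a = stack[--sp];
            stack[sp++] = (op == OP_ADD) ? a + b : a * b;
            break;
        }
        case OP_HALT:
            if (sp == 0) {
                return -1;
            }
            *result = stack[sp - 1];
            return 0;
        default:
            /* Opcode not in the public instruction set. */
            return -1;
        }
    }
    return -1; /* ran off the end without OP_HALT */
}
```

A program such as PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT evaluates (2 + 3) * 4. If the opcode numbers are remapped to private values, this interpreter rejects the program at the first byte, which is the effect that virtual machine protection relies on.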

  • QEMU Virtualization Security

The goal of virtualization security is to ensure that each virtual machine is independent and that the security boundaries between virtual machines, and between virtual machines and the host, are not compromised. The security design of a virtualization platform must guard against many threats, such as virtual machines reading data from the host's memory, virtual machines consuming host resources without restriction, special instructions executed inside a virtual machine crashing the host, and ordinary users inside a virtual machine elevating their privileges through the virtualization platform. The most severe issue is virtual machine escape, where a user inside a virtual machine runs code that exploits vulnerabilities in the virtualization platform to execute code on the host.

The virtualization platform is a crucial foundational component in cloud computing and runs numerous untrusted tenant virtual machines. Tenants can execute arbitrary code within their virtual machines, so ensuring the security of the virtualization platform is a vital task for cloud computing platforms.

Most security vulnerabilities are caused by untrusted inputs, which are also referred to as attack surfaces. The inputs to QEMU generally come from two sources: inside the virtual machine and outside the virtual machine.


Attacks from inside the virtual machine are malicious requests issued by users within the virtual machine that exploit vulnerabilities in QEMU/KVM to attack QEMU. Users inside the virtual machine can execute arbitrary code and access any virtual device or virtualized physical resource, so from QEMU's perspective all request data from inside the virtual machine is untrusted. Generally, attacks from inside the virtual machine include executing special instructions, reading special registers, and interacting with simulated devices. Device simulation is the largest attack surface, and most security vulnerabilities in QEMU are caused by it. Device simulation includes traditional fully virtualized device simulation as well as paravirtualized virtio and vhost device simulation.

Another significant source of internal attacks is agents within the virtual machine. Typically, these agents facilitate communication between the virtual machine and the host, such as establishing special channels for sharing the clipboard between the two. Virtualization software often needs to handle a large amount of input from inside the virtual machine, and if that input is not properly validated, vulnerabilities can easily arise that users inside the virtual machine can exploit to attack the virtualization software. The agents of commercial virtualization software, such as VirtualBox's Guest Additions and VMware's vmtools, are relatively complex; the agents of QEMU virtual machines are simpler, so they have significantly fewer vulnerabilities than those of VirtualBox and VMware.

External attacks are malicious data constructed by other components that interact with QEMU, exploiting vulnerabilities in QEMU to attack it. There are many such external components, such as VNC and SPICE clients, which can connect to QEMU remotely. Both VNC and SPICE are remote desktop protocols; for example, QEMU has a built-in VNC server that allows remote VNC clients to connect, giving remote users access to the virtual machine. Data is exchanged between QEMU's VNC server and untrusted clients, including VNC protocol messages and mouse and keyboard data. If QEMU does not perform proper security checks during this process, remote clients may be able to attack QEMU by constructing malicious data.

Another aspect of external attacks involves files accessed by the virtual machine, such as the disk images it uses. QEMU supports various image formats, and opening an image involves various parsing operations. If the parsing code does not validate the image contents, users can trigger parsing vulnerabilities by constructing malicious images to attack QEMU.
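
The kind of check that image parsing code needs can be sketched as follows. This is a minimal illustration under stated assumptions: the ToyImageHeader layout, the magic value, the entry limit, and the function name are hypothetical and do not describe any real QEMU image format. The point is only that every size and offset read from the file is validated against the actual file length before it is used.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAGIC       0x544f5949u /* hypothetical magic value */
#define TOY_MAX_ENTRIES 1024u       /* hypothetical sanity limit */

/* Hypothetical on-disk header of a toy image format (host byte order). */
typedef struct ToyImageHeader {
    uint32_t magic;
    uint32_t table_offset;  /* byte offset of a table of 64-bit entries */
    uint32_t table_entries; /* number of entries in that table */
} ToyImageHeader;

/*
 * Validate an untrusted header against the real file length.
 * Returns 0 and copies the header to *out if the table lies entirely
 * inside the file, -1 otherwise.
 */
static int toy_validate_header(const uint8_t *file, size_t file_len,
                               ToyImageHeader *out)
{
    ToyImageHeader h;
    uint64_t table_bytes;

    if (file_len < sizeof(h)) {
        return -1; /* file too short to contain a header */
    }
    memcpy(&h, file, sizeof(h));

    if (h.magic != TOY_MAGIC) {
        return -1;
    }
    if (h.table_entries == 0 || h.table_entries > TOY_MAX_ENTRIES) {
        return -1; /* reject absurd counts before doing arithmetic */
    }
    if (h.table_offset < sizeof(h)) {
        return -1; /* table must not overlap the header */
    }
    /* 64-bit arithmetic so offset + size cannot wrap around. */
    table_bytes = (uint64_t)h.table_entries * sizeof(uint64_t);
    if ((uint64_t)h.table_offset + table_bytes > file_len) {
        return -1; /* table would extend past the end of the file */
    }

    *out = h;
    return 0;
}
```

A caller would read the start of the image into memory, call toy_validate_header, and only then index into the table. Rejecting the file early keeps attacker-chosen offsets from reaching any later memory access.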

  • QEMU Attack Surface

QEMU security vulnerabilities can be broadly categorized into two aspects: device simulation vulnerabilities and external vulnerabilities.

  • Device Simulation Vulnerabilities

Device simulation vulnerabilities are the most common vulnerabilities found in virtualization platforms to date. QEMU, VirtualBox, VMware, and other virtualization software have all had numerous vulnerabilities caused by device simulation. Virtual device vulnerabilities are so abundant partly because a virtualization platform must simulate a large number of devices to present a complete hardware platform to the virtual machine, and partly because the virtual machine and its virtual devices interact extensively. If the virtualization software does not validate the data passed in from the virtual machine, security vulnerabilities may arise.

The virtual machine interacts with virtual devices through the same I/O ports and PCI device MMIO regions that real hardware uses. At startup, QEMU simulates the entire hardware system and its devices for the virtual machine and specifies the I/O ports and MMIO regions each device requires. When SeaBIOS starts, it allocates specific resources for the devices, such as the I/O ports or MMIO addresses a given device will use. When the virtual machine's kernel starts, it scans the devices and loads the corresponding drivers, which then read and write these ports or MMIO address spaces.

Each device has callback functions for reads and writes to its I/O ports or MMIO address space. Whenever the operating system inside the virtual machine reads or writes these areas, a VM Exit occurs and traps into KVM, which then dispatches the request to QEMU. QEMU calls the callback functions to complete the simulated device access and update the state of the virtual device, and it may also access actual physical devices.

Because many different contributors write simulated devices and most device interfaces are complex, QEMU often fails to fully validate request data when handling these reads and writes, leading to numerous security issues. The data attack flow is illustrated in the following diagram:

[Figure: data attack flow diagram]
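
The callback mechanism described above can be modeled in a few lines of plain C. This is a simplified sketch and not QEMU's real MemoryRegionOps API: the ToyRegion and ToyDevice types and all toy_ functions are hypothetical names chosen for illustration. It shows a dispatcher routing a guest access to a device callback, and a callback that validates the guest-supplied offset because the registered window (64 bytes) is larger than the backing register array (4 bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback types, loosely modeled on read/write handlers. */
typedef uint64_t (*toy_read_fn)(void *opaque, uint64_t addr, unsigned size);
typedef void (*toy_write_fn)(void *opaque, uint64_t addr, uint64_t val,
                             unsigned size);

typedef struct ToyRegion {
    uint64_t base;      /* guest-physical base address of the window */
    uint64_t len;       /* window length in bytes */
    toy_read_fn read;
    toy_write_fn write;
    void *opaque;       /* device state handed back to the callbacks */
} ToyRegion;

#define TOY_NREGS 4

typedef struct ToyDevice {
    uint8_t regs[TOY_NREGS];
} ToyDevice;

static uint64_t toy_dev_read(void *opaque, uint64_t addr, unsigned size)
{
    ToyDevice *d = opaque;

    (void)size;
    if (addr >= TOY_NREGS) {
        return 0xff; /* offset is guest-controlled: never index blindly */
    }
    return d->regs[addr];
}

static void toy_dev_write(void *opaque, uint64_t addr, uint64_t val,
                          unsigned size)
{
    ToyDevice *d = opaque;

    (void)size;
    if (addr >= TOY_NREGS) {
        return; /* ignore writes outside the register file */
    }
    d->regs[addr] = (uint8_t)val;
}

/* Find the region that claims a guest-physical address, or NULL. */
static ToyRegion *toy_find_region(ToyRegion *regions, size_t n, uint64_t gpa)
{
    for (size_t i = 0; i < n; i++) {
        if (gpa >= regions[i].base && gpa - regions[i].base < regions[i].len) {
            return &regions[i];
        }
    }
    return NULL;
}

/* What a VMM does after a VM exit reports a guest write. */
static int toy_dispatch_write(ToyRegion *regions, size_t n, uint64_t gpa,
                              uint64_t val, unsigned size)
{
    ToyRegion *r = toy_find_region(regions, n, gpa);

    if (!r) {
        return -1;
    }
    r->write(r->opaque, gpa - r->base, val, size);
    return 0;
}

/* What a VMM does after a VM exit reports a guest read. */
static int toy_dispatch_read(ToyRegion *regions, size_t n, uint64_t gpa,
                             unsigned size, uint64_t *out)
{
    ToyRegion *r = toy_find_region(regions, n, gpa);

    if (!r) {
        return -1;
    }
    *out = r->read(r->opaque, gpa - r->base, size);
    return 0;
}
```

The dispatcher only guarantees that the offset falls inside the registered window. Whether that offset is valid for the device's internal state is the callback's responsibility, and that is where checks tend to be missed.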

Next, we analyze a specific example of a vulnerability in a simulated device, one that was introduced in QEMU 3.1. The pm_smbus_init function initializes the SMBus simulation and registers an I/O region whose MemoryRegionOps is pm_smbus_ops, as shown in the code below:

// qemu-3.1.0-rc4/hw/i2c/pm_smbus.c
static const MemoryRegionOps pm_smbus_ops = {
    .read = smb_ioport_readb,
    .write = smb_ioport_writeb,
    .valid.min_access_size = 1,
    .valid.max_access_size = 1,
    .endianness = DEVICE_LITTLE_ENDIAN,
};

void pm_smbus_init(DeviceState *parent, PMSMBus *smb, bool force_aux_blk)
{
    smb->op_done = true;
    smb->reset = pm_smbus_reset;
    smb->smbus = i2c_init_bus(parent, "i2c");
    if (force_aux_blk) {
        smb->smb_auxctl |= AUX_BLK;
    }
    memory_region_init_io(&smb->io, OBJECT(parent), &pm_smbus_ops, smb,
                          "pm-smbus", 64);
}

When the virtual machine performs read and write operations on this address space, it will call either smb_ioport_readb or smb_ioport_writeb functions accordingly; here we analyze the latter.

// qemu-3.1.0-rc4/hw/i2c/pm_smbus.c
static void smb_ioport_writeb(void *opaque, hwaddr addr, uint64_t val,
                              unsigned width)
{
    PMSMBus *s = opaque;

    SMBUS_DPRINTF("SMB writeb port=0x%04" HWADDR_PRIx
                  " val=0x%02" PRIx64 "\n", addr, val);
    switch(addr) {
    case SMBHSTSTS:
        s->smb_stat &= ~(val & ~STS_HOST_BUSY);
        if (!s->op_done && !(s->smb_auxctl & AUX_BLK)) {
            uint8_t read = s->smb_addr & 0x01;

            /* Incremented on every such guest write, with no bounds check. */
            s->smb_index++;
            if (!read && s->smb_index == s->smb_data0) {
                ……
            } else if (!read) {
                /* Out-of-bounds write once smb_index passes the buffer. */
                s->smb_data[s->smb_index] = s->smb_blkdata;
                s->smb_stat |= STS_BYTE_DONE;
            } else if (s->smb_ctl & CTL_LAST_BYTE) {
                s->op_done = true;
                /* Out-of-bounds read, returned to the guest via smb_blkdata. */
                s->smb_blkdata = s->smb_data[s->smb_index];
                s->smb_index = 0;
                s->smb_stat |= STS_INTR;
                s->smb_stat &= ~STS_HOST_BUSY;
            } else {
                s->smb_blkdata = s->smb_data[s->smb_index];
                s->smb_stat |= STS_BYTE_DONE;
            }
        }
        break;
    ……
    default:
        break;
    }

 out:
    if (s->set_irq) {
        s->set_irq(s, smb_irq_value(s));
    }
}

When processing a write to SMBHSTSTS from the virtual machine, under certain conditions (which the user inside the virtual machine can control) s->smb_index is incremented, but no bounds check is performed on it. If a malicious user inside the virtual machine keeps increasing this value, it can become very large. In the SMBus simulation, smb_index is used to index the s->smb_data array, which is only PM_SMBUS_MAX_MSG_SIZE (32) bytes long. Therefore, once smb_index reaches 32 or more, accesses fall outside the smb_data array. The SMBHSTSTS handling contains both reads and writes of s->smb_data[s->smb_index], so this vulnerability lets the virtual machine read and write the memory that follows s->smb_data in the QEMU process at will. Since smb_index is a 32-bit value, the address range that can theoretically be read and written is 4GB. Out-of-bounds reads reveal the layout of QEMU's memory and bypass ASLR, and out-of-bounds writes can then take control of the program counter of the QEMU process, achieving control over the code flow.

In fact, this vulnerability allows a complete virtual machine escape. Unlike the 2015 VENOM vulnerability, which was only an out-of-bounds write, this vulnerability can also reliably read QEMU's memory and therefore bypass ASLR. However, it was introduced early in the QEMU 3.1 development cycle and was patched at the last moment before the release of 3.1, so it does not exist in any released version of QEMU. Nevertheless, it is undoubtedly one of the most severe vulnerabilities in QEMU's history.

The fix is quite simple; it only requires a length check after the increment.

if (s->smb_index >= PM_SMBUS_MAX_MSG_SIZE) {
    s->smb_index = 0;
}
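
The effect of that check can be reproduced in a small standalone model. This is a sketch under stated assumptions, not QEMU code: ToySMBus, toy_block_write_step, and TOY_MAX_MSG_SIZE are hypothetical names that mirror only the index-and-buffer relationship described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_MSG_SIZE 32 /* mirrors the 32-byte smb_data buffer */

/* Hypothetical, stripped-down stand-in for the SMBus device state. */
typedef struct ToySMBus {
    uint8_t  data[TOY_MAX_MSG_SIZE];
    uint32_t index;   /* advanced by guest-triggered writes */
    uint8_t  blkdata; /* byte supplied by the guest */
} ToySMBus;

/*
 * One guest-triggered block-write step with the bounds check applied.
 * Returns true if the index had to be wrapped back to 0.
 * Without the check, the 32nd step would write to data[32], one byte
 * past the buffer, and later steps would walk further out of bounds.
 */
static bool toy_block_write_step(ToySMBus *s)
{
    bool wrapped = false;

    s->index++;
    if (s->index >= TOY_MAX_MSG_SIZE) {
        s->index = 0;
        wrapped = true;
    }
    s->data[s->index] = s->blkdata;
    return wrapped;
}
```

Calling toy_block_write_step in a loop shows that the index never leaves the range 0 to 31 no matter how many times the guest triggers the step, which is exactly the property the missing check failed to guarantee.
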
  • External Vulnerabilities

Although device simulation vulnerabilities account for the majority of all QEMU vulnerabilities, QEMU also has other attack surfaces, such as VNC. VNC is a remote desktop sharing system based on the RFB protocol. QEMU has a built-in VNC server that accepts client requests, and any VNC client can connect to a QEMU virtual machine. Clients send mouse and keyboard control information to the server through the RFB protocol, and the server uses it to update the corresponding state of the virtual machine. If the QEMU server has vulnerabilities in how it processes client data, malicious clients can send constructed data to trigger them and attack QEMU.

CVE-2015-8504 is a typical example of this type of vulnerability. It occurs when QEMU processes the SetPixelFormat message, and the vulnerable code is as follows.

// qemu-2.4.0/ui/vnc.c
static void set_pixel_format(VncState *vs,
                             int bits_per_pixel, int depth,
                             int big_endian_flag, int true_color_flag,
                             int red_max, int green_max, int blue_max,
                             int red_shift, int green_shift, int blue_shift)
{
    ……
    vs->client_pf.rmax = red_max;
    vs->client_pf.rbits = hweight_long(red_max);
    vs->client_pf.rshift = red_shift;
    vs->client_pf.rmask = red_max << red_shift;
    vs->client_pf.gmax = green_max;
    vs->client_pf.gbits = hweight_long(green_max);
    vs->client_pf.gshift = green_shift;
    vs->client_pf.gmask = green_max << green_shift;
    vs->client_pf.bmax = blue_max;
    ……
}

// qemu-2.4.0/ui/vnc-enc-tight.c
static void write_png_palette(int idx, uint32_t pix, void *opaque)
{
    ……
    if (vs->tight.pixel24) {
        ……
    } else {
        ……
        color->red = ((red * 255 + vs->client_pf.rmax / 2) /
                      vs->client_pf.rmax);
        color->green = ((green * 255 + vs->client_pf.gmax / 2) /
                        vs->client_pf.gmax);
        color->blue = ((blue * 255 + vs->client_pf.bmax / 2) /
                       vs->client_pf.bmax);
    }
}

The set_pixel_format function handles the VNC_MSG_CLIENT_SET_PIXEL_FORMAT message, and its parameters red_max, green_max, and blue_max come from the remote client and can be arbitrary values. These values are assigned to the corresponding members of vs->client_pf. In the subsequent call to write_png_palette, they are used as divisors. If a malicious remote client sets red_max to 0, a division by zero occurs in write_png_palette, crashing the QEMU process and with it the virtual machine.

The fix for this vulnerability is also relatively simple; it only requires replacing a value of 0 with 0xff during assignment.

vs->client_pf.rmax = red_max ? red_max : 0xff;
vs->client_pf.gmax = green_max ? green_max : 0xff;
vs->client_pf.bmax = blue_max ? blue_max : 0xff;
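
The arithmetic involved can be checked in isolation. The following is a minimal sketch, assuming hypothetical helper names (toy_sanitize_max, toy_scale_channel) that are not part of QEMU; it reuses the same rounding formula as write_png_palette and the same 0xff fallback as the fix.

```c
#include <stdint.h>

/* Replace a client-supplied maximum of 0 with 0xff, as the fix does. */
static uint32_t toy_sanitize_max(uint32_t client_max)
{
    return client_max ? client_max : 0xff;
}

/*
 * Scale a channel value from the range [0, max] to [0, 255], rounding
 * to nearest. The caller must pass a sanitized, non-zero max and a
 * value no larger than max.
 */
static uint8_t toy_scale_channel(uint32_t value, uint32_t max)
{
    /* 64-bit intermediate so value * 255 cannot overflow. */
    return (uint8_t)(((uint64_t)value * 255 + max / 2) / max);
}
```

With the sanitizing step in front, a client that sends a maximum of 0 ends up dividing by 0xff instead of by zero, so the worst outcome is a wrong color rather than a crashed process.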

Other remote desktop protocols also have similar issues, such as the CVE-2016-9578 vulnerability in SPICE, which is an integer overflow vulnerability caused by insufficient client validation.
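
Integer overflows of this general class usually appear when a size is computed from client-supplied fields. The sketch below shows one common defensive pattern; it is not taken from SPICE, and the function name and parameters are hypothetical. The product of two 32-bit fields is checked before it is computed, so a wrapped-around total can never be used as an allocation or copy length.

```c
#include <stdint.h>

/*
 * Compute count * record_size for two untrusted 32-bit fields.
 * Returns 0 and stores the product in *total if it fits in 32 bits,
 * or -1 if the multiplication would overflow.
 */
static int toy_checked_total(uint32_t count, uint32_t record_size,
                             uint32_t *total)
{
    if (record_size != 0 && count > UINT32_MAX / record_size) {
        return -1; /* product would wrap around */
    }
    *total = count * record_size;
    return 0;
}
```

For example, a count of 0x10000 with a record size of 0x10000 would wrap to 0 in unchecked 32-bit arithmetic; here it is rejected before any buffer is sized from it.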

  • Reflections on QEMU Security

Since the VENOM vulnerability in QEMU was disclosed in 2015, and virtualization software such as VirtualBox, VMware, and Hyper-V was subsequently included in various security competitions, more and more people have begun systematic, in-depth research on virtualization security, and vulnerabilities have been found in a growing number of virtualization platforms. As the earlier examples show, the greatest threats and the most vulnerabilities in QEMU lie in device simulation. For other virtualization software, device simulation is also a hotspot for vulnerabilities. The root cause is similar in each case: requests from within the virtual machine are not properly validated. A device that is vulnerable in one virtualization product may have similar vulnerabilities in another. For example, the e1000 network card is the default network card of both QEMU and VirtualBox, so studying QEMU's e1000 simulation is also useful for researching VirtualBox's implementation.

To fundamentally address security vulnerabilities in virtualization software, several approaches can be taken:

Reduce the attack surface. On one hand, some device simulations can be moved into the kernel, where the code undergoes strict security audits; for instance, the APIC and I/O APIC interrupt controllers are now simulated in the kernel by default. On the other hand, the use of simulated devices should be minimized.

Use new, lightweight virtualization solutions. When KVM was first introduced, choosing QEMU as its user-space component seemed reasonable. For historical reasons, however, QEMU aims to provide comprehensive simulation across many platforms, not only for cloud computing, and this brings some "burdens." Many vendors are working to reduce them; for example, Intel has proposed NEMU, which aims to be a virtualization platform specifically for cloud computing. It is a trimmed-down QEMU that removes the simulation of platforms other than x86 and ARM, as well as some less commonly used devices, retaining only the most basic and essential functionality.

Employ modern memory-safe programming languages such as Rust. Many virtualization platforms are currently implemented in C/C++, which makes them prone to memory-safety vulnerabilities. Developing virtualization software in memory-safe languages is gradually gaining traction; for instance, Google's crosvm is a virtualization platform written in Rust, and Amazon has combined the two approaches to develop Firecracker based on crosvm.

  • Reference

“QEMU/KVM Source Code Analysis and Application” — Edited by Li Qiang
