Introduction to IoT Firmware Vulnerability Research

With the advent of the 5G era, the Internet of Things (IoT) is playing an increasingly important role, and it brings more security risks with it. IoT security covers a wide range of topics. This series of articles discusses the author’s understanding of IoT vulnerability research from a technical perspective, across five dimensions: firmware, web, hardware, IoT protocols, and mobile applications. The author’s knowledge is limited, so corrections and additions are welcome.

Basics of IoT Firmware

Firmware is the first topic because it is relatively fundamental, and IoT vulnerability research can rarely avoid it. This part covers four topics: firmware decryption (if the firmware is encrypted), unpacking and repacking, simulation, and overall security assessment of firmware.

1.1 Firmware Decryption

Some IoT devices encrypt or even sign their firmware, both to raise the bar for researchers and to secure the upgrade process. Since encryption and decryption are resource-intensive, such devices generally have better hardware specifications, for example some routers and firewalls.

1.1.1 Determining Firmware Encryption

Determining whether firmware is encrypted is relatively simple. Experienced researchers can spot certain characteristics in a binary editor. Generally, the following features may be present.

Apart from the firmware header, the data contains no readable strings; the bit frequency of the data is almost uniform, i.e. the data has high entropy; binwalk -e cannot parse any firmware structure; and binwalk -A does not recognize instructions for any CPU architecture.

If the above characteristics are met, it can be suspected that the firmware is encrypted. Firmware decryption is usually approached from the angles below, although other methods exist.
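The first two characteristics can be checked roughly in a few lines of Python. This is only a sketch: the header length of 0x100 and both thresholds are assumptions chosen for illustration, and compressed data scores almost as high as encrypted data, so the result should be read together with the binwalk output.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def printable_ratio(data: bytes) -> float:
    """Return the fraction of bytes that are printable ASCII (0x20-0x7E)."""
    if not data:
        return 0.0
    return sum(0x20 <= b <= 0x7E for b in data) / len(data)


def looks_encrypted(data: bytes, header_len: int = 0x100,
                    min_entropy: float = 7.9,
                    max_printable: float = 0.45) -> bool:
    """Heuristic only: header_len and both thresholds are assumed values."""
    body = data[header_len:]
    return (shannon_entropy(body) >= min_entropy
            and printable_ratio(body) <= max_printable)
```

A typical call would be `looks_encrypted(open("firmware.bin", "rb").read())`, where the file name is a placeholder.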

1.1.2 Obtaining Keys from Hardware

This method applies when the firmware is kept encrypted at rest and is only decrypted and unpacked by the system during startup, and the device offers no means of dynamic debugging (UART, JTAG, etc.). Because the complete decryption routine resides in flash, one can read the flash with a programmer, reverse engineer the decryption algorithm and key, and decrypt the firmware. For example, the flash read out from a certain device is laid out as follows:

0x000000-0x020000 boot section
0x020000-0x070000 encrypt section
0x070000-0x200000 encrypt section
0x200000-0x400000 config section
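Given a raw dump from the programmer, the layout above can be carved into separate files for analysis. The following is a minimal sketch; the section labels are made up for readability and the offsets are simply copied from the layout above.

```python
from typing import Dict, Tuple

# Offsets copied from the layout above; the dictionary keys are just labels.
FLASH_LAYOUT: Dict[str, Tuple[int, int]] = {
    "boot": (0x000000, 0x020000),
    "encrypt_1": (0x020000, 0x070000),
    "encrypt_2": (0x070000, 0x200000),
    "config": (0x200000, 0x400000),
}


def split_flash(dump: bytes,
                layout: Dict[str, Tuple[int, int]] = FLASH_LAYOUT) -> Dict[str, bytes]:
    """Slice a full flash dump into named sections (end offsets are exclusive)."""
    sections = {}
    for name, (start, end) in layout.items():
        if end > len(dump):
            raise ValueError("dump too short for section %r" % name)
        sections[name] = dump[start:end]
    return sections
```

The "boot" slice is then the part to load into a disassembler.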

Clearly, the decryption routine we need is in the boot section, where we must find the algorithm and the key. Encryption generally uses a public block cipher such as AES, so the crux is to find the block mode, the IV (for non-ECB modes), and the key. After loading the boot section into IDA Pro, the code was not recognized automatically. The interrupt vector table at the beginning of ARM code can be identified manually; the common entry code is as follows:

.globl _start
_start:
    b       reset
    ldr     pc, _undefined_instruction
    ldr     pc, _software_interrupt
    ldr     pc, _prefetch_abort
    ldr     pc, _data_abort
    ldr     pc, _not_used
    ldr     pc, _irq
    ldr     pc, _fiq
...
_irq:
        .word irq

Reverse engineering from this point reveals that the encryption algorithm is AES and that the key is derived from the SHA256 hash of the device’s serial number. The structure recognized by IDA Pro will be discussed later, in the RTOS section. Devices that encrypt firmware this way offer a high level of security, and generally only decrypt during upgrades for verification.
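As an illustration of this kind of key derivation, the sketch below hashes a serial number with SHA256. How the serial is encoded, and whether the full 32-byte digest or only part of it is used as the AES key, are not stated above; the choices here are assumptions.

```python
import hashlib


def derive_key_from_serial(serial: str) -> bytes:
    """Return SHA256(serial) as a 32-byte candidate key.

    Assumptions: the serial is hashed as plain ASCII with no salt or
    terminator, and the whole digest is used as the key.
    """
    return hashlib.sha256(serial.encode("ascii")).digest()
```

If the device used AES-128 instead, the key would more likely be a 16-byte slice of this digest, which would have to be confirmed in the disassembly.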

1.1.3 Debugging with Direct Read

This method is the easiest to understand: after the device boots, use UART, JTAG, a console, or the network to read the firmware (still packed) back from the running device, thus bypassing the decryption stage. The device must of course expose these interfaces, and the specific method varies by device. The use of these interfaces will be introduced in the hardware section.

1.1.4 Comparing Boundary Versions

This method applies when the manufacturer initially shipped unencrypted firmware and added a decryption program in a later upgrade, after which upgrades were distributed encrypted. From the series of firmware releases we can then find the boundary version between unencrypted and encrypted firmware, unpack the last unencrypted version, reverse engineer its upgrade program, and recover the encryption process. For example, after downloading the firmware releases of a certain router and unpacking the boundary version, we search for keywords such as “firmware,” “upgrade,” “update,” and “download” to locate the upgrade program. If debugging means are available, we can also run ps during an upgrade to find the upgrade program and its parameters:

/usr/sbin/encimg -d -i <fw_path> -s <image_sign>
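The keyword search over an unpacked file system can be approximated with a small script. This sketch matches the four keywords from the text case-insensitively against file names and file contents; the function name and return format are illustrative choices, not part of any existing tool.

```python
import os
from typing import Dict, List, Sequence

KEYWORDS = (b"firmware", b"upgrade", b"update", b"download")


def find_upgrade_candidates(root: str,
                            keywords: Sequence[bytes] = KEYWORDS) -> Dict[str, List[str]]:
    """Map each regular file under root to the keywords found in its name or content."""
    hits: Dict[str, List[str]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files that are common in extracted rootfs trees.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    content = handle.read().lower()
            except OSError:
                continue
            haystack = name.lower().encode("utf-8", "replace") + b"\n" + content
            matched = [kw.decode("ascii") for kw in keywords if kw in haystack]
            if matched:
                hits[path] = matched
    return hits
```

Files that match several keywords at once, especially executables under sbin directories, are usually the first ones worth opening in a disassembler.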

By reverse engineering the encimg program with IDA Pro, we quickly obtain the code for the encryption and decryption process, which uses the AES CBC mode:

int AES_set_decrypt_key(
   // user-supplied key
   const unsigned char *userKey,
   // key size in bits
   const int bits,
   // key schedule struct filled in here and used by
   // the decryption function
   AES_KEY *key
);

void AES_cbc_encrypt(
   // input buffer
   const unsigned char *in,
   // output buffer
   unsigned char *out,
   // buffer length
   size_t length,
   // key struct returned by the previous function
   const AES_KEY *key,
   // initialization vector
   unsigned char *ivec,
   // AES_ENCRYPT or AES_DECRYPT
   const int enc
);

1.1.5 Reverse Engineering the Upgrade Program

This method applies once the upgrade program has been obtained through debugging interfaces or boundary versions. Tools that detect block-cipher constant tables (S-boxes) can be used to identify the encryption algorithm and locate it in the binary. Binwalk can also handle certain simple cases, such as the firmware of a certain industrial control HMI:

iot@attifyos ~/Documents> binwalk hmis.tar.gz
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
34       0x22        OpenSSL encyption, salted, salt:0x5879382A7
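The S-box detection idea mentioned above can be sketched as a plain byte search. The two constants are the first 16 bytes of the standard AES S-box and inverse S-box; the function names are illustrative. Implementations that use precomputed T-tables or hardware AES instructions will not contain these byte sequences, so an empty result does not rule out AES.

```python
from typing import Dict, List

# First row (16 bytes) of the standard AES S-box and inverse S-box.
AES_SBOX_HEAD = bytes.fromhex("637c777bf26b6fc53001672bfed7ab76")
AES_INV_SBOX_HEAD = bytes.fromhex("52096ad53036a538bf40a39e81f3d7fb")


def find_all(data: bytes, needle: bytes) -> List[int]:
    """Return every offset at which needle occurs in data."""
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


def scan_for_aes_tables(data: bytes) -> Dict[str, List[int]]:
    """Report offsets of byte sequences that look like AES lookup tables."""
    return {
        "sbox": find_all(data, AES_SBOX_HEAD),
        "inv_sbox": find_all(data, AES_INV_SBOX_HEAD),
    }
```

Cross-references to a reported offset in the disassembler usually lead straight to the cipher routine and, from there, to its callers that set up the key and IV.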

By loading the upgrade program into a disassembler and locating the OpenSSL call, the decryption command is easy to recover.

1.1.6 Exploiting Vulnerabilities to Obtain Keys

If no boundary version can be found, and debugging interfaces are unavailable or one is unfamiliar with hardware debugging, one can consider exploiting vulnerabilities in historical versions to gain control of the device, then retrieve the upgrade program and reverse engineer the encryption algorithm. This method is somewhat opportunistic, since it requires a historical firmware version with an RCE vulnerability. By downgrading the device to that version, one can exploit the vulnerability to gain access, download the required upgrade program, and then reverse engineer the encryption algorithm.

1.2 Firmware Unpacking

Beginners in IoT security research may find firmware unpacking simple, as it can be done directly with binwalk -Me, but the reality is often more complex. After extensive firmware testing, one will find that binwalk fails to unpack in many cases. IoT firmware generally falls into two categories: one with a file system, mostly based on Linux/BSD, and the other is an integrated firmware, which we refer to as RTOS (Real-time operating system).

1.2.1 Firmware with File System

Binwalk is well known, and using binwalk can directly obtain the rootfs file system, which will not be elaborated here. The author believes that the power of binwalk lies in its ability to parse and identify multiple header formats, providing references for unpacking. The following are a few situations that require some detours. Of course, firmware varies widely, depending on the designer’s design, and cannot be listed exhaustively.

1.2.1.1 UBI (Unsorted Block Image)

UBI format firmware is relatively common, and binwalk cannot unpack it directly. However, ready-made tools such as ubi_reader are available online. One point to note:

When unpacking with ubi_reader, the size of the UBI file must be a multiple of 1024 bytes; pad or trim the file to align it.

For example, by analyzing a certain router, it was found that its rootfs is in UBI format:

# binwalk ROM/wifi_firmware_c91ea_1.0.50.bin
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
684           0x2AC           UBI erase count header, version: 1, EC: 0x0, VID header offset: 0x800, data offset: 0x1000

First install ubi_reader:

$ sudo apt-get install liblzo2-dev
$ sudo pip install python-lzo
$ git clone https://github.com/jrspruitt/ubi_reader
$ cd ubi_reader
$ sudo python setup.py install

Or directly:

$ sudo pip install ubi_reader

Then extract the UBI image at the offset reported by binwalk and unpack it with ubireader_extract_files [options] path/to/file.
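Carving and aligning the UBI image can be sketched as follows. The "UBI#" magic is the start of a UBI erase count header; padding with 0xFF bytes is an assumption made here (erased flash reads as 0xFF), and in some images trailing data has to be trimmed instead.

```python
UBI_EC_MAGIC = b"UBI#"  # magic bytes at the start of a UBI erase count header


def carve_ubi(image: bytes, offset: int, align: int = 1024,
              pad_byte: bytes = b"\xff") -> bytes:
    """Cut the UBI image out of a firmware file and pad it to a multiple of align."""
    if image[offset:offset + 4] != UBI_EC_MAGIC:
        raise ValueError("no UBI erase count header at offset 0x%X" % offset)
    blob = image[offset:]
    remainder = len(blob) % align
    if remainder:
        # Assumed padding strategy: append 0xFF bytes up to the next boundary.
        blob += pad_byte * (align - remainder)
    return blob
```

For the router above, the call would be `carve_ubi(data, 0x2AC)`, and the result is written to a file that is then passed to ubireader_extract_files.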

1.2.1.2 PFS

Some firmware can have headers recognized by binwalk, but cannot be unpacked. For example, the following firmware:

iot@attifyos ~/Documents> binwalk -Me v2912_389.all

Scan Time:     2020-11-04 18:39:13
Target File:   /home/iot/Documents/v2912_389.all
MD5 Checksum:  180c60197aae7e272191695e906c941e
Signatures:    396

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
1546799       0x179A2F        gzip compressed data, last modified: 2042-04-26 20:13:56 (bogus date)
1717744       0x1A35F0        LZ4 compressed data
4171513       0x3FA6F9        SHA256 hash constants, little endian
4179098       0x3FC59A        Copyright string: "Copyright (c) 1998-2000 by XXXXX Corp."
4214532       0x404F04        Base64 standard index table
4224780       0x40770C        HTML document header
4232369       0x4094B1        SHA256 hash constants, little endian
4307839       0x41BB7F        SHA256 hash constants, little endian
4314017       0x41D3A1        XML document, version: "1.0"
4702230       0x47C016        Base64 standard index table
4707197       0x47D37D        Certificate in DER format (x509 v3), header length: 4, sequence length: 873
4727609       0x482339        Base64 standard index table
4791281       0x491BF1        PFS filesystem, version 1.0, 12886 files
4807401       0x495AE9        Base64 standard index table
...
iot@attifyos ~/Documents> ls _v2912_389.all.extracted/pfs-root/000/
WEBLOGIN.HTM  _WEBLOGIN.HTM.extracted/

Binwalk recognizes the PFS header, but the extracted output contains nothing usable. At this point one can analyze the format manually or look for a dedicated tool. Such tools can be found online (draytools in this case), and following their usage prompts unpacks the firmware.

iot@attifyos ~/D/draytools> python draytools.py -F v2910_61252.all
v2910_61252.all.out written, 12816484 [0x00C39064] bytes
FS extracted to [/home/iot/Documents/draytools/fs_out], 429 files extracted

Here is a brief look at the key code for firmware unpacking. The key is to find headers like ‘\xA5\xA5\xA5\x5A\xA5\x5A’ and then unpack and decompress according to the specific format. Thus, firmware unpacking ultimately boils down to data format analysis.

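def decompress_firmware(data):
    # Python 2 excerpt from draytools; data is the raw firmware image
    flen = len(data)
    # Look for the primary signature, then fall back to the alternate one
    sigstart = data.find('\xA5\xA5\xA5\x5A\xA5\x5A')
    if sigstart <= 0:
        sigstart = data.find('\x5A\x5A\xA5\x5A\xA5\x5A')
    if sigstart > 0:
        if draytools.verbose:
            print 'Signature found at [0x%08X]' % sigstart
        # A 4-byte big-endian size follows the 6-byte signature,
        # and the LZO-compressed data follows the size field
        lzosizestart = sigstart + 6
        lzostart = lzosizestart + 4
        lzosize = unpack('>L', data[lzosizestart:lzostart])[0]
        # Keep the bytes from 0x100 up to sigstart+2 unchanged, then append
        # the decompressed block; '\xF0' plus a packed 0x1000000 is
        # prepended as a header for pydelzo.decompress
        return data[0x100:sigstart+2] \
            + pydelzo.decompress('\xF0' + pack(">L",0x1000000) \
                + data[lzostart:lzostart+lzosize])
    ...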
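The format-analysis step in the excerpt above (find the signature, then read the size field) can be reproduced in Python 3 without the decompressor. This is a sketch that mirrors the excerpt's logic, including its treatment of offset 0 as "not found"; the function name is made up, and the actual LZO decompression is left to the original tool.

```python
import struct
from typing import Optional, Tuple

# The two 6-byte signatures checked by the excerpt above.
SIGNATURES = (b"\xA5\xA5\xA5\x5A\xA5\x5A", b"\x5A\x5A\xA5\x5A\xA5\x5A")


def locate_lzo_block(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset of compressed data, compressed size), or None if no signature."""
    for sig in SIGNATURES:
        sig_off = data.find(sig)
        if sig_off > 0:  # like the excerpt, offset 0 is not accepted
            size_off = sig_off + len(sig)
            data_off = size_off + 4
            if data_off > len(data):
                continue
            # 4-byte big-endian length of the compressed block
            (size,) = struct.unpack(">L", data[size_off:data_off])
            return data_off, size
    return None
```

Printing the returned offset and size for several firmware versions is a quick way to confirm that the assumed field layout holds before attempting decompression.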

1.2.1.3 OpenWrt Lua

Parsing Lua structures may not strictly belong here, but given the large user base of OpenWrt, it is briefly mentioned. Lua is a lightweight scripting language that is easy to embed and extend, and it is used in OpenWrt development. It is worth noting that on some devices the Lua files are not plain text but compiled (or obfuscated) bytecode, which requires luadec for decompilation. The Lua bytecode in OpenWrt differs slightly from that produced by a stock Lua 5.1 build, so several patches are needed for luadec to work correctly. The commands are as follows:

$ cd ..
$ mkdir luadec
$ cd luadec/
$ git clone https://github.com/viruscamp/luadec
$ cd luadec/
$ git submodule update --init lua-5.1
$ cd lua-5.1
$ make linux
$ make clean
$ mkdir patch
$ cd patch/
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/010-lua-5.1.3-lnum-full-260308.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/030-archindependent-bytecode.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/011-lnum-use-double.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/015-lnum-ppc-compat.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/020-shared_liblua.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/040-use-symbolic-functions.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/050-honor-cflags.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/100-no_readline.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/200-lua-path.patch
$ wget https://dev.openwrt.org/export/HEAD/trunk/package/utils/lua/patches/300-opcode_performance.patch
$ cd ..
$ mv patch/ patches
$ for i in ./patches/*.patch; do patch -p1 <$i ; done
$ make linux

Modify lua-5.1/src/Makefile:

# USE_READLINE=1
  +PKG_VERSION = 5.1.5
  -CFLAGS= -O2 -Wall $(MYCFLAGS)
  +CFLAGS= -fPIC -O2 -Wall $(MYCFLAGS)
  - $(CC) -o $@ -L. -llua $(MYLDFLAGS) $(LUA_O) $(LIBS)
  + $(CC) -o $@ $(LUA_O) $(MYLDFLAGS) -L. -llua $(LIBS)
  - $(CC) -o $@ -L. -llua $(MYLDFLAGS) $(LUAC_O) $(LIBS)
  + $(CC) -o $@ $(LUAC_O) $(MYLDFLAGS) -L. -llua $(LIBS)

Then execute:

$ make linux
$ ldconfig
$ cd ../luadec
$ make LUAVER=5.1
$ sudo cp luadec /usr/local/bin/

Using luadec to display the code structure:

$ luadec -pn squashfs-root/usr/lib/lua/luci/sgi/uhttpd.lua
0
  0_0
    0_0_0
    0_0_1
    0_0_2

It is important to note that luadec builds are architecture-dependent, and the official luadec cannot parse Lua files taken from ARM devices. Tools that handle this case are available online and are not covered here.

1.2.2 RTOS

Many IoT devices adopt RTOS (Real-time Operating System) architecture, where the firmware itself is an executable file and does not contain a file system, loading and running directly upon startup. The most important points for RTOS analysis are:

(1) Firmware program entry
(2) Firmware program symbols

1.2.2.1 VxWorks

Let’s start with VxWorks, which is widely used and has recognizable patterns. VxWorks is a real-time operating system from Wind River Systems, widely used in communication, military, aerospace, and embedded devices. Because its image layout is fairly standardized, it is easy to identify. The following is an example of such firmware:

iot@attifyos ~/Documents> binwalk image_vx5.bin

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
335280        0x51DB0         PEM certificate
...
3721556       0x38C954        GIF image data, version "89a", 10 x 210
8518936       0x81FD18        VxWorks operating system version "5.5.1" , compiled: "Mar  5 2015, 15:56:18"
9736988       0x94931C        SHA256 hash constants, little endian
...
13374599      0xCC1487        Copyright string: "Copyright  1999-2001 Wind River Systems."
13387388      0xCC567C        VxWorks symbol table, big endian, first entry: [type: function, code address: 0xF4A09A00, symbol address: 0xF813C800]
13391405      0xCC562D        VxWorks symbol table, little endian, first entry: [type: function, code address: 0xB8BD, symbol address: 0xD000C800]

Binwalk has already identified the firmware as VxWorks 5.5.1 and reported the location of the symbol table. First, we need to identify the firmware entry point. If the firmware were packaged in ELF format, we could obtain the base address directly with readelf. That is clearly not the case here.

iot@attifyos ~/Documents> readelf -a image_vx5_arm_little_eniadn.bin
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
iot@attifyos ~/Documents> binwalk -A image_vx5.bin |more

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
244           0xF4            ARM instructions, function prologue
408           0x198           ARM instructions, function prologue
440           0x1B8           ARM instructions, function prologue
472           0x1D8           ARM instructions, function prologue
608           0x260           ARM instructions, function prologue

Using binwalk -A we find that the firmware architecture is ARM, and then load it directly into IDA Pro. Analyzing the firmware’s initial jump shows that the loading address is 0x1000. For VxWorks, common methods to determine the base address include:

Analyzing the initialization code at the firmware header to find the first function usrInit of VxWorks, and finding the BSS boundary based on BSS initialization characteristics to calculate the firmware loading address.
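A candidate loading address can also be sanity-checked against the symbol table that binwalk located. The sketch below assumes a 16-byte VxWorks 5.x symbol entry laid out as list link (4 bytes), name pointer (4), value pointer (4), group (2), type (1), padding (1). That layout, the function names, and the scoring rule are assumptions for illustration; tools such as vxhunter implement this far more thoroughly.

```python
import struct
from typing import Optional

ENTRY_SIZE = 16  # assumed size of one VxWorks 5.x symbol table entry


def read_cstring(image: bytes, offset: int, max_len: int = 128) -> Optional[str]:
    """Return the printable NUL-terminated string at offset, or None."""
    if not 0 <= offset < len(image):
        return None
    end = image.find(b"\x00", offset, offset + max_len)
    if end <= offset:  # no terminator in range, or empty string
        return None
    raw = image[offset:end]
    if all(0x21 <= b <= 0x7E for b in raw):
        return raw.decode("ascii")
    return None


def score_base(image: bytes, symtab_off: int, base: int,
               count: int = 64, big_endian: bool = False) -> int:
    """Count entries whose name and value pointers resolve inside the image."""
    fmt = (">" if big_endian else "<") + "IIIHBB"
    resolved = 0
    for i in range(count):
        off = symtab_off + i * ENTRY_SIZE
        if off + ENTRY_SIZE > len(image):
            break
        _link, name_ptr, value_ptr, _group, _type, _pad = struct.unpack_from(fmt, image, off)
        name = read_cstring(image, name_ptr - base)
        if name is not None and 0 <= value_ptr - base < len(image):
            resolved += 1
    return resolved
```

Trying several candidate bases and keeping the one with the highest score is a cheap cross-check; with the correct base nearly all entries should resolve.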

Then, using the position indicated by binwalk, repair the symbol names. The symbol table stores function names and function addresses, and by locating both we can also verify that the base address is correct. For example, the entry shown in the original figures has its function name at 0x00c813f8 and its function address at 0x009aa0f4. Since the base address is architecture-dependent, it will not be elaborated here.

For VxWorks analysis, we can use vxhunter, a plugin that automates the repair of entries and symbols. Taking Ghidra as an example, after loading the firmware, select the vxhunter_firmware_init.py script and the VxWorks version to repair the entries and symbols automatically.

1.2.2.2 U-boot

Boot-type firmware is also a common type of firmware without a file system. For example, many IoT devices use U-boot as the bootloader. Since U-boot is open-source, we can analyze it based on the source code. For some architectures, U-boot can also follow fixed patterns, such as MIPS based on the $gp register, etc.

1.2.2.3 Chip Firmware

Some IoT firmware lacks documentation, making reverse engineering difficult. For example, the firmware of a certain ARM chip, when loaded into IDA Pro, shows no recognized functions, so we need to analyze the firmware as a whole.

The data at offset 0x100 in the firmware is quite interesting: arranged as 4-byte values, the entries all start with 0x2. This is neither code nor ordinary data, so the values are very likely addresses and the region a table, which suggests a base address of 0x200000. After rebasing, we can check the strings and see many that resemble function names. Having found the location of one of them, we can search the firmware binary for the address 0x16852A, the address of wlc_probresp_attach, in little-endian form. The search does find it, again inside a table structure.

After setting the base address in IDA Pro, some cross-references are resolved. Further analysis is quite complex and will not be elaborated here. In fact, the data at 0x100 is a function address table, and there are many such tables in this firmware.
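Both steps of this analysis, guessing the base from a pointer table and searching for references to a known address, can be sketched in a few lines. The mask that decides which high bits count as the "base" and the number of table entries read are assumptions; the function names are illustrative.

```python
import struct
from collections import Counter
from typing import List


def guess_base(image: bytes, table_off: int, count: int = 32,
               mask: int = 0xFFF00000) -> int:
    """Return the most common masked high part of count little-endian words."""
    words = struct.unpack_from("<%dI" % count, image, table_off)
    common, _hits = Counter(w & mask for w in words).most_common(1)[0]
    return common


def find_pointer_refs(image: bytes, address: int) -> List[int]:
    """Return file offsets where address is stored as a little-endian 32-bit word."""
    needle = struct.pack("<I", address)
    offsets = []
    pos = image.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = image.find(needle, pos + 1)
    return offsets
```

In the example above, `guess_base(image, 0x100)` corresponds to the observation about the table at 0x100, and `find_pointer_refs(image, 0x16852A)` to the search for the wlc_probresp_attach address.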

1.3 Firmware Repacking

Unpacking is easy; repacking is hard. If the device has debugging interfaces, repacking is generally unnecessary, since security research mostly works in the reverse direction. But when no debugging means are available, we have to add tools to the unpacked firmware ourselves. Generally, we put cross-compiled telnetd, dropbear (sshd), and gdb into the firmware file system and replace a startup script before repacking. Linux has many startup script patterns, especially on IoT devices. The author generally takes a shortcut: once it is confirmed that a service such as /sbin/xxxd runs at startup, we can replace it:

# mv rootfs/sbin/xxxd rootfs/sbin/xxxdd
# touch rootfs/sbin/xxxd
# chmod +x rootfs/sbin/xxxd

Then add to rootfs/sbin/xxxd:

#!/bin/sh

/usr/sbin/telnetd -F -l /bin/sh -p 1234 &
/sbin/xxxdd &

Thus, when starting xxxd, telnetd will run first.
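The same replacement can be scripted so that it is repeatable across firmware versions. This is a sketch of the steps above: the service path, the "d" suffix for the renamed binary, and port 1234 come from the example, while the function name and error handling are illustrative.

```python
from pathlib import Path

WRAPPER_TEMPLATE = """#!/bin/sh

/usr/sbin/telnetd -F -l /bin/sh -p {port} &
{renamed} &
"""


def wrap_service(rootfs: str, service: str = "sbin/xxxd", port: int = 1234) -> Path:
    """Rename the service binary and put a wrapper script in its place."""
    original = Path(rootfs) / service
    renamed = original.with_name(original.name + "d")
    if not original.is_file():
        raise FileNotFoundError(str(original))
    if renamed.exists():
        raise FileExistsError(str(renamed))
    original.rename(renamed)
    # Path of the renamed binary as seen on the device, e.g. /sbin/xxxdd
    on_device = "/" + Path(service).with_name(renamed.name).as_posix()
    original.write_text(WRAPPER_TEMPLATE.format(port=port, renamed=on_device))
    original.chmod(0o755)
    return original
```

The telnetd binary itself still has to be present at /usr/sbin/telnetd in the rootfs and built for the target architecture.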

1.3.1 Cross-compilation

Packaging from the forward development perspective is certainly the most convenient option, and that means cross-compilation. Among the devices the author has studied, mainly routers, some firmware partially complies with the GPL: the vendor releases part of the source code (generally the parts based on open-source tools) and provides the remaining software in binary form, together with the tool or method for packaging the whole firmware. For example, a router the author studied previously offered such an open-source download. After downloading the zip package, we build rootfs according to our needs and finally package it with the tools provided in the zip package:

./packet -k %s -f rootfs -b compatible_r6400.txt \
        -ok kernel -oall image -or rootfs -i ambitCfg.h

1.3.2 Firmware-mod-kit

Firmware-mod-kit (fmk) may be the most commonly used unpacking and repacking tool based on binwalk. However, since it has not been updated for a long time, its usage scenarios are limited. The installation and usage of fmk are relatively simple, as follows:

# For ubuntu
$ sudo apt-get install git build-essential zlib1g-dev liblzma-dev python-magic bsdmainutils autoconf
# For redhat/centos
$ yum groupinstall "Development Tools"
$ yum install git zlib1g-dev xz-devel python-magic zlib-devel util-linux
# Usage
$ ./extract-firmware.sh firmware.bin  # unpack
$ cp new-telnetd fmk/rootfs/usr/sbin/telnetd  # modify as needed
$ ./build-firmware.sh  # repack

1.3.3 Manual Analysis

The difficulty of repacking lies in making the new image match the format of the original firmware and pass its various checks; otherwise the image may, at best, fail to flash or, at worst, brick the device. The author previously wrote an article about the Netgear UPnP vulnerability that covers the Netgear firmware packing process, which interested readers can consult.

Firmware is generally divided into many sections, and for ease of parsing each section has a header that may store flags, sizes, CRC checksums, and so on. These fields provide the basis for unpacking. For example, one can first take the firmware size in hexadecimal, encode it as bytes (generally 4 bytes), and search the firmware header for matching bytes; the size recorded in the header will be smaller than the file size by the header length. Then analyze the data the size field refers to in order to clarify the format, much as one would analyze a network protocol. Most headers follow a standard, so they can also be matched against the standard format.

It is worth noting that some manufacturers sign their firmware, which makes repacking harder. In this case, we can look for official packing tools released under the GPL, or use OpenSSL to generate a key pair and overwrite the verification public key on the device. Overwriting the key of course requires an existing vulnerability; otherwise it becomes a chicken-and-egg problem. There is also a relatively good and cost-effective alternative: firmware simulation.
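The search for size and checksum fields in a header can be sketched as follows. The header length is a guess supplied by the analyst, and CRC32 over the whole payload is only one of several checksums a vendor might use; both are assumptions, as are the function names.

```python
import struct
import zlib
from typing import Dict, List, Tuple


def find_u32(blob: bytes, value: int) -> List[Tuple[int, str]]:
    """Return (offset, endianness) for each place value appears as a 32-bit word."""
    hits = []
    for endian, fmt in (("big", ">I"), ("little", "<I")):
        needle = struct.pack(fmt, value & 0xFFFFFFFF)
        pos = blob.find(needle)
        while pos != -1:
            hits.append((pos, endian))
            pos = blob.find(needle, pos + 1)
    return sorted(hits)


def analyze_header(image: bytes, header_len: int) -> Dict[str, List[Tuple[int, str]]]:
    """Look in the assumed header for the total size, payload size, and payload CRC32."""
    header = image[:header_len]
    payload = image[header_len:]
    return {
        "total_size": find_u32(header, len(image)),
        "payload_size": find_u32(header, len(payload)),
        "payload_crc32": find_u32(header, zlib.crc32(payload)),
    }
```

Running this with a few plausible header lengths and comparing the hits across firmware versions helps separate real fields from coincidental matches. When repacking, the same fields have to be recomputed and written back at the offsets found.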

1.4 Firmware Simulation

Firmware simulation may have the following three scenarios depending on the need:

(1) Only need to simulate a certain application, such as web, upnpd, dnsmasq, etc., with the aim of debugging that application. In this case, one can directly run the program with simulation tools, only considering whether the dynamic libraries can load.
(2) Need to simulate the firmware shell, interacting with the entire system. Here, one can use chroot to change the root path and use simulation tools to execute /bin/sh. At the same time, /proc can be mounted, so that it appears more realistic when viewing processes with ps.
(3) Need to simulate the entire firmware startup, and ensure that network cards and other components can function normally. Here, one needs to use tools that can simulate the img system to directly load the whole system, or use the
