
This article is a featured article from the Kanxue Forum.
Author ID on Kanxue Forum: LunaYoung
Talosec Core Team: Wang An, Yang Xiaoya
We conducted a security assessment of an open-source hardware wallet using side-channel analysis.
The wallet's chip is built around an ARM Cortex-M4 core, and the wallet signs with the Elliptic Curve Digital Signature Algorithm (ECDSA).
With the source code available, we analyzed a range of side-channel attacks against the ECDSA implementation, identified both where side-channel protections have already been implemented and where the code may still be vulnerable, and attempted to validate our conclusions experimentally.
Finally, we give some suggestions for improving the side-channel protection. Our second report (Part 2) will carry out further attack testing on the improved scheme.
Because the forum only accepts mathematical formulas typed as plain text and this report contains a large number of them, we will also upload the analysis report as a PDF attachment.
Introduction
1. Research Background and Significance
Unlike most traditional currencies, Bitcoin is a digital currency. Since Bitcoin does not exist in any physical shape or form, it cannot technically be stored anywhere.
During a transaction, both parties need a “Bitcoin wallet” similar to an email account and a “Bitcoin address” similar to an email address. Just like sending and receiving emails, the remitter can directly pay Bitcoin to the recipient’s address using a computer or smartphone.
A Bitcoin address and a private key appear in pairs; their relationship is akin to a bank card number and a password. The Bitcoin address serves as a record of how much Bitcoin you hold at that address.
You can generate Bitcoin addresses at will to store Bitcoin. Each Bitcoin address generates a corresponding private key upon creation. This private key can prove your ownership of the Bitcoin at that address.
We can simply understand a Bitcoin address as a bank card number, with the corresponding private key as the password for that bank card number. Only if you know the bank password can you use the funds associated with that bank card number.
Therefore, the private key is particularly important for a Bitcoin wallet.
Bitcoin wallets can be roughly divided into two types based on how they store private keys: hot wallets and cold wallets.
Hot wallets require an internet connection to use, while cold wallets are used offline, making it significantly harder for external parties to access private key storage via the internet, thus greatly reducing the risk of hacking.
Among cold wallets, hardware wallets are very popular. The private key is stored inside the hardware’s microprocessor, and transactions are confirmed within the hardware wallet, ensuring that even if the computer gets infected with a virus, the keys will not be leaked, providing high security.
Compared to other offline wallets, such as paper cold wallets, hardware wallets stand out for their convenience. They can connect to a computer via USB or Bluetooth, allowing transactions to be confirmed with the click of a button.
During operation, cryptographic devices may leak side-channel information such as timing, power consumption, or electromagnetic radiation. Attacks that exploit these leaks are known as side-channel attacks.
Side-channel attack techniques are an active research direction in international cryptography. They obtain intermediate values of a cryptographic computation directly through physical channels and can recover a long key piece by piece, which often makes them more practical against real cryptographic systems than traditional cryptanalysis.
Therefore, mainstream international evaluation agencies for cryptographic products regard resistance to side-channel attacks as a primary criterion when assessing the security of devices or chips. Researchers and attackers can break cryptographic modules or security products through power analysis, electromagnetic analysis, fault attacks, medium-range electromagnetic and acoustic attacks, and cache attacks.
As a hardware device, a hardware Bitcoin wallet inevitably leaks some side-channel information during computation. For example, the private key is used during signing. If an attacker captures the power or electromagnetic leakage at that moment, there is a real chance of recovering the private key stored in the wallet.
Obtaining the private key essentially equates to completely cracking the electronic wallet, making side-channel analysis of Bitcoin wallets particularly important.
If we cannot prove that a hardware Bitcoin wallet is resistant to side-channel attacks, we have every reason to doubt the security of that wallet.
2. Current Research Status at Home and Abroad
Currently, there are various brands of hardware wallets on the market. Although some claim to have defenses against side-channel attacks, they have not disclosed details, so their security remains unknown without third-party evaluation.
On April 30 of this year, Riscure released a report on electromagnetic pulse fault injection experiments on KeepKey, which bypassed its PIN code authentication process and reset private key steps, allowing an attacker to gain access to the wallet’s private keys without entering a PIN.
We typically focus our side-channel analysis on different cryptographic algorithms. For the same algorithm, there are many common issues in its implementation across different hardware devices.
The signing algorithm in the Bitcoin wallet we studied is ECDSA, which is based on ECC. A substantial body of side-channel research on ECC has accumulated both domestically and internationally over the years, primarily involving power analysis and fault injection. Below we introduce these two classes of side-channel analysis as they apply to ECC.
In terms of power attacks, Coron pointed out in [2] that simple power analysis (SPA) can be performed on the scalar multiplication implementation in ECC.
For modular exponentiation with the Montgomery algorithm, Herbst indicated in [3] that a template attack (TA) can be mounted on this process. The attacker first runs many experiments on a fully controllable device to build templates, then records the power consumption of the target device during the operation and matches it against those templates.
When computing the scalar multiplication kP, if k is fixed and the attacker can freely choose P, the attacker can feed the device many different values of P and record the power consumption of each run. Correlation power analysis (CPA) can then be used to recover k a few bits at a time, as described in [4].
In [5], the authors proposed a comparison method that lies between SPA and CPA. Fouque et al. pointed out in [6] that for two doublings 2P and 2Q, the attacker may not be able to determine the values of P and Q from the power waveforms but can tell whether P and Q are equal by comparing them. This gives the attacker a way to recover all bits of k by comparing the waveforms of kP and k(2P).
Additionally, when the attacker can choose P and the input point has a zero coordinate (such as (x,0) or (0,y)), that coordinate remains zero no matter how P is randomized. During scalar multiplication this zero value can be exploited to obtain key information, a method referred to as refined power analysis (RPA), proposed by Goubin in [7].
The zero-value point attack (ZPA) [8] extends RPA: RPA mounts the power attack using special points with a zero coordinate, while ZPA uses points that produce a zero value in an auxiliary register during the field operations.
Besides power analysis, common side-channel analysis methods also include fault injection, where attackers can use lasers, electromagnetic pulses, or power/clock glitches to induce faults in the attacked device, allowing attackers to obtain output values of interest. Readers can refer to [9] for specific methods.
In ECC, there are primarily three fault injection methods. The first is safe-error analysis, a concept proposed by Yen and Joye in [10][11], which describe two types of safe-error attack. The C-safe-error attack induces a temporary fault in an operation and checks whether the final result is affected, which reveals whether that operation was redundant.
Another method is based on weak curves, proposed by Biehl et al. in [12]. Attackers can change the curve parameters a_6 through fault injection, obtaining weak curves with smaller orders, which allows the recovery of k values through brute force when kP is known.
The third method is differential fault attack (DFA), also proposed by Biehl et al. in [12]. Attackers perform encryption once without fault injection to obtain a correct result, then perform encryption again under fault injection to obtain an incorrect result, and compare the results to potentially gain sensitive information.
Background Knowledge
1. Elliptic Curve Cryptography
This article introduces some basic concepts of elliptic curves using a prime field as an example.
Let p>3 be a prime number and a,b∈F_p satisfy 4a^3+27b^2≠0. The elliptic curve defined by a and b over F_p is the set of all solutions (x,y) to the equation y^2=x^3+ax+b with x,y∈F_p, together with the point at infinity (denoted O).
For every point P on the curve, P+O=P. Let P_1=(x_1,y_1)≠O and P_2=(x_2,y_2)≠O be two points on the elliptic curve with P_1≠-P_2. Then P_1+P_2=P_3=(x_3,y_3), where in affine coordinates point addition and doubling are given by:
x_3=λ^2-x_1-x_2
y_3=λ(x_1-x_3)-y_1
Here, when P_1≠P_2, λ=(y_2-y_1)/(x_2-x_1); when P_1=P_2, λ=(3x_1^2+a)/(2y_1).
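The affine formulas above can be tried out on a toy curve. The following is a minimal sketch, assuming the small textbook curve y^2=x^3+2x+2 over F_17 (these parameters and all function names are illustrative assumptions, not taken from the wallet's source code):

```python
# Toy parameters (assumption): y^2 = x^3 + 2x + 2 over F_17.
FIELD_P = 17
CURVE_A = 2
CURVE_B = 2
INF = None  # stands for the point at infinity O


def on_curve(pt):
    """Check that pt satisfies the curve equation (O counts as on the curve)."""
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x * x * x + CURVE_A * x + CURVE_B)) % FIELD_P == 0


def point_add(p1, p2):
    """Return p1 + p2 using the affine addition and doubling formulas."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return INF  # P_1 = -P_2, so the sum is O
    if p1 == p2:
        # Doubling: lambda = (3*x_1^2 + a) / (2*y_1)
        lam = (3 * x1 * x1 + CURVE_A) * pow((2 * y1) % FIELD_P, -1, FIELD_P)
    else:
        # Addition: lambda = (y_2 - y_1) / (x_2 - x_1)
        lam = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P)
    lam %= FIELD_P
    x3 = (lam * lam - x1 - x2) % FIELD_P
    y3 = (lam * (x1 - x3) - y1) % FIELD_P
    return (x3, y3)
```

Division in F_p is carried out as multiplication by a modular inverse, here via Python's three-argument `pow` with exponent -1 (Python 3.8 or later).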
Q=kP is the basic operation of ECC (P and Q are points on the elliptic curve, k is an integer), referred to as point multiplication or scalar multiplication. Here, k is the private key; Q is the public key; P is a base point on the elliptic curve.
Knowing k and P makes it easy to compute Q; however, knowing P and Q makes it difficult to determine k. The security of ECC relies on this principle.
2. ECDSA
The parameter set D=(q,FR,S,a,b,P,n,h) for ECDSA must satisfy certain conditions.
The key pair is generated from this parameter set: a random integer d is selected from [1,n-1], and Q=dP is computed. Here, d is the private key, and Q is the public key.
The ECDSA signing algorithm is as follows:
Algorithm 1: ECDSA Signing
Input: Parameter set D=(q, FR, S, a, b, P, n, h), private key d, message m
Output: Signature (r,s)
1: Select k∈_R [1,n-1]
2: Compute kP=(x_1,y_1), and convert x_1 to an integer \bar{x}_1
3: Compute r=\bar{x}_1 mod n. If r=0, jump to step 1
4: Compute e=H(m)
5: Compute s=k^(-1) (e+dr)mod n. If s=0, jump to step 1
6: Return (r,s)
The ECDSA verification steps are as follows:
Algorithm 2: ECDSA Verification
Input: Parameter set D=(q, FR, S, a, b, P, n, h), public key Q, message m, signature (r,s)
Output: Determine if the signature is valid
1: Verify if r and s are integers in the interval [1,n-1]. If any verification fails, return ("Reject this signature")
2: Compute e=H(m)
3: Compute w=s^(-1) mod n
4: Compute u_1=ew mod n and u_2=rw mod n
5: Compute X=u_1 P+u_2 Q
6: If X=∞, return ("Reject this signature")
7: Convert the x-coordinate x_1 of X to an integer \bar{x}_1; compute v=\bar{x}_1 mod n
8: If v=r, return ("Accept this signature"); otherwise, return ("Reject this signature")
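Algorithms 1 and 2 can be exercised end to end on a toy curve. The sketch below assumes the textbook curve y^2=x^3+2x+2 over F_17 with base point P=(5,1) of prime order n=19, and reduces a SHA-256 digest modulo n in place of H(m). These parameters, the hash handling, and all names are assumptions chosen for illustration; they are far too small for real use and are not the wallet's implementation.

```python
import hashlib
import secrets

# Toy domain parameters (assumption): y^2 = x^3 + 2x + 2 over F_17.
FIELD_P, CURVE_A = 17, 2
BASE_P = (5, 1)   # base point P
ORDER_N = 19      # prime order n of P
INF = None        # point at infinity


def add(p1, p2):
    """Affine point addition and doubling."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % FIELD_P == 0:
        return INF
    if p1 == p2:
        lam = (3 * x1 * x1 + CURVE_A) * pow((2 * y1) % FIELD_P, -1, FIELD_P)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % FIELD_P, -1, FIELD_P)
    lam %= FIELD_P
    x3 = (lam * lam - x1 - x2) % FIELD_P
    return (x3, (lam * (x1 - x3) - y1) % FIELD_P)


def scalar_mul(k, pt):
    """Plain double-and-add; not side-channel protected."""
    result, addend = INF, pt
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def H(m):
    """Toy hash: SHA-256 digest reduced modulo n (a simplification)."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % ORDER_N


def sign(d, m):
    while True:
        k = secrets.randbelow(ORDER_N - 1) + 1           # step 1
        x1, _ = scalar_mul(k, BASE_P)                    # step 2
        r = x1 % ORDER_N                                 # step 3
        if r == 0:
            continue
        e = H(m)                                         # step 4
        s = pow(k, -1, ORDER_N) * (e + d * r) % ORDER_N  # step 5
        if s == 0:
            continue
        return r, s                                      # step 6


def verify(Q, m, sig):
    r, s = sig
    if not (1 <= r < ORDER_N and 1 <= s < ORDER_N):      # step 1
        return False
    e = H(m)                                             # step 2
    w = pow(s, -1, ORDER_N)                              # step 3
    u1, u2 = e * w % ORDER_N, r * w % ORDER_N            # step 4
    X = add(scalar_mul(u1, BASE_P), scalar_mul(u2, Q))   # step 5
    if X is INF:                                         # step 6
        return False
    return X[0] % ORDER_N == r                           # steps 7-8
```

A possible usage: with `d = 7` and `Q = scalar_mul(d, BASE_P)`, `verify(Q, b"hello", sign(d, b"hello"))` should return `True`. Because n is only 19, a signature can also validate for an unrelated message by chance, which is one reason these parameters are for illustration only.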
3. Scalar Multiplication Algorithm
In ECC, the scalar multiplication kP is the most time-consuming operation. The computation generally converts k into binary form and performs operations primarily involving point addition and doubling, as follows.
Algorithm 3: Point Addition and Doubling to Implement Scalar Multiplication
Input: Point P, a positive integer k=(1,k_(n-2),…,k_0)_2
Output: [k]P
1: R[0]←P
2: For i=n-2 downto 0
3: R[0]←2R[0]
4: If k_i=1 then
5: R[0]←R[0]+P
6: Return R[0]
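The loop of Algorithm 3 can be sketched as follows. To keep the example short, integers stand in for curve points (doubling is 2*R and addition is R+P); that substitution and the function name are assumptions made for illustration. The function also records which operations were executed, since that sequence is what a power trace may expose.

```python
def double_and_add(k, P):
    """Left-to-right binary scalar multiplication (Algorithm 3).

    Integers stand in for curve points (an assumption for brevity).
    Returns the result and the list of operations performed:
    'D' for a doubling, 'A' for an addition.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    bits = bin(k)[2:]          # most significant bit first; bits[0] == '1'
    R = P                      # step 1: R[0] <- P
    ops = []
    for bit in bits[1:]:       # step 2: i = n-2 downto 0
        R = 2 * R              # step 3: doubling
        ops.append("D")
        if bit == "1":         # step 4
            R = R + P          # step 5: addition
            ops.append("A")
    return R, ops              # step 6
```

For k=13=(1101)_2 and P=5 the function returns 65 together with the operation sequence D, A, D, D, A: an addition follows a doubling exactly where the key bit is 1.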
4. Width-w NAF Algorithm
Width-w NAF is an extension of NAF that uses pre-computed points. It represents an n-bit integer d as d=∑_(i=0)^(n) d_w[i]2^i, where every non-zero d_w[i] is an odd integer satisfying |d_w[i]|<2^(w-1), and among any w consecutive digits at most one is non-zero.
Width-w NAF was independently proposed by different authors in [13][14][15]. The generation algorithm proposed by Solinas in [16] is very simple, and we will briefly describe it.
Algorithm 4: Traditional Width-w NAF Algorithm
Input: Window width w, a positive integer d.
Output: NAF_w(d).
1: For i=0 to n
2: If d mod 2=1 then
3: d_w[i]←d mods 2^w and d←d-d_w[i]
4: Else
5: d_w[i]←0
6: d←d/2
7: Return d_w[n],d_w[n-1],…,d_w[0]
In step 3, “mods 2^w” denotes the signed residue of the odd integer d modulo 2^w, taken from {-2^(w-1)+1,…,-3,-1,1,3,…,2^(w-1)-1}. We must therefore pre-compute the 2^(w-2) points P, 3P,…,(2^(w-1)-1)P so that every non-zero digit can be served from the table.
If d is odd in step 2, then d_w[i] is odd and |d_w[i]|<2^(w-1). After step 3, d is always divisible by 2^w.
Thus, whenever d_w[i]≠0, the next w-1 digits are all zero, i.e., d_w[i+1]=d_w[i+2]=⋯=d_w[i+w-1]=0.
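The digit generation described above can be sketched in a few lines. The function names are illustrative assumptions, and the loop stops once d reaches zero instead of running for a fixed n+1 iterations, so leading zero digits are omitted.

```python
def mods(d, w):
    """Signed residue of d modulo 2^w, in the range [-2^(w-1), 2^(w-1))."""
    r = d % (1 << w)
    if r >= 1 << (w - 1):
        r -= 1 << w
    return r


def wnaf(d, w):
    """Return the width-w NAF digits of d, least significant digit first."""
    if d < 0 or w < 2:
        raise ValueError("expects d >= 0 and w >= 2")
    digits = []
    while d > 0:
        if d & 1:
            digit = mods(d, w)   # odd digit with |digit| < 2^(w-1)
            d -= digit           # d is now divisible by 2^w
        else:
            digit = 0
        digits.append(digit)
        d >>= 1
    return digits
```

For example, `wnaf(13, 3)` yields the digits -3, 0, 0, 0, 1 (least significant first), since 13 = -3 + 1·2^4. The non-zero digit -3 is followed by at least w-1=2 zeros, as the argument above predicts.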
5. Power Analysis Attacks and Hamming Weight Model
The Hamming weight refers to the number of non-zero symbols in a string of symbols. In a binary representation of a string of symbols, it is the count of 1s.
In chips, all operations are ultimately implemented by the logical state changes of semiconductors between 0 and 1. When the chip operates, the changes in logical states will consume energy at the logic gates while generating electromagnetic radiation.
According to the current semiconductor process technology (mainly referring to CMOS technology), cryptographic chips exhibit different energy consumption when processing logical states 0 and 1, generating electromagnetic radiation of varying intensities. Analysts can obtain some side-channel information related to logical 0 and logical 1 by detecting differences in energy consumption or electromagnetic signals.
Since energy consumption and electromagnetic signals usually reflect the same information, we will focus on energy consumption below.
While an algorithm executes, it produces operands (intermediate values) that are of interest to the attacker. Different operand values generally cause different power consumption in the chip's internal logic gates. Several models describe this difference: the Hamming weight model, the Hamming distance model, the zero-value model, and others.
Operands are generally stored in binary form within the chip. For software implementations, the register holding an operand typically switches from all 0s or all 1s to the binary representation of that operand, so the resulting power consumption usually conforms to the Hamming weight model.
That is, we assume that the power consumption t at a given moment of the algorithm is linear in the Hamming weight of the intermediate value y: t=aHW(y)+b
If a is positive, a higher Hamming weight of the operand means higher power consumption; if a is negative, the opposite holds.
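The linear model can be written down directly. The sketch below is only an illustration of t=aHW(y)+b; the coefficient values, the optional Gaussian noise term, and the function names are assumptions and are not measured from the tested chip.

```python
import random


def hamming_weight(y):
    """Number of 1 bits in the binary representation of y."""
    return bin(y).count("1")


def simulated_power(y, a=1.0, b=0.0, noise_sigma=0.0, rng=None):
    """Model t = a*HW(y) + b, optionally with additive Gaussian noise."""
    noise = 0.0
    if noise_sigma > 0.0:
        noise = (rng or random).gauss(0.0, noise_sigma)
    return a * hamming_weight(y) + b + noise
```

With a=2 and b=1, the operand 0xF0 (Hamming weight 4) gives t=9, while 0xFF (Hamming weight 8) gives t=17: a positive a means the heavier operand consumes more.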
Platform Setup and Preliminary Observations
1. Platform Setup
We used a LeCroy WaveRunner 104Xi-A oscilloscope, a self-developed power-signal acquisition probe, a Mini-Circuits 1.9MHz low-pass filter, a test development board, a USB-to-UART board, and a computer to set up the experimental platform, as shown in Figure 1.

2. Preliminary Observations
We used a serial debugging assistant to continuously send data to be signed to the chip, with a cycle of 3 seconds, meaning that an ECDSA signing operation is executed every 3 seconds.
During this process, we observed changes in the waveforms on the oscilloscope. It was found that approximately 1.2 seconds of the waveform in each 3-second cycle was quite distinct, significantly different from the energy consumption waveform collected during the chip’s idle state with no instructions sent.
We triggered on the serial transmit signal and captured 5 seconds of waveform at a sampling rate of 200KSa/s, obtaining the waveform shown in Figure 2.





Simple Power Analysis of the ECDSA Chip
1. Introduction to Simple Power Analysis
At CRYPTO 1999, Kocher et al. proposed simple power analysis (SPA) [4], which rests on the assumption that different operations generally produce different power consumption waveforms. With current chip manufacturing processes, we believe this assumption holds.
2. Theoretical Analysis
2.1. Scalar Multiplication Implemented with the Traditional Binary Algorithm
Input: Point P, 256-bit integer k=k_255…k_2 k_1 k_0
Output: kP
Steps:
1: Q←∞
2: For j=255 downto 0
3: Q=2Q
4: If k_j=1 then
5: Q=Q+P
6: Return Q
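The reason this loop is exposed to simple power analysis can be shown with a small sketch. It assumes an idealized attacker who can label every operation in a single trace as a doubling ('D') or an addition ('A'), and a loop that performs one doubling per key bit followed by an addition only when the bit is 1. Both function names are hypothetical.

```python
def op_sequence(k, nbits):
    """Operations executed by the binary algorithm, one doubling per bit."""
    ops = []
    for j in range(nbits - 1, -1, -1):
        ops.append("D")              # Q = 2Q
        if (k >> j) & 1:
            ops.append("A")          # Q = Q + P, only when k_j = 1
    return "".join(ops)


def recover_scalar_from_ops(ops):
    """Rebuild k from a 'D'/'A' sequence: 'DA' encodes a 1 bit, 'D' a 0 bit."""
    k = 0
    i = 0
    while i < len(ops):
        if ops[i] != "D":
            raise ValueError("every key bit must start with a doubling")
        bit = 0
        if i + 1 < len(ops) and ops[i + 1] == "A":
            bit = 1
            i += 1
        k = (k << 1) | bit
        i += 1
    return k
```

For k=13=(1101)_2 processed as a 4-bit value, the sequence is DADADDA, and reading it back gives 13 again. In practice the labelling step is the hard part and depends on how distinguishable the two operations are in the measured waveform.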
2.2. Scalar Multiplication Implemented with Width-w NAF Algorithm
Input: k, P, w
Output: kP
Pre-computation:
1: Convert k to width-w NAF form, NAF_w(k)=∑_(i=0)^(l-1) k_i 2^i.
2: Pre-compute the odd multiples P, 3P,…,(2^(w-1)-1)P.
Main computation:
1: Q←∞
2: For i=l-1 downto 0
3: Q←2Q
4: If k_i≠0 then
5: Q←Q+k_i P (taken from the pre-computed table, negated when k_i<0)
6: Return Q
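The pre-computation and main loop can be sketched together as follows. As a simplification, integers stand in for curve points (0 plays the role of the point at infinity), and the helper names are assumptions made for this example. On a real curve the subtraction would be an addition of the negated table entry, which is cheap because -(x,y)=(x,-y).

```python
def wnaf_digits(k, w):
    """Width-w NAF digits of k, least significant first."""
    digits = []
    while k > 0:
        if k & 1:
            digit = k % (1 << w)
            if digit >= 1 << (w - 1):
                digit -= 1 << w      # signed residue "mods 2^w"
            k -= digit
        else:
            digit = 0
        digits.append(digit)
        k >>= 1
    return digits


def wnaf_scalar_mul(k, P, w):
    """Compute kP from the width-w NAF of k and a table of odd multiples."""
    # Pre-computation: P, 3P, ..., (2^(w-1)-1)P, i.e. 2^(w-2) entries.
    table = {j: j * P for j in range(1, 1 << (w - 1), 2)}
    Q = 0                                # stands in for the point at infinity
    for digit in reversed(wnaf_digits(k, w)):
        Q = 2 * Q                        # one doubling per digit
        if digit > 0:
            Q = Q + table[digit]
        elif digit < 0:
            Q = Q - table[-digit]        # add the negated table entry
    return Q
```

Compared with the binary algorithm, additions occur far less often, but a doubling-only step still reveals that the corresponding digit is zero, which is the leakage the improved algorithm in the next subsection aims to remove.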
2.3. Improved Width-w NAF Algorithm Design to Resist Simple Power Analysis
Input: w,d
Output: d_w [n],d_w [n-1],…,d_w [0].
1: For i = 0 to ⌈n/w⌉
2: u[i] ←d mod 2^w
3: If u[i]= 0…0 then
4: u[i]←\bar{1}…\bar{1} and d←d+2^w.
5: d←d-u[i], d←d/2
6: d_w [iw]=u[i],d_w [iw+1]=…=d_w [iw+w-1]=0.
7: Return d_w [n],d_w [n-1],…,d_w [0]
2.4. Implementation of Improved Width-w NAF Algorithm to Resist Simple Power Analysis
3. Experimental Analysis
From this analysis we conclude that the scalar multiplication finishes about 0.6 seconds into the trace. We therefore captured 1 second of waveform at a sampling rate of 10MSa/s, obtaining the waveform shown in Figure 7.




Differential Power Analysis of the ECDSA Chip
1. Introduction to Correlation Power Analysis
Correlation power analysis (CPA) was proposed by Brier et al. at CHES 2004 [18]. Generally speaking, the same operation applied to different operands consumes different amounts of power.
When the algorithm is known, if an attacker can make the target device run the operation many times, and the operand combined with the key is different and known each time, the attacker can apply statistical analysis to the collected waveforms to recover the key.
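The statistical step can be illustrated with a simulated 8-bit example. The sketch assumes that each trace is a single sample equal to the Hamming weight of the low byte of (known input × secret) plus Gaussian noise, and it ranks key guesses by the Pearson correlation between predicted and measured values. The leakage model, the noise level, the byte-wide secret, and all names are assumptions chosen for illustration; they do not describe the tested chip.

```python
import random


def hw(x):
    """Hamming weight of x."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def simulate_traces(secret, inputs, sigma, rng):
    """One noisy leakage sample per known input (hypothetical model)."""
    return [hw((r * secret) & 0xFF) + rng.gauss(0.0, sigma) for r in inputs]


def cpa_recover_byte(inputs, traces):
    """Return the key guess whose predictions correlate best with the traces."""
    best_guess, best_corr = None, -2.0
    for guess in range(1, 256):
        predictions = [hw((r * guess) & 0xFF) for r in inputs]
        corr = pearson(predictions, traces)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess, best_corr
```

A possible run: draw 1000 random inputs, simulate traces for the secret byte 0xB7 with noise sigma 0.5, and call `cpa_recover_byte`. A real attack repeats this ranking for every sample point of every trace and for each key fragment in turn.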
1.1. Waveform Collection for Correlation Power Analysis
1.2. Intermediate Value Leakage Analysis Under Known Key Conditions
1.3. Key Recovery Using Correlation Power Analysis Under Unknown Key Conditions
2. Theoretical Analysis of the Tested Chip
In this section, we analyze the code for the entire computation process of ECDSA, identifying steps that may be vulnerable to CPA analysis.
2.1. CPA Analysis of Scalar Multiplication Steps
2.2. CPA Analysis of the kP Calculation Step for Signature Value


3. Experimental Analysis
Based on the analysis, we can locate the part of the overall waveform that corresponds to the d×r computation. We collected 20,000 waveforms of that range at a sampling rate of 10MSa/s; Figure 13 shows one of them.






Other Side-Channel Analysis Methods for the ECDSA Chip
Improvement Suggestions for the ECDSA Chip
s=k^(-1)(e+dr) mod n
=(k^(-1)e+k^(-1)(dr)) mod n
=(k^(-1)e+(k^(-1)d)r) mod n
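One reading of this identity is that the private key d can be multiplied by the secret random value k^(-1) first, so that d is never multiplied directly by the publicly known r. The sketch below only checks that the two evaluation orders give the same s. It assumes the group order n of secp256k1, the curve used by Bitcoin (this section does not name the curve), and the function names are hypothetical.

```python
import secrets

# Order n of the secp256k1 base point (assumption: curve not named here).
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def s_direct(k, e, d, r, n=N):
    """s = k^(-1) * (e + d*r) mod n; d meets the known value r directly."""
    return pow(k, -1, n) * (e + d * r) % n


def s_reordered(k, e, d, r, n=N):
    """s = k^(-1)*e + (k^(-1)*d)*r mod n; d is first combined with k^(-1)."""
    k_inv = pow(k, -1, n)
    masked_d = k_inv * d % n      # operand unknown to the attacker
    return (k_inv * e + masked_d * r) % n
```

For any k, d in [1,n-1] and any e, r, both functions return the same value, for instance when the inputs are drawn with `secrets.randbelow(N - 1) + 1`. Whether the reordered form actually resists CPA on a given chip still has to be confirmed by measurement.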
Summary
– End –

Kanxue ID:LunaYoung
https://bbs.pediy.com/user-253538.htm
* This article is an original work by LunaYoung of the Kanxue Forum; please credit the Kanxue Community when reposting.
The Talosec project practices the open-source spirit, aiming to enhance the security infrastructure of the blockchain world, and is also a project closely linked with the Kanxue Forum. Thanks to the participation and testing of core team members within the forum, the core design and development work of the Talosec wallet has reached a milestone, showcasing the core technical capabilities of the forum.
As security professionals, we have always been exploring the boundaries of security, which will be a mission-driven endeavor—much like the original intention behind the establishment of the Kanxue Forum. Therefore, we hope that more security personnel can participate in this project.
Official website: https://talosec.io/