High-Speed Programming with Jlink: A Guide for Embedded Engineers

Once upon a time…

We were still basking in the comfort of pirated J-Links, unaware of how suddenly the crackdown would arrive.

Then one day SEGGER stepped up its enforcement against pirated J-Links, and Taobao began removing J-Link listings.

With the pirated J-Links gone, I opened SEGGER’s website for the first time.

Seeing the price of a genuine J-Link, I finally understood how painful it is to be “choked” on a key technology.

The genuine product is priced so high that ordinary engineers can only look up at it. So the sellers adapted: you keep your name, we change ours! Call it an “ARM emulator” and keep selling!

However, the quality of the “ARM emulators” commonly seen on the market is hard to vouch for. At least seven or eight of them have died in my hands!

Moreover, SEGGER has built detection mechanisms into J-Link firmware and software updates to keep pirated devices from using new features. Currently, the firmware of pirated J-Link emulators only supports versions up to 4.40. If a driver newer than 4.40 is installed and the firmware is upgraded, the J-Link becomes unusable. Many users have surely suffered from the errors that pirated J-Links produce:

  1. Piracy detection: “The connected probe appears to be a J-Link clone”

When a non-genuine J-Link is used with a recent MDK version and a J-Link driver that is too new, the clone is detected and this error is reported.

  2. Connection failure: “The connected J-Link is defective”

The cause is that the J-Link driver bundled with Keil 5 is the latest version, while the firmware on the J-Link in hand is not.

As a hardworking embedded coder, I am already exhausted from fixing bugs every day, and on top of that I have to deal with tools that keep “breaking down.” It is truly unbearable!

If I were to build a dedicated programmer myself, ARM Mbed’s open-source DAPLink, which has been around for many years, would be the best starting point. After all, as the saying goes, standing on the shoulders of giants lets you see further and walk more steadily!

Introduction to CMSIS-DAP

In the early days of STM32 debugging, we generally used ST’s official ST-Link for programming and debugging. Later, ARM launched a new project called CMSIS-DAP, an open-source debug probe project that supports Cortex-A/R/M devices. It connects over USB HID, so it works with the computer without installing a driver.

The name has two parts. CMSIS stands for Common Microcontroller Software Interface Standard (originally Cortex Microcontroller Software Interface Standard), ARM’s software interface standard for microcontrollers. DAP stands for Debug Access Port, the port on the chip through which a debugger reaches the core and memory. CMSIS-DAP can therefore be understood as a debug interface developed by ARM that connects to Cortex-A/R/M devices over JTAG or SWD.

Porting DAPLink can be divided into two steps. The first step is to port the online download and debug part, CMSIS-DAP. ARM’s latest code can be found under the MDK installation path at ...\AppData\Local\Arm\Packs\ARM\CMSIS\5.9.0\CMSIS\DAP\Firmware\Source; the current latest version is V2.1.1. Porting DAP really just means hooking the CMSIS-DAP protocol up to a USB protocol stack. The files are as follows:

[Image: the CMSIS-DAP firmware source files]

Improve download speed

DAPLink, as ARM’s open-source debugging and programming tool, offers broad support and ease of use. However, CMSIS-DAP protocol transfers are inefficient, USB HID transfers are slow, data blocks are small, the SWD/JTAG clock is low, and neither the hardware nor the firmware is specially optimized for speed. So although DAPLink has good universality and compatibility, its performance is inferior to that of the professional J-Link.

Since we are going to do it, we must do it best! To make up for DAPLink’s performance shortcomings, on the hardware side we adopted the high-performance HPM5301 chip from HPMicro (Xianji Semiconductor), which runs at up to 480 MHz and has a built-in high-speed USB PHY. On the software side, we replaced the USB stack with the faster CherryUSB protocol stack and deeply optimized the data processing and communication code in the DAPLink firmware, reducing internal delays and waiting times and raising the SWD clock to 10 MHz.

Special thanks to sakumisu, the author of the CherryUSB protocol stack, and to RCSN from Xianji Semiconductor for their strong technical support. The CherryDAP source code is available at: https://github.com/cherry-embedded/CherryDAP

The optimized clock rates are as follows:

[Image: optimized clock rates]

For a comparison against the latest J-Link V12 on the market, the target chip was an STM32H743 and the development environment was MDK V5.39. MicroLink and J-Link V12 each downloaded the same 2558 KB HEX file to the internal flash. A logic analyzer on the clock pin measured the time for the whole erase, program, and verify process: my homemade downloader took 24.205 seconds, while J-Link V12 took 33.439 seconds. The test data is shown below:

J-Link V12 test results:

[Image: logic analyzer capture of the J-Link V12 download]

MicroLink test results:

[Image: logic analyzer capture of the MicroLink download]

Comparison of test results:

Debugger      Total time (erase, program, verify)
MicroLink     24.205 seconds
J-Link V12    33.439 seconds

The comparison shows that, in this test, the download speed exceeded that of the latest J-Link V12.
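
The gap between the two measurements can be put into numbers. The short Python sketch below uses only the two times reported above; the function and constant names are illustrative and not part of any tool.

```python
# Times reported in the comparison above (erase + program + verify), in seconds.
MICROLINK_S = 24.205
JLINK_V12_S = 33.439


def time_saved_percent(fast_s: float, slow_s: float) -> float:
    """Percentage of the slower tool's time that the faster tool saves."""
    return (slow_s - fast_s) / slow_s * 100.0


def speedup(fast_s: float, slow_s: float) -> float:
    """How many times faster the quicker tool finishes the same job."""
    return slow_s / fast_s


if __name__ == "__main__":
    print(f"time saved: {time_saved_percent(MICROLINK_S, JLINK_V12_S):.1f}%")
    print(f"speedup:    {speedup(MICROLINK_S, JLINK_V12_S):.2f}x")
```

For this single measurement on one target, that works out to roughly 27.6% less time, or a speedup of about 1.38x. Other chips, file sizes, and clock settings may give different ratios.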

Optimize virtual serial port

DAPLink also provides a virtual serial port, although this function is largely independent of the debug function. The virtual serial port uses the CDC (Communications Device Class) of the USB protocol stack, one of the device classes defined by USB, which lets communication devices transfer data over the USB interface.

With the download speed improved, the USB-to-serial speed naturally cannot fall behind. A typical USB-to-serial adapter supports up to 2 Mbaud, which is already impressive. But since I am using the strongest domestic MCU, from Xianji, I pushed the serial port to its limit: a maximum baud rate of 10 Mbaud with no packet loss.

The waveform captured with a logic analyzer shows that each bit takes 1 / 10 MHz = 100 ns.

[Image: logic analyzer waveform capture]
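
The bit-time arithmetic above generalizes to any baud rate. The following Python sketch also estimates the payload ceiling, assuming the common 8N1 frame (1 start bit, 8 data bits, no parity, 1 stop bit); the article does not state the frame format, so that part is an assumption, and the function names are made up for illustration.

```python
def bit_time_ns(baud: int) -> float:
    """Duration of one bit on the wire, in nanoseconds."""
    return 1e9 / baud


def max_bytes_per_second(baud: int, data_bits: int = 8,
                         parity_bits: int = 0, stop_bits: int = 1) -> float:
    """Upper bound on payload bytes per second for an asynchronous UART.

    Each frame carries one start bit, the data bits, an optional parity
    bit, and the stop bits, so only part of the line rate is payload.
    """
    frame_bits = 1 + data_bits + parity_bits + stop_bits
    return baud / frame_bits


if __name__ == "__main__":
    for baud in (2_000_000, 10_000_000):
        print(baud, bit_time_ns(baud), max_bytes_per_second(baud))
```

At 10 Mbaud the bit time is 100 ns, matching the logic analyzer reading, and under the 8N1 assumption the line can carry at most 1,000,000 payload bytes per second.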

Enhance drag and drop download

The above covers the basic online download and debug function ported from DAPLink. The second step is to port DAPLink’s offline download function. The offline programmer relies mainly on the DAP connection protocol in CMSIS-DAP, which ARM has already written for us. A DAP offline download consists of these stages: initialize DAP -> connect DAP to the chip -> confirm the connection method -> clear the target’s read protection (can be skipped) -> erase the target’s flash -> program the firmware -> reset and run. The connection method is either the SWD interface or the JTAG interface, and the official code we need to port is DAPLink/source/daplink/interface/swd_host.c.

SWD (Serial Wire Debug) is a two-wire debug protocol designed by ARM for Cortex-M series processors, aimed at providing efficient debugging with a low pin count. It is an alternative to JTAG (Joint Test Action Group) that simplifies hardware design while retaining efficient debugging capabilities.

To download offline, connecting to the chip is not enough: the flash programming algorithm for the target chip must also be available. In theory, we need to provide an algorithm for every chip we want to support. Fortunately, the MDK installation path already contains such algorithm files. For example, the download algorithms for the STM32F1 series are located at ...\Arm\Packs\Keil\STM32F1xx_DFP\2.0.0\Flash, where the STM32F10x folder contains the algorithm source code and the FLM files are the compiled download algorithms.
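
The download stages listed above can be sketched as an ordered pipeline that stops at the first required stage that fails. This is only a structural illustration in Python: the step names mirror the stages in the text, the actions are placeholders that always succeed, and none of it is DAPLink or MicroLink code.

```python
from typing import Callable, List, Tuple

# (stage name, action returning True on success, whether the stage is optional)
Step = Tuple[str, Callable[[], bool], bool]


def run_offline_download(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run the stages in order; abort when a required stage fails."""
    log: List[str] = []
    for name, action, optional in steps:
        ok = action()
        log.append(f"{name}: {'ok' if ok else 'failed'}")
        if not ok and not optional:
            return False, log
    return True, log


def demo_steps() -> List[Step]:
    """Placeholder stages; real actions would talk to the target over SWD/JTAG."""
    def ok() -> bool:
        return True

    return [
        ("initialize DAP", ok, False),
        ("connect DAP to the chip", ok, False),
        ("confirm connection method", ok, False),
        ("clear read protection", ok, True),  # the text notes this can be skipped
        ("erase flash", ok, False),
        ("program firmware", ok, False),
        ("reset and run", ok, False),
    ]


if __name__ == "__main__":
    success, log = run_offline_download(demo_steps())
    print(success)
    print("\n".join(log))
```

Marking only the read-protection stage as optional follows the "can be skipped" remark in the text; a failure anywhere else ends the download early so that a half-programmed target is never reset and run.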

An FLM file is essentially an ELF file whose layout is fixed by Keil. It must contain the flash programming functions Init, UnInit, EraseSector, and ProgramPage. Depending on the device, the optional functions EraseChip, BlankCheck, and Verify can also be implemented.

Let’s analyze the existing FLM file, taking STM32F4xx_1024.FLM as an example.

Open the command line tool and enter arm-none-eabi-readelf -a STM32F4xx_1024.FLM:

$ arm-none-eabi-readelf -a STM32F4xx_1024.FLM
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          12172 (bytes into file)
  Start of section headers:          12236 (bytes into file)
  Flags:                             0x5000000, Version5 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         2
  Size of section headers:           40 (bytes)
  Number of section headers:         16
  Section header string table index: 15

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] PrgCode           PROGBITS        00000000 000034 000144 00  AX  0   0  4
  [ 2] PrgData           PROGBITS        00000144 000178 000004 00  WA  0   0  4
  [ 3] DevDscr           PROGBITS        00000148 00017c 0010a0 00   A  0   0  4
  [ 4] .debug_abbrev     PROGBITS        00000000 00121c 0005a4 00      0   0  1
  [ 5] .debug_frame      PROGBITS        00000000 0017c0 000104 00      0   0  1
  [ 6] .debug_info       PROGBITS        00000000 0018c4 00064c 00      0   0  1
  [ 7] .debug_line       PROGBITS        00000000 001f10 000218 00      0   0  1
  [ 8] .debug_loc        PROGBITS        00000000 002128 0001b8 00      0   0  1
  [ 9] .debug_macinfo    PROGBITS        00000000 0022e0 000614 00      0   0  1
  [10] .debug_pubnames   PROGBITS        00000000 0028f4 000096 00      0   0  1
  [11] .symtab           SYMTAB          00000000 00298c 000110 10     12   9  4
  [12] .strtab           STRTAB          00000000 002a9c 000100 00      0   0  1
  [13] .note             NOTE            00000000 002b9c 00001c 00      0   0  4
  [14] .comment          PROGBITS        00000000 002bb8 000334 00      0   0  1
  [15] .shstrtab         STRTAB          00000000 002eec 0000a0 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  y (purecode), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000034 0x00000000 0x00000000 0x00148 0x00148 RWE 0x4
  LOAD           0x00017c 0x00000148 0x00000148 0x010a0 0x010a0 R   0x4

 Section to Segment mapping:
  Segment Sections...
   00     PrgCode PrgData
   01     DevDscr

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

Symbol table '.symtab' contains 17 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $t
     2: 00000122     0 NOTYPE  LOCAL  DEFAULT    1 $d
     3: 00000144     0 NOTYPE  LOCAL  DEFAULT    2 $d.realdata
     4: 00000148     0 NOTYPE  LOCAL  DEFAULT    3 $d.realdata
     5: 00000000     0 FILE    LOCAL  DEFAULT  ABS FlashPrg.c
     6: 00000000     0 SECTION LOCAL  DEFAULT    1 .text
     7: 00000000     0 FILE    LOCAL  DEFAULT  ABS FlashDev.c
     8: 00000148  4256 SECTION LOCAL  DEFAULT    3 .constdata
     9: 00000000     0 NOTYPE  GLOBAL HIDDEN   ABS BuildAttributes$$THM_ISAv
    10: 00000001    28 FUNC    GLOBAL HIDDEN     1 GetSecNum
    11: 0000001d    46 FUNC    GLOBAL HIDDEN     1 Init
    12: 0000004b    14 FUNC    GLOBAL HIDDEN     1 UnInit
    13: 00000059    44 FUNC    GLOBAL HIDDEN     1 EraseChip
    14: 00000085    76 FUNC    GLOBAL HIDDEN     1 EraseSector
    15: 000000d1    82 FUNC    GLOBAL HIDDEN     1 ProgramPage
    16: 00000148  4256 OBJECT  GLOBAL HIDDEN     3 FlashDevice

No version information found in this file.

Displaying notes found at file offset 0x00002b9c with length 0x0000001c:
  Owner                 Data size       Description
  ARM                  0x0000000c       Unknown note type: (0x40000000)

From the symbol table we can find the addresses of the Init, UnInit, EraseSector, and ProgramPage functions. The odd values, such as 0x1d for Init, have bit 0 set to mark Thumb code.

One important step remains: where does the file to be programmed come from?
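
The same lookup can be done programmatically. The Python sketch below reads the function symbols straight from a 32-bit little-endian ELF image, which is the layout the readelf header above reports. The function name read_functions is made up for this example, and this is not necessarily how DAPLink or MicroLink extracts the algorithm.

```python
import struct

SHT_SYMTAB = 2   # section type of a symbol table
STT_FUNC = 2     # symbol type of a function


def read_functions(elf: bytes) -> dict:
    """Return {name: (address, size)} for FUNC symbols in an ELF32 LE image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF file")
    # ELF32 header: e_shoff at offset 32, e_shentsize at 46, e_shnum at 48.
    (e_shoff,) = struct.unpack_from("<I", elf, 32)
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 46)
    # Each section header is ten 32-bit words.
    sections = [struct.unpack_from("<10I", elf, e_shoff + i * e_shentsize)
                for i in range(e_shnum)]
    functions = {}
    for header in sections:
        _, sh_type, _, _, offset, size, link, _, _, entsize = header
        if sh_type != SHT_SYMTAB or entsize == 0:
            continue
        str_off = sections[link][4]  # sh_link names the string table section
        for pos in range(offset, offset + size, entsize):
            st_name, st_value, st_size, st_info, _, _ = struct.unpack_from(
                "<IIIBBH", elf, pos)
            if st_info & 0xF != STT_FUNC:
                continue
            start = str_off + st_name
            name = elf[start:elf.index(b"\0", start)].decode("ascii")
            functions[name] = (st_value & ~1, st_size)  # clear the Thumb bit
    return functions


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        for name, (addr, size) in read_functions(f.read()).items():
            print(f"{name:12s} 0x{addr:08x} {size}")
```

Run against an FLM file, this should list the same function symbols that readelf shows, with bit 0 cleared so that each address is the real offset of the code inside the PrgCode section.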

There are two ways to obtain the file:
  • Store the file to be programmed in external flash and read it through the file system (offline download)
  • Emulate a USB flash drive, drag the file onto the drive, and forward it directly in packets (drag and drop download)

These two ways of obtaining the file correspond to two ways of downloading offline, and the DAPLink source code provides the drag and drop method. The downside is that official DAPLink drag and drop programming targets only one specific MCU model. To support another MCU, the debugger’s firmware must be updated manually, which raises the barrier for users and makes such a convenient function much less appealing.

A downloader that can only program one chip offline hardly qualifies as one. Therefore, I have adapted it to a large number of Cortex-M series chips, including STM32 from STMicroelectronics and GD32 from GigaDevice, and I am continuously adding support for other models.

USB drag and drop download supports HEX and BIN files. HEX files carry address information, so the programming location is chosen automatically from the addresses in the HEX file, while BIN files are downloaded to address 0x08000000 by default. The demonstration video shows a HEX file being copied to the USB flash drive to complete a firmware download.
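
To illustrate how a HEX file carries its own addresses, here is a minimal Intel HEX reader in Python. It is a sketch of the file format only, not the firmware's parser; the function name is invented for this example, and DEFAULT_BIN_ADDRESS simply repeats the BIN default mentioned above.

```python
DEFAULT_BIN_ADDRESS = 0x08000000  # where BIN files are placed by default


def parse_intel_hex(text: str) -> dict:
    """Return {absolute_address: byte_value} for the data records in text."""
    memory = {}
    base = 0  # upper address bits set by record types 0x02 and 0x04
    for line_no, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"line {line_no}: record must start with ':'")
        raw = bytes.fromhex(line[1:])
        if len(raw) < 5 or len(raw) != raw[0] + 5:
            raise ValueError(f"line {line_no}: bad record length")
        if sum(raw) & 0xFF != 0:
            raise ValueError(f"line {line_no}: bad checksum")
        count = raw[0]
        offset = int.from_bytes(raw[1:3], "big")
        record_type = raw[3]
        data = raw[4:4 + count]
        if record_type == 0x00:    # data record
            for i, value in enumerate(data):
                memory[base + ((offset + i) & 0xFFFF)] = value
        elif record_type == 0x01:  # end of file
            break
        elif record_type == 0x02:  # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif record_type == 0x04:  # extended linear address
            base = int.from_bytes(data, "big") << 16
        # 0x03 and 0x05 only carry start addresses and hold no flash data
    return memory


if __name__ == "__main__":
    sample = ":020000040800F2\n:0400000001020304F2\n:00000001FF\n"
    image = parse_intel_hex(sample)
    print(hex(min(image)), len(image))
```

In the sample, the type 0x04 record sets the upper 16 address bits to 0x0800, so the four data bytes land at 0x08000000 without the user specifying any address. A BIN file has no such records, which is why a default address is needed for it.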

At this point the homemade downloader already meets my daily needs, but do you think it ends here?

Built-in ymodem protocol

As a programmer who has been grinding away for nearly ten years, I know that not every device leaves its SWD or JTAG interface accessible once installed, and products that have left the factory always have some lingering bugs waiting to be fixed.

So can we embed a stable and reliable BootLoader in the product, so that when an upgrade is needed, customers simply drag the upgrade file onto the virtual USB drive and the firmware upgrade completes automatically over a reserved serial port, RS-485, or other communication interface? Just like this: the video shows a bin file being copied to the USB flash drive to complete the transfer of the upgrade file.

To achieve this, I built the ymodem file transfer protocol into the downloader. Ymodem checks every block and retransmits blocks that fail the check, so data integrity is preserved, which makes it well suited to firmware updates in embedded systems.

YModem is a file transfer protocol for serial communication. It improves on the earlier XModem protocol by adding batch transfer of multiple files and by sending metadata such as the file name, size, and modification date. It uses CRC (Cyclic Redundancy Check) for error detection and supports a 1 KB data block size (YModem-1K), which gives higher efficiency than XModem and speeds up the transfer of large files.
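
The block format described above can be illustrated with a small Python sketch that frames a file into YModem packets. It covers packet construction only (header block 0, 1 KB data blocks, CRC-16 with polynomial 0x1021); the handshake with 'C', ACK/NAK handling, EOT, and the closing empty block are left out. The function names are invented for this example, and the code is not taken from MicroLink or MicroBoot.

```python
SOH = 0x01   # starts a packet with a 128-byte payload
STX = 0x02   # starts a packet with a 1024-byte payload
PAD = 0x1A   # conventional filler for a short final data block


def crc16_xmodem(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0, as used by YModem."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc


def build_packet(seq: int, payload: bytes, size: int, pad: int) -> bytes:
    """Frame one packet: start byte, seq, inverted seq, padded payload, CRC."""
    if size not in (128, 1024) or len(payload) > size:
        raise ValueError("payload does not fit the chosen block size")
    body = payload.ljust(size, bytes([pad]))
    seq &= 0xFF
    start = SOH if size == 128 else STX
    return bytes([start, seq, 0xFF - seq]) + body + crc16_xmodem(body).to_bytes(2, "big")


def header_packet(filename: str, file_size: int) -> bytes:
    """Block 0: file name, NUL, decimal file size, then zero padding."""
    info = filename.encode("ascii") + b"\0" + str(file_size).encode("ascii")
    return build_packet(0, info, 128, 0x00)


def data_packets(data: bytes):
    """Yield 1 KB data packets numbered from 1."""
    for i in range(0, len(data), 1024):
        yield build_packet(i // 1024 + 1, data[i:i + 1024], 1024, PAD)


if __name__ == "__main__":
    firmware = bytes(range(256)) * 6  # 1536 bytes of dummy data
    print(len(header_packet("app.bin", len(firmware))))
    print([len(p) for p in data_packets(firmware)])
```

The receiver recomputes the CRC over the payload of each packet and requests a retransmission when it does not match, which is what keeps a firmware image intact over a noisy serial or RS-485 link.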

The data flow for offline downloading files using the ymodem protocol is as follows:

[Image: data flow of an offline file download over ymodem]

To send files with the built-in ymodem protocol, the downloader alone is not enough; the target device must also be able to receive files via ymodem. If you are going to do something, do it thoroughly, so I have created a stable and reliable open-source BootLoader framework:

MicroLink Introduction:

https://microboot.readthedocs.io/zh-cn/latest/tools/microlink/microlink/

MicroBoot Introduction:

https://microboot.readthedocs.io/zh-cn/latest/

MicroBoot Open-source Code:

https://github.com/Aladdin-Wang/MicroBoot

At this point, the functionality of this downloader has been basically completed, but it is far from over…

Feature highlights

[√] Supports SWD/JTAG interfaces, with download and debug speed surpassing J-Link V12 (10 MHz clock)

[√] Supports downloading and debugging ARM/RISC-V chips using OpenOCD IDE

[√] Supports USB to serial with a maximum baud rate of 10 Mbaud with no packet loss

[√] Supports USB flash drive drag and drop download for a large number of Cortex-M series chips, with numerous built-in download algorithms and automatic chip recognition

[√] Built-in ymodem protocol stack; dragging a file onto the USB flash drive automatically triggers a ymodem transfer to the target device via the serial port (requires a bootloader on the target that supports ymodem)

[√] Supports system firmware upgrades, allowing more features to be added in the future

[√] Uses WinUSB for driver-free plug-and-play on Windows 10

[√] Supports 3V3/5V high-current power output

[√] Built-in reverse current and overcurrent protection, preventing external current from flowing back into the USB port and damaging it

[ ] Supports reading target chip firmware via USB flash drive

[ ] Supports reading any files from the target chip via USB flash drive

[ ] Supports offline downloading for Cortex-M series chips, automatically recognizing target chips and triggering downloads

[ ] Supports drag and drop downloading for RISC-V series chips, with built-in numerous download algorithms for automatic chip recognition

[ ] Supports offline downloading for RISC-V series chips, automatically recognizing target chips and triggering downloads

Combining the above features provides developers with a one-stop solution for downloading, debugging, mass production, after-sales maintenance, and firmware upgrades.

Product purchase link: https://item.taobao.com/item.htm?ft=t&id=826800975011
