Basic Steps for CANFD Debugging

CAN (Controller Area Network) is a serial communication network that supports distributed and real-time control. It is a widely adopted bus protocol in automotive applications, where it was designed for communication between microcontrollers. As our company's projects gradually shift towards the automotive sector, CAN is being used in more and more scenarios. This article walks through CAN debugging techniques that help troubleshoot problems on the CAN bus.

1: CAN Drivers

drivers/net/can/rockchip/rockchip_can.c
drivers/net/can/rockchip/rockchip_canfd.c

The RK3588 kernel provides two CAN drivers, and the device tree determines which one is used. CANFD communication is available by default, and the device tree is configured as follows:

can0: can@fea50000 {
    compatible = "rockchip,can-2.0";
    reg = <0x0 0xfea50000 0x0 0x1000>;
    interrupts = <0 341 4>;
    clocks = <&cru 112>, <&cru 111>;
    clock-names = "baudclk", "apb_pclk";
    resets = <&cru 185>, <&cru 184>;
    reset-names = "can", "can-apb";
    pinctrl-names = "default";
    pinctrl-0 = <&can0m0_pins>;
    tx-fifo-depth = <1>;
    rx-fifo-depth = <6>;
    status = "disabled";
};

&can0 {
    assigned-clocks = <&cru 112>;
    assigned-clock-rates = <200000000>;
    status = "okay";
};

The CAN controller clock defaults to 200 MHz and can be changed through assigned-clock-rates. If the required bitrate does not divide this clock cleanly, set the clock source to another rate to improve the sampling point of the CAN bus.
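As a rough illustration of why the clock rate matters: the bitrate equals the controller clock divided by the prescaler (brp) times the number of time quanta per bit, so a bitrate can only be hit exactly when that division leaves no remainder. The sketch below is a hypothetical user-space helper (the name tq_per_bit is an assumption and not part of the driver) that checks this for a candidate clock rate.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of time quanta per bit for a given controller
 * clock, bitrate and prescaler (brp). Returns 0 when the clock does not
 * divide evenly, i.e. the bitrate cannot be reached exactly.
 */
static uint32_t tq_per_bit(uint32_t clock_hz, uint32_t bitrate, uint32_t brp)
{
	uint64_t divisor;

	assert(bitrate > 0 && brp > 0);
	divisor = (uint64_t)bitrate * brp;
	if (clock_hz % divisor != 0)
		return 0;
	return (uint32_t)(clock_hz / divisor);
}
```

For example, tq_per_bit(200000000, 250000, 4) gives 200 time quanta per bit, while tq_per_bit(99000000, 1000000, 2) returns 0 because 99 MHz cannot be divided down to exactly 1 Mbit/s with that prescaler. Whether the resulting count fits the controller's tseg1/tseg2 limits still has to be checked separately.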

2: Commands

2.1 Check CAN Status

ip -details -statistics link show can0
2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
    link/can promiscuity 0 minmtu 0 maxmtu 0
    can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 1
          bitrate 250000 sample-point 0.868
          tq 40 prop-seg 42 phase-seg1 43 phase-seg2 13 sjw 1
          rockchip_canfd: tseg1 1..128 tseg2 1..128 sjw 1..128 brp 1..256 brp-inc 2
          clock 99000000
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          12020      0          0          12132      3438       12450     numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped overrun mcast
    320320     40040    0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        0       12020   0       0

Here are some key pieces of information to focus on:

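a. state - CAN bus state (ERROR-ACTIVE is the normal operating state)
b. bitrate - Configured bitrate in bit/s
c. sample-point - Sampling point, as a fraction of the bit time
d. TX, RX - Packet counters (bytes, packets, errors, dropped)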

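The state field follows the CAN fault-confinement rules, which are driven by the two error counters shown in berr-counter. The sketch below maps counter values to a state using the standard CAN thresholds (warning at 96, error-passive above 127, bus-off when the transmit counter exceeds 255). The enum and function names are hypothetical; the driver itself reports state changes from hardware interrupts rather than from code like this.

```c
#include <assert.h>

enum bus_state {
	BUS_ERROR_ACTIVE,
	BUS_ERROR_WARNING,
	BUS_ERROR_PASSIVE,
	BUS_OFF
};

/* Map the transmit/receive error counters to a fault-confinement state. */
static enum bus_state classify_state(unsigned int txerr, unsigned int rxerr)
{
	unsigned int worst = txerr > rxerr ? txerr : rxerr;

	if (txerr > 255)
		return BUS_OFF;	/* only the transmit counter leads to bus-off */
	if (worst > 127)
		return BUS_ERROR_PASSIVE;
	if (worst >= 96)
		return BUS_ERROR_WARNING;
	return BUS_ERROR_ACTIVE;
}

/* State name in the style printed by "ip -details link show". */
static const char *state_name(enum bus_state s)
{
	static const char *const names[] = {
		"ERROR-ACTIVE", "ERROR-WARNING", "ERROR-PASSIVE", "BUS-OFF"
	};

	assert((unsigned int)s <= BUS_OFF);
	return names[s];
}
```

With the counters from the output above (tx 0, rx 0), state_name(classify_state(0, 0)) yields ERROR-ACTIVE.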
2.2 Set CAN Sampling Point

ip link set can0 type can tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1

Setting the parameters above changes the CAN sampling point (the interface must be down while they are changed). Alternatively, set only the bitrate and let the bit timing and sampling point be calculated automatically:

ip link set can0 type can bitrate 250000
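To check what a manual tq setting actually produces, the arithmetic can be sketched as follows. One bit consists of the sync segment (always 1 tq) plus prop-seg, phase-seg1 and phase-seg2, and the sampling point lies at the end of phase-seg1. The struct and function names are hypothetical and only illustrate the calculation.

```c
#include <assert.h>
#include <stdint.h>

struct bit_timing_result {
	uint32_t tq_per_bit;		/* sync-seg + prop-seg + phase-seg1 + phase-seg2 */
	uint32_t bitrate;		/* bits per second */
	uint32_t sample_point_permille;	/* 875 means 0.875 */
};

static struct bit_timing_result
calc_timing(uint32_t tq_ns, uint32_t prop_seg, uint32_t phase_seg1,
	    uint32_t phase_seg2)
{
	struct bit_timing_result r;

	assert(tq_ns > 0);
	/* The sync segment is always exactly one time quantum. */
	r.tq_per_bit = 1 + prop_seg + phase_seg1 + phase_seg2;
	/* Bit time in ns is tq_ns * tq_per_bit; bitrate is its inverse. */
	r.bitrate = (uint32_t)(1000000000ULL / ((uint64_t)tq_ns * r.tq_per_bit));
	/* The sample is taken after sync-seg, prop-seg and phase-seg1. */
	r.sample_point_permille =
		1000 * (1 + prop_seg + phase_seg1) / r.tq_per_bit;
	return r;
}
```

For the tq command in this section, calc_timing(125, 6, 7, 2) gives 16 time quanta per bit, 500000 bit/s and a sampling point of 0.875. Feeding in the segments from the status output in 2.1 (prop-seg 42, phase-seg1 43, phase-seg2 13) gives 99 time quanta and 868, matching the reported sample-point 0.868.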

2.3 Start CAN Test

ip link set can0 down
ip link set can0 up type can bitrate 250000
ip link set can0 up
while true
do
    cansend can0 00000000#0000000000000000
done

The commands above bring up the can0 device at 250 kbit/s and then repeatedly send a frame with ID 0 and eight zero data bytes.

2.4 Start CAN Loopback Test

ip link set can0 down
ip link set can0 up type can bitrate 250000 loopback on
ip link set can0 up
while true
do
    cansend can0 00000000#0000000000000000
done

These commands run the same send test with loopback mode enabled, which tests the CAN loopback path directly.

2.5 Receive and Send Information

# Receive CAN information
candump can0 > can_recv.log &

# Send CAN information
cansend can0 123#1122334455
cansend can0 5A1#11.2233.44556677.88
cansend can0 1F334455#1122334455667788
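The cansend argument has the form ID#DATA: a 3-digit hexadecimal ID selects a standard frame, an 8-digit ID selects an extended frame, and dots between data bytes are optional separators. The sketch below is a simplified, hypothetical parser for this format (it is not the can-utils implementation and deliberately ignores RTR and CAN FD syntax) that makes the three examples above easier to read.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

struct parsed_frame {
	uint32_t id;
	int extended;		/* 1 when the ID was written with 8 hex digits */
	uint8_t len;
	uint8_t data[8];
};

static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)toupper((unsigned char)c);
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Returns 0 on success, -1 on a malformed string. */
static int parse_frame(const char *spec, struct parsed_frame *out)
{
	const char *hash, *p;
	size_t id_len;

	assert(spec != NULL && out != NULL);
	hash = strchr(spec, '#');
	if (!hash)
		return -1;
	id_len = (size_t)(hash - spec);
	if (id_len != 3 && id_len != 8)
		return -1;

	memset(out, 0, sizeof(*out));
	out->extended = (id_len == 8);
	for (p = spec; p < hash; p++) {
		int v = hex_val(*p);

		if (v < 0)
			return -1;
		out->id = (out->id << 4) | (uint32_t)v;
	}

	for (p = hash + 1; *p; ) {
		int hi, lo;

		if (*p == '.') {	/* optional separator between bytes */
			p++;
			continue;
		}
		hi = hex_val(p[0]);
		lo = hi < 0 ? -1 : hex_val(p[1]);
		if (hi < 0 || lo < 0 || out->len >= 8)
			return -1;
		out->data[out->len++] = (uint8_t)(hi << 4 | lo);
		p += 2;
	}
	return 0;
}
```

Parsing 123#1122334455 yields a standard frame with ID 0x123 and 5 data bytes, while 1F334455#1122334455667788 yields an extended frame with ID 0x1F334455 and 8 data bytes.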

2.6 Debug CAN Data

CAN frames are passed to and from the network stack as socket buffers (skb), so dumping the skb contents shows exactly which CAN data is being sent and received. The following function can be used:

/*
 * Dump the raw bytes of a CAN skb to the kernel log.
 * Named can_skb_dump() because kernels since v5.3 already provide skb_dump().
 */
static void can_skb_dump(struct sk_buff *skb, struct net_device *dev)
{
    unsigned int i;

    if (!skb || !dev) {
        pr_err("%s: bad param\n", __func__);
        return;
    }

    /* Start the log line, then append one byte at a time. */
    pr_info("Kylin: CAN packet (len=%u): ", skb->len);
    for (i = 0; i < skb->len; i++)
        pr_cont("%02x", skb->data[i]);
    pr_cont("\n");
}

Then simply call can_skb_dump from the driver's receive and transmit paths.
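The dumped bytes are easier to read once they are split into fields. For a classic CAN frame the skb payload is a 16-byte struct can_frame (matching the mtu 16 shown earlier): a 32-bit can_id that also carries the EFF/RTR/ERR flag bits, a length byte, three padding bytes and eight data bytes. The decoder below is a user-space sketch with hypothetical names; it assumes a little-endian host, since can_id is stored in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same values as CAN_EFF_FLAG, CAN_RTR_FLAG, ... in linux/can.h */
#define EFF_FLAG 0x80000000U
#define RTR_FLAG 0x40000000U
#define ERR_FLAG 0x20000000U
#define SFF_MASK 0x000007FFU
#define EFF_MASK 0x1FFFFFFFU

struct decoded_frame {
	uint32_t id;
	int extended, rtr, error;
	uint8_t len;
	const uint8_t *data;	/* points into the raw buffer */
};

static struct decoded_frame decode_can_frame(const uint8_t raw[16])
{
	struct decoded_frame f;
	uint32_t can_id;

	assert(raw != NULL);
	/* Assumption: little-endian host, so byte 0 is the least significant. */
	can_id = (uint32_t)raw[0] | (uint32_t)raw[1] << 8 |
		 (uint32_t)raw[2] << 16 | (uint32_t)raw[3] << 24;

	f.extended = (can_id & EFF_FLAG) != 0;
	f.rtr = (can_id & RTR_FLAG) != 0;
	f.error = (can_id & ERR_FLAG) != 0;
	f.id = can_id & (f.extended ? EFF_MASK : SFF_MASK);
	f.len = raw[4];		/* bytes 5..7 are padding/reserved */
	f.data = &raw[8];
	return f;
}
```

Under these assumptions, a dump of 23010000050000001122334455000000 decodes to standard ID 0x123 with 5 data bytes 11 22 33 44 55.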

2.7 Determine CAN Error Frames

The driver defines the following interrupt status bits for the CAN controller:

#define RX_FINISH_INT BIT(0)
#define TX_FINISH_INT BIT(1)
#define ERR_WARN_INT BIT(2)
#define RX_BUF_OV_INT BIT(3)
#define PASSIVE_ERR_INT BIT(4)
#define TX_LOSTARB_INT BIT(5)
#define BUS_ERR_INT BIT(6)
#define RX_FIFO_FULL_INT BIT(7)
#define RX_FIFO_OV_INT BIT(8)
#define BUS_OFF_INT BIT(9)
#define BUS_OFF_RECOVERY_INT BIT(10)
#define TSC_OV_INT BIT(11)
#define TXE_FIFO_OV_INT BIT(12)
#define TXE_FIFO_FULL_INT BIT(13)
#define WAKEUP_INT BIT(14)

When a BUS_ERR interrupt is raised, the error type is one of the following:

#define BIT_ERR 0
#define STUFF_ERR 1
#define FORM_ERR 2
#define ACK_ERR 3
#define CRC_ERR 4

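Each error type narrows down the cause. For example, ACK_ERR means that no other node acknowledged the frame (often no receiver on the bus or a bitrate mismatch), while BIT_ERR, STUFF_ERR, FORM_ERR and CRC_ERR usually point to a bitrate or sampling-point mismatch or to signal-quality problems.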
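When logging the interrupt status during debugging, printing names instead of a raw hex value saves a lot of lookups. The sketch below turns a status word into a readable string using the bit positions listed above, and maps the bus error type values to their names. It is a user-space illustration with hypothetical function names; where the status word and the error type are read from in the driver is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Index in this table equals the bit number of the interrupt. */
static const char *const int_names[] = {
	"RX_FINISH_INT", "TX_FINISH_INT", "ERR_WARN_INT", "RX_BUF_OV_INT",
	"PASSIVE_ERR_INT", "TX_LOSTARB_INT", "BUS_ERR_INT", "RX_FIFO_FULL_INT",
	"RX_FIFO_OV_INT", "BUS_OFF_INT", "BUS_OFF_RECOVERY_INT", "TSC_OV_INT",
	"TXE_FIFO_OV_INT", "TXE_FIFO_FULL_INT", "WAKEUP_INT",
};

static const char *const bus_err_names[] = {
	"BIT_ERR", "STUFF_ERR", "FORM_ERR", "ACK_ERR", "CRC_ERR",
};

/* Write the names of all set bits into buf, separated by '|'.
 * Returns the number of names written. */
static int describe_ints(unsigned int isr, char *buf, size_t size)
{
	size_t i, used = 0;
	int count = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(int_names) / sizeof(int_names[0]); i++) {
		int n;

		if (!(isr & (1U << i)))
			continue;
		n = snprintf(buf + used, size - used, "%s%s",
			     count ? "|" : "", int_names[i]);
		if (n < 0 || (size_t)n >= size - used)
			break;	/* buffer full: stop, last name may be cut off */
		used += (size_t)n;
		count++;
	}
	return count;
}

static const char *bus_err_name(unsigned int type)
{
	if (type < sizeof(bus_err_names) / sizeof(bus_err_names[0]))
		return bus_err_names[type];
	return "UNKNOWN";
}
```

A status word with bits 2 and 6 set (0x44) is rendered as ERR_WARN_INT|BUS_ERR_INT, and bus_err_name(3) returns ACK_ERR.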

2.8 Logs

To see netdev_dbg logs, take the following steps:

a. Add #define DEBUG at the top of the driver source file, before the includes
b. Set the console log level to 8

Other netdev_* logs can be viewed by simply setting the console log level to 8, for example:

dmesg -n8

2.9 Interrupt Context

For CAN issues, first check the hardware interrupts:

cat /proc/interrupts | grep can

Check that the CAN interrupt counts increase as expected while frames are on the bus.

Then check the soft interrupts:

cat /proc/softirqs
                   CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
         HI:       2834          1          1          3          0          0          0          0
      TIMER:       7791      33602      36340      57451      22095       6063      31674       5003
     NET_TX:       2596          9          7          2          1       3870       2639       3561
     NET_RX:      33128        100         36        307        141       3876       2662      21700
      BLOCK:       7154      17766      14324      14861      13713       2332       2492       1812
   IRQ_POLL:          0          0          0          0          0          0          0          0
    TASKLET:        160          2          1          2          1          1         75          3
      SCHED:      85621      59461      52573     172213      82530      16771      32225      13479
    HRTIMER:          0          0          0          0          0          0          0          0
        RCU:      77962      44178      37468      71517      57343      36644      33600      27705

Check whether the NET_RX counts look normal. If they climb abnormally fast, the CAN driver may be calling netif_rx too frequently.
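Whether NET_RX is "normal" is easier to judge from a rate than from absolute counts. One approach is to sum the per-CPU counters of the NET_RX line, repeat after a fixed interval, and compare the difference with the number of CAN frames received in that time. The helper below is a hypothetical sketch that sums one line of /proc/softirqs; reading the file and timing the interval are left to the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on one /proc/softirqs line,
 * e.g. "NET_RX: 33128 100 36". Returns 0 if the line has no colon.
 */
static unsigned long long sum_softirq_line(const char *line)
{
	const char *p;
	char *end;
	unsigned long long total = 0;

	assert(line != NULL);
	p = strchr(line, ':');
	if (!p)
		return 0;
	p++;
	for (;;) {
		/* strtoull skips leading whitespace before each number. */
		unsigned long long v = strtoull(p, &end, 10);

		if (end == p)
			break;	/* no further number on this line */
		total += v;
		p = end;
	}
	return total;
}
```

For the NET_RX line shown above, the sum over all eight CPUs is 61950.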

2.10 Improve Real-Time Response of CAN Interrupts

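1) Remove every cpu-idle-states = <&CPU_SLEEP>; property from the dts so that the CPUs do not enter the sleep idle state
2) Set the CPU frequency governor to performance mode: echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
3) Bind the CAN interrupt to a specific CPU in the driver, here the CPU numbered num_online_cpus() - 1: irq_set_affinity_hint(ndev->irq, get_cpu_mask(num_online_cpus() - 1));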

