I noticed in a kernel 4.0.7 that I loose CAN packets when an incoming rsync transfer changes my eMMC or SD-Card image. I used CONFIG_FRACE to find why this is the case, and came to this trace:
# tracer: preemptoff
# preemptoff latency trace v1.1.5 on 4.0.7
# latency: 1046 us, #756/756, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
# | task: mmcqd/0-76 (uid:0 nice:0 policy:0 rt_prio:0)
# => started at: sdhci_do_set_ios
# => ended at: sdhci_do_set_ios
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
mmcqd/0-76 2d..1 1us : _raw_spin_lock_irq <-sdhci_do_set_ios
mmcqd/0-76 2d..2 2us : esdhc_writeb_le <-sdhci_do_set_ios
mmcqd/0-76 2d..2 4us : esdhc_pltfm_set_bus_width <-sdhci_do_set_ios
mmcqd/0-76 2d..2 5us : l2c210_sync <-esdhc_pltfm_set_bus_width
mmcqd/0-76 2d..2 7us : esdhc_writeb_le <-sdhci_do_set_ios
mmcqd/0-76 2d..2 9us : l2c210_sync <-esdhc_writeb_le
mmcqd/0-76 2d..2 10us : esdhc_readw_le <-sdhci_do_set_ios
mmcqd/0-76 2d..2 12us : esdhc_writew_le <-sdhci_do_set_ios
mmcqd/0-76 2d..2 14us : l2c210_sync <-esdhc_writew_le
mmcqd/0-76 2d..2 16us : l2c210_sync <-esdhc_writew_le
mmcqd/0-76 2d..2 18us : esdhc_readw_le <-sdhci_do_set_ios
mmcqd/0-76 2d..2 19us : esdhc_writew_le <-sdhci_do_set_ios
mmcqd/0-76 2d..2 21us : l2c210_sync <-esdhc_writew_le
mmcqd/0-76 2d..2 22us : esdhc_set_uhs_signaling <-sdhci_do_set_ios
mmcqd/0-76 2d..2 24us : pinctrl_select_state <-esdhc_set_uhs_signaling
mmcqd/0-76 2d..2 26us : esdhc_readl_le <-esdhc_pltfm_set_clock
mmcqd/0-76 2d..2 28us : esdhc_writel_le <-esdhc_pltfm_set_clock
mmcqd/0-76 2d..2 29us : l2c210_sync <-esdhc_writel_le
mmcqd/0-76 2d..2 31us : esdhc_readl_le <-esdhc_pltfm_set_clock
mmcqd/0-76 2d..2 33us : esdhc_writel_le <-esdhc_pltfm_set_clock
mmcqd/0-76 2d..2 34us : l2c210_sync <-esdhc_writel_le
mmcqd/0-76 2d..2 36us : l2c210_sync <-esdhc_pltfm_set_clock
mmcqd/0-76 2d..2 37us : __timer_const_udelay <-esdhc_pltfm_set_clock
mmcqd/0-76 2d..2 39us : __timer_delay <-__timer_const_udelay
mmcqd/0-76 2d..2 40us : read_current_timer <-__timer_delay
mmcqd/0-76 2d..2 41us : imx_read_current_timer <-read_current_timer
mmcqd/0-76 2d..2 43us : read_current_timer <-__timer_delay
mmcqd/0-76 2d..2 44us : imx_read_current_timer <-read_current_timer
mmcqd/0-76 2d..2 45us : read_current_timer <-__timer_delay
So it seems that esdhc_pltfm_set_clock() somehow waits. And really there is an mdelay() there. What's worse: I put a printk() into this function, just before the mdelay(). And it get's called hundreds of times when I boot, or when I update via rsync. And for each call we have an mdelay() that kills preemption.
This is bad because in the FlexCAN driver the ISR calls napi_schedule(), which in turn clears the CAN RxFIFO. And if this scheduling is taking too long, CAN packets are dropped.
The mdelay() is also bad because it makes eMMC/SD-Card accesses WAY slower than they could be.
Thanks for the detailed analysis.
I tried using usleep_range(1000, 1500) instead of mdelay and then I got "inconsistent lock state" warnings.
Just curious: what happens if the mdelay() is removed. I haven't gone through the git log to understand why it was placed there.
Also, it would be nice to report this into linux-mmc and linux-arm-kernel lists.
The mdelay(1) is taken when sdhci_set_ios() much higher took a lock (spin_lock_irq). Maybe this prevents the ranged delay? Unfortunately, I'm a mediocre kernel hacker: I have patches in the kernel, but I'm usually at loss when it comes to locks.
> Just curious: what happens if the mdelay() is removed.
Wild guess: when I removed CONFIG_MMC_CLKGATE this mdelay() was no longer in the hot-path. Seems that currently, when clock gating is on, for every sector (or MMC operation) this happens:
* clock is turned on
* clock is turned off
And in the on-case there is an mdelay(1). I hate when people put an mdelay() into the code without an explanation, e.g. why it's where, any why it's this time. My guess however is, that after turning on the clock, the programmer wanted to make sure that the clock became stable, e.g. that a PLL locked. If that is 1ms in the worst case I cannot say.
Generally such things should be written in the style of a state-machine. That would allow the code to relinquish the code (to "sleep") back to other processes. And this in turn would not kill latency and cause hickups in the form of lost CAN frames at other places.
Alternatively/Additionally, it would have been swell if the FlexCAN IP would have more than 6 packets in the RxFIFO, e.g. unused mailbox slots. But we cannot change the IP ... at 500 kB/s a latency of more than 500 uS can already cause lost CAN packets.
Nothing worse than loose packets. As long as they weren't getting lost...
> at 500 kB/s a latency of more than 500 uS can already cause lost CAN packets.
At 1MHz even more-so.
The problem is whoever thought it would be a good idea to empty a 6 MB FIFO at NAPI level. While the FEC driver didn't! I rewrote my 3.4 driver to empty during the interrupt, and no more problems.
The Freescale driver for i.MX53 (on 2.6.35) uses 32 receive MBs and 32 transmit MBs. It handles them all in interrupts, so no NAPI delays at all.
Unfortunately it sends packets in "random order" and receives them in "random order" too. The Reference Manual says to fix the receive problem by sorting the packets in timestamp order, but the code doesn't do that. However the FIFO works fine under interrupts, until it gets out of sync because the driver handles interrupts the wrong way. I'm about to document that.
There's been some recent discussion on the linux-can list about FlexCAN drivers allowing reading using interrupts and using "message pooling", so a newer driver may help you.
The other take-home message is to not use an MMC card or eMMC chip while the supplied drivers do things like "mdelay()". That should be a major item in memory selection and in CPU selection.
Why, doesn't the SDHCI driver from your BSP not contain the mdelay's ???
Generally, I don't like vendor-BSPs. Some of the code I see from vendors is quite hacky. I expect any vendor to submit their patches upstream. Yes, code-review can be harsh. Or time-consuming. But the end-result is usually much better. I yet have to see any vendor where the "private driver" is better than what is upstream.
Currently I have my own patches to the relatively new kernels in form of a "quilt" managed stack. AFAIK Freescale's BSPs use outdated kernels, so applying them would need a considerable effort in work (both software updates, but also more testing).
If Freescale would guarantee me that their BSP is free of preemption issues, or even CONFIG_PREEMPT_RT able *), then I'd do that.
*) I noticed that CONFIG_PREEMPT_RT fixes all my lost CAN frame issues, but it is quite unstable on Linux 3.18.17 (-rt17) and also on 4.0.4 (-rt4) when run on the i.MX6. I even had eMMC image corruptions.