mdelay() in hot path in esdhc_pltfm_set_clock looses CAN (!) frames

holgerschurig · ‎06-30-2015

Hi people,

I noticed in a kernel 4.0.7 that I loose CAN packets when an incoming rsync transfer changes my eMMC or SD-Card image. I used CONFIG_FRACE to find why this is the case, and came to this trace:

# tracer: preemptoff

#

# preemptoff latency trace v1.1.5 on 4.0.7

# --------------------------------------------------------------------

# latency: 1046 us, #756/756, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)

# -----------------

# | task: mmcqd/0-76 (uid:0 nice:0 policy:0 rt_prio:0)

# -----------------

# => started at: sdhci_do_set_ios

# => ended at: sdhci_do_set_ios

#

# _------=> CPU#

# / _-----=> irqs-off

# | / _----=> need-resched

# || / _---=> hardirq/softirq

# ||| / _--=> preempt-depth

# |||| / delay

# cmd pid ||||| time | caller

# \ / ||||| \ | /

mmcqd/0-76 2d..1 1us : _raw_spin_lock_irq <-sdhci_do_set_ios

mmcqd/0-76 2d..2 2us : esdhc_writeb_le <-sdhci_do_set_ios

mmcqd/0-76 2d..2 4us : esdhc_pltfm_set_bus_width <-sdhci_do_set_ios

mmcqd/0-76 2d..2 5us : l2c210_sync <-esdhc_pltfm_set_bus_width

mmcqd/0-76 2d..2 7us : esdhc_writeb_le <-sdhci_do_set_ios

mmcqd/0-76 2d..2 9us : l2c210_sync <-esdhc_writeb_le

mmcqd/0-76 2d..2 10us : esdhc_readw_le <-sdhci_do_set_ios

mmcqd/0-76 2d..2 12us : esdhc_writew_le <-sdhci_do_set_ios

mmcqd/0-76 2d..2 14us : l2c210_sync <-esdhc_writew_le

mmcqd/0-76 2d..2 16us : l2c210_sync <-esdhc_writew_le

mmcqd/0-76 2d..2 18us : esdhc_readw_le <-sdhci_do_set_ios

mmcqd/0-76 2d..2 19us : esdhc_writew_le <-sdhci_do_set_ios

mmcqd/0-76 2d..2 21us : l2c210_sync <-esdhc_writew_le

mmcqd/0-76 2d..2 22us : esdhc_set_uhs_signaling <-sdhci_do_set_ios

mmcqd/0-76 2d..2 24us : pinctrl_select_state <-esdhc_set_uhs_signaling

mmcqd/0-76 2d..2 26us : esdhc_readl_le <-esdhc_pltfm_set_clock

mmcqd/0-76 2d..2 28us : esdhc_writel_le <-esdhc_pltfm_set_clock

mmcqd/0-76 2d..2 29us : l2c210_sync <-esdhc_writel_le

mmcqd/0-76 2d..2 31us : esdhc_readl_le <-esdhc_pltfm_set_clock

mmcqd/0-76 2d..2 33us : esdhc_writel_le <-esdhc_pltfm_set_clock

mmcqd/0-76 2d..2 34us : l2c210_sync <-esdhc_writel_le

mmcqd/0-76 2d..2 36us : l2c210_sync <-esdhc_pltfm_set_clock

mmcqd/0-76 2d..2 37us : __timer_const_udelay <-esdhc_pltfm_set_clock

mmcqd/0-76 2d..2 39us : __timer_delay <-__timer_const_udelay

mmcqd/0-76 2d..2 40us : read_current_timer <-__timer_delay

mmcqd/0-76 2d..2 41us : imx_read_current_timer <-read_current_timer

mmcqd/0-76 2d..2 43us : read_current_timer <-__timer_delay

mmcqd/0-76 2d..2 44us : imx_read_current_timer <-read_current_timer

mmcqd/0-76 2d..2 45us : read_current_timer <-__timer_delay

....

So it seems that esdhc_pltfm_set_clock() somehow waits. And really there is an mdelay() there. What's worse: I put a printk() into this function, just before the mdelay(). And it get's called hundreds of times when I boot, or when I update via rsync. And for each call we have an mdelay() that kills preemption.

This is bad because in the FlexCAN driver the ISR calls napi_schedule(), which in turn clears the CAN RxFIFO. And if this scheduling is taking too long, CAN packets are dropped.

The mdelay() is also bad because it makes eMMC/SD-Card accesses WAY slower than they could be.

fabio_estevam · ‎07-07-2015

Hi Holger,

Thanks for the detailed analysis.

I tried using usleep_range(1000, 1500) instead of mdelay and then I got "inconsistent lock state" warnings.

Just curious: what happens if the mdelay() is removed. I haven't gone through the git log to understand why it was placed there.

Also, it would be nice to report this into linux-mmc and linux-arm-kernel lists.

Thanks,

Fabio Estevam

holgerschurig · ‎07-08-2015

Hi Fabia,

The mdelay(1) is taken when sdhci_set_ios() much higher took a lock (spin_lock_irq). Maybe this prevents the ranged delay? Unfortunately, I'm a mediocre kernel hacker: I have patches in the kernel, but I'm usually at loss when it comes to locks.

> Just curious: what happens if the mdelay() is removed.

Wild guess: when I removed CONFIG_MMC_CLKGATE this mdelay() was no longer in the hot-path. Seems that currently, when clock gating is on, for every sector (or MMC operation) this happens:

* clock is turned on

* operation

* clock is turned off

And in the on-case there is an mdelay(1). I hate when people put an mdelay() into the code without an explanation, e.g. why it's where, any why it's this time. My guess however is, that after turning on the clock, the programmer wanted to make sure that the clock became stable, e.g. that a PLL locked. If that is 1ms in the worst case I cannot say.

Generally such things should be written in the style of a state-machine. That would allow the code to relinquish the code (to "sleep") back to other processes. And this in turn would not kill latency and cause hickups in the form of lost CAN frames at other places.

Alternatively/Additionally, it would have been swell if the FlexCAN IP would have more than 6 packets in the RxFIFO, e.g. unused mailbox slots. But we cannot change the IP ... at 500 kB/s a latency of more than 500 uS can already cause lost CAN packets.

TomE · ‎11-30-2015

Nothing worse than loose packets. As long as they weren't getting lost...

> at 500 kB/s a latency of more than 500 uS can already cause lost CAN packets.

At 1MHz even more-so.

The problem is whoever thought it would be a good idea to empty a 6 MB FIFO at NAPI level. While the FEC driver didn't! I rewrote my 3.4 driver to empty during the interrupt, and no more problems.

The Freescale driver for i.MX53 (on 2.6.35) uses 32 receive MBs and 32 transmit MBs. It handles them all in interrupts, so no NAPI delays at all.

Unfortunately it sends packets in "random order" and receives them in "random order" too. The Reference Manual says to fix the receive problem by sorting the packets in timestamp order, but the code doesn't do that. However the FIFO works fine under interrupts, until it gets out of sync because the driver handles interrupts the wrong way. I'm about to document that.

There's been some recent discussion on the linux-can list about FlexCAN drivers allowing reading using interrupts and using "message pooling", so a newer driver may help you.

The other take-home message is to not use an MMC card or eMMC chip while the supplied drivers do things like "mdelay()". That should be a major item in memory selection and in CPU selection.

Tom

fabio_estevam · ‎07-07-2015

Just saw your post in linux-mmc :-)

alejandrolozan1 · ‎07-07-2015

Hi,

Are you getting the same behavior using our BSP? Could you try with one of our official versions of Linux?

Best Regards,

Alejandro

holgerschurig · ‎07-08-2015

Why, doesn't the SDHCI driver from your BSP not contain the mdelay's ???

Generally, I don't like vendor-BSPs. Some of the code I see from vendors is quite hacky. I expect any vendor to submit their patches upstream. Yes, code-review can be harsh. Or time-consuming. But the end-result is usually much better. I yet have to see any vendor where the "private driver" is better than what is upstream.

Currently I have my own patches to the relatively new kernels in form of a "quilt" managed stack. AFAIK Freescale's BSPs use outdated kernels, so applying them would need a considerable effort in work (both software updates, but also more testing).

If Freescale would guarantee me that their BSP is free of preemption issues, or even CONFIG_PREEMPT_RT able *), then I'd do that.

*) I noticed that CONFIG_PREEMPT_RT fixes all my lost CAN frame issues, but it is quite unstable on Linux 3.18.17 (-rt17) and also on 4.0.4 (-rt4) when run on the i.MX6. I even had eMMC image corruptions.

mdelay() in hot path in esdhc_pltfm_set_clock looses CAN (!) frames

mdelay() in hot path in esdhc_pltfm_set_clock looses CAN (!) frames

i.MX6Quad