eMMC message: "mmc2: running CQE recovery"

richard_hu · ‎05-15-2020

Hello, NXP support team:

The platform is "i.mx8mm".

When we write data to the eMMC, the kernel prints the message "mmc2: running CQE recovery".

For example, when using "dd" to write random data to the eMMC:

dd if=/dev/urandom of=/sd_test bs=1M count=500
[  199.947978] mmc2: running CQE recovery
[  199.953962] mmc2: running CQE recovery
[  199.959086] mmc2: running CQE recovery
[  200.171864] mmc2: running CQE recovery
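
To see how often this happens on each board, the messages can be counted per MMC host. The following is a minimal Python sketch, assuming the dmesg output has been saved as text in the format shown above; the function name `count_cqe_recoveries` and the `SAMPLE` constant are made up for illustration.

```python
import re
from collections import Counter

# Matches kernel log lines such as "[  199.947978] mmc2: running CQE recovery".
RECOVERY_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(mmc\d+): running CQE recovery")


def count_cqe_recoveries(dmesg_text):
    """Return a Counter mapping host name (e.g. 'mmc2') to recovery count."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = RECOVERY_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# The four lines from the log above, used as sample input.
SAMPLE = """\
[  199.947978] mmc2: running CQE recovery
[  199.953962] mmc2: running CQE recovery
[  199.959086] mmc2: running CQE recovery
[  200.171864] mmc2: running CQE recovery
"""

if __name__ == "__main__":
    for host, count in count_cqe_recoveries(SAMPLE).items():
        print(host, count)
```

Running the same function on logs from boards that do and do not show the message would give a rough per-board comparison.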

I checked the message in "drivers/mmc/core/core.c".

"mmc_cqe_recovery" means "Recover from CQE errors".

It doesn't happen on all of the boards.

Have you seen this message before?

Does it mean that something is wrong, or that there is a potential risk?
Or is this message normal?

BR,

Richard

Guedes · ‎12-02-2021

Are there any solutions for that? I am struggling with this issue happening at random in my system.

mb1 · ‎12-14-2021

You can disable the flag "Has eMMC command queue engine" from U-Boot by adding

'sdhci.debug_quirks=0x65168080'

to the kernel command line. This will eliminate the error.

This only seems to happen for eMMC versions >= 5.0.
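
To confirm that the parameter actually reached the kernel after changing the boot arguments, the command line can be inspected at runtime. This is a minimal Python sketch, assuming a Linux system where `/proc/cmdline` is readable; the helper names `parse_cmdline` and `read_debug_quirks` are hypothetical, and the parser does not handle quoted values containing spaces. It only checks that the value is present and does not verify what the value does.

```python
import os


def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; flags without '=' map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def read_debug_quirks(cmdline):
    """Return sdhci.debug_quirks as an int, or None if it is not set."""
    value = parse_cmdline(cmdline).get("sdhci.debug_quirks")
    # int(..., 0) accepts both hexadecimal ("0x...") and decimal strings.
    return int(value, 0) if value else None


if __name__ == "__main__":
    if os.path.exists("/proc/cmdline"):
        with open("/proc/cmdline") as f:
            quirks = read_debug_quirks(f.read())
        print("not set" if quirks is None else hex(quirks))
```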

tlugaric · ‎12-13-2021

Hi Guedes

I got some support from Toradex regarding this. Apparently the command queuing feature of the MMC subsystem in the 4.14 kernel from NXP is not working, so the workaround is to disable it in the sdhci-esdhc-imx driver using the attached patch.

As far as I know, they tried to reach out to NXP about fixing the driver but never got a response.

igorpadykov · ‎05-15-2020

Hi Richard

There are no error messages, so it seems it can be ignored.

It may be useful to look at the Linux mailing lists:

mmc: Add Command Queue support [LWN.net]

[V4,09/11] mmc: block: Add CQE support - Patchwork

A description of CQE can also be found by searching for the document linux_storage_system_analysis_emmc_command_queuing.pdf.

Best regards
igor

tlugaric · ‎02-17-2021

Hi Igor

I seem to have run into the same problem on the iMX8QM. The only difference is that, as well as "running CQE recovery", I am also getting a lot of errors. This occurs on multiple pieces of hardware whenever disk I/O usage is high (for example, ./stress -i 2).

Small extract from my dmesg:

[ 1220.796970] print_req_error: I/O error, dev mmcblk1, sector 41544440
[ 1220.796987] Aborting journal on device mmcblk1p4-8.
[ 1220.797033] mmc1: running CQE recovery
[ 1220.797152] mmc1: running CQE recovery
[ 1220.797269] mmc1: running CQE recovery
[ 1220.797339] print_req_error: I/O error, dev mmcblk1, sector 41538504
[ 1220.797347] Buffer I/O error on dev mmcblk1p4, logical block 633, lost sync page write
[ 1220.797363] JBD2: Error -5 detected when updating journal superblock for mmcblk1p4-8.

Complete dmesg output attached. After that, one or more partitions end up in read-only mode, and sometimes I get I/O errors that prevent me from using the system. The partitions cannot be remounted in RW mode. After a reboot the system works fine, but as soon as I run a stress test the errors come back. fsck finds no issues on the disk.
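
To tell the harmless-looking case (recovery messages only) apart from this one (recovery messages together with I/O errors), the log can be scanned for I/O errors that occur close in time to a recovery. This is a minimal Python sketch; the function name `io_errors_near_recovery` and the 1-second window are assumptions chosen for illustration, not something taken from the kernel.

```python
import re

# Splits a dmesg line into its timestamp and message text.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")


def io_errors_near_recovery(dmesg_text, window_s=1.0):
    """Return I/O error messages logged within window_s seconds of a CQE recovery."""
    recoveries = []
    errors = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        if "running CQE recovery" in message:
            recoveries.append(timestamp)
        elif "I/O error" in message:
            errors.append((timestamp, message))
    return [
        message
        for timestamp, message in errors
        if any(abs(timestamp - r) <= window_s for r in recoveries)
    ]


# The dmesg extract from this post, used as sample input.
SAMPLE = """\
[ 1220.796970] print_req_error: I/O error, dev mmcblk1, sector 41544440
[ 1220.796987] Aborting journal on device mmcblk1p4-8.
[ 1220.797033] mmc1: running CQE recovery
[ 1220.797152] mmc1: running CQE recovery
[ 1220.797269] mmc1: running CQE recovery
[ 1220.797339] print_req_error: I/O error, dev mmcblk1, sector 41538504
[ 1220.797347] Buffer I/O error on dev mmcblk1p4, logical block 633, lost sync page write
[ 1220.797363] JBD2: Error -5 detected when updating journal superblock for mmcblk1p4-8.
"""

if __name__ == "__main__":
    for message in io_errors_near_recovery(SAMPLE):
        print(message)
```

An empty result means the log contains no I/O errors near a recovery; a non-empty result matches the failure pattern described here.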

Andre123 · ‎12-05-2023

Hello, have you solved your problem? I had the same problem.

I hope to get your help.

thank you.

richard_hu · ‎05-17-2020

Thanks, igorpadykov!

