Hello, NXP support team:
The platform is the "i.MX8MM".
When we write data to eMMC, there is a message "mmc2: running CQE recovery" from the kernel.
For example:
"dd" random data to eMMC
dd if=/dev/urandom of=/sd_test bs=1M count=500
[ 199.947978] mmc2: running CQE recovery
[ 199.953962] mmc2: running CQE recovery
[ 199.959086] mmc2: running CQE recovery
[ 200.171864] mmc2: running CQE recovery
I checked the message in "drivers/mmc/core/core.c":
"mmc_cqe_recovery" means "Recover from CQE errors".
It doesn't happen on all of the boards.
Have you seen this message before?
Does it indicate that something is wrong, or a potential risk?
Or is this message normal?
BR,
Richard
You can disable the "Has eMMC command queue engine" flag by adding
'sdhci.debug_quirks=0x65168080'
to the kernel command line (e.g. from U-Boot). This will eliminate the error.
This only seems to happen for eMMC versions >= 5.0.
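As a rough sketch of how to try this (assuming your board lets you edit bootargs directly at the U-Boot prompt; many i.MX boot scripts rebuild bootargs from other variables, so adjust accordingly):

# At the U-Boot prompt; if your boot script rebuilds bootargs, add the
# parameter to whichever variable the script uses instead:
=> setenv bootargs "${bootargs} sdhci.debug_quirks=0x65168080"
=> boot
# After Linux is up, confirm the parameter is actually on the command line:
grep debug_quirks /proc/cmdline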
Hi Guedes
I got some support from Toradex regarding this. Apparently the command queuing feature of the MMC subsystem in NXP's 4.14 kernel is not working, so the workaround is to disable it in sdhci-esdhc-imx using the attached patch.
As far as I am informed, they tried to reach out to NXP about fixing the driver, but never managed to get a response.
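For reference, such a patch usually just stops the driver from advertising CQE support, so the MMC core falls back to the legacy (non-command-queue) request path. A minimal sketch of the idea, using the mainline driver's names (usdhc_imx8qxp_data, ESDHC_FLAG_CQHCI); the NXP 4.14 tree and the actual Toradex patch may differ:

/* drivers/mmc/host/sdhci-esdhc-imx.c (illustrative sketch, not the Toradex
 * patch): drop ESDHC_FLAG_CQHCI from the i.MX8 SoC data so the probe code
 * never sets MMC_CAP2_CQE / MMC_CAP2_CQE_DCMD and never initialises CQHCI. */
static struct esdhc_soc_data usdhc_imx8qxp_data = {
	.flags = ESDHC_FLAG_USDHC | ESDHC_FLAG_STD_TUNING
			| ESDHC_FLAG_HAVE_CAP1 | ESDHC_FLAG_HS400
			| ESDHC_FLAG_HS400_ES,
			/* ESDHC_FLAG_CQHCI removed here */
};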
Hi Richard
There are no error messages, so it seems this can be ignored.
It may be useful to look at the Linux mailing lists:
mmc: Add Command Queue support [LWN.net]
[V4,09/11] mmc: block: Add CQE support - Patchwork
A description of the CQE can also be found by searching for the
linux_storage_system_analysis_emmc_command_queuing.pdf document.
Best regards
igor
Hi Igor
I seem to have run into the same problem on the i.MX8QM. The only difference is that, as well as "running CQE recovery", I am also getting a lot of I/O errors. This occurs on multiple pieces of hardware whenever disk I/O usage gets higher, for example when running ./stress -i 2.
Small extract from my dmesg:
[ 1220.796970] print_req_error: I/O error, dev mmcblk1, sector 41544440
[ 1220.796987] Aborting journal on device mmcblk1p4-8.
[ 1220.797033] mmc1: running CQE recovery
[ 1220.797152] mmc1: running CQE recovery
[ 1220.797269] mmc1: running CQE recovery
[ 1220.797339] print_req_error: I/O error, dev mmcblk1, sector 41538504
[ 1220.797347] Buffer I/O error on dev mmcblk1p4, logical block 633, lost sync page write
[ 1220.797363] JBD2: Error -5 detected when updating journal superblock for mmcblk1p4-8.
The complete dmesg output is attached. After that, I have one or more partitions in read-only mode, and sometimes I get I/O errors that prevent me from using the system. The partitions cannot be remounted back to RW mode. After a reboot the system works fine, but as soon as I run a stress test the errors come back. fsck finds no issues on the disk.
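In case it helps with reproducing this, the way I trigger and watch it is roughly the following (the stress invocation is the one above; the grep patterns are just examples based on my log):

# Generate I/O load and follow the kernel log for the relevant messages:
./stress -i 2 &
dmesg -w | grep -E "CQE recovery|I/O error|Aborting journal"
# Check whether a partition has been forced read-only:
grep mmcblk1p4 /proc/mounts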