Hi NXP Teams
Recently we face a problem casued by CQHCI feature.
kernel version: 6.1.68
platform: S32G3
emmc model: FEMDME008G-A8A39
We used sysbench to perform pressure test on emmc and emmc keeped outputting dump information.
dump messages are below:
[ 232.419161] mmc0: cqhci: timeout for tag 4, qcnt 25
[ 232.423677] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 232.430083] mmc0: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 232.436501] mmc0: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 232.442924] mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 232.449346] mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 232.455799] mmc0: cqhci: TDL base: 0x823c3000 | TDL up32: 0x00000000
[ 232.462202] mmc0: cqhci: Doorbell: 0x81dfffff | TCN: 0x00000000
[ 232.468619] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00800000
[ 232.475067] mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 232.481469] mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000900
[ 232.487896] mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 232.494330] mmc0: cqhci: Resp idx: 0x0000000d | Resp arg: 0x00000000
[ 232.500739] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 232.507158] mmc0: sdhci: Sys addr: 0x9ca3e000 | Version: 0x00000002
[ 232.513578] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 232.520001] mmc0: sdhci: Argument: 0x00018000 | Trn mode: 0x00000023
[ 232.526424] mmc0: sdhci: Present: 0x01fd8008 | Host ctl: 0x00000030
[ 232.532847] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 232.539279] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 232.545694] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 232.552126] mmc0: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 232.558568] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[ 232.564973] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000b407
[ 232.571391] mmc0: sdhci: Cmd: 0x00000d1a | Max curr: 0x00ffffff
[ 232.577837] mmc0: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0xffffffff
[ 232.584242] mmc0: sdhci: Resp[2]: 0x329f5903 | Resp[3]: 0x00d0ffff
[ 232.590660] mmc0: sdhci: Host ctl2: 0x00000008
[ 232.595104] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xad464408
[ 232.601514] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 232.609066] mmc0: sdhci-esdhc-imx: cmd debug status: 0x2120
[ 232.614702] mmc0: sdhci-esdhc-imx: data debug status: 0x2200
[ 232.620431] mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
[ 232.626246] mmc0: sdhci-esdhc-imx: dma debug status: 0x2400
[ 232.631888] mmc0: sdhci-esdhc-imx: adma debug status: 0x2510
[ 232.637616] mmc0: sdhci-esdhc-imx: fifo debug status: 0x2680
[ 232.643345] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2750
[ 232.649595] mmc0: sdhci: ============================================
[ 232.656073] mmc0: running CQE recovery
[ 289.818104] mmc0: running CQE recovery
[ 423.303076] mmc0: cqhci: timeout for tag 31, qcnt 27
[ 423.307677] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 423.314085] mmc0: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 423.320504] mmc0: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 423.326927] mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 423.333381] mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 423.339786] mmc0: cqhci: TDL base: 0x823c3000 | TDL up32: 0x00000000
[ 423.346379] mmc0: cqhci: Doorbell: 0xff1fcfff | TCN: 0x00000000
[ 423.352628] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000400
[ 423.359047] mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 423.365465] mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000900
[ 423.371888] mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 423.378309] mmc0: cqhci: Resp idx: 0x0000000d | Resp arg: 0x00000000
[ 423.384736] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 423.391160] mmc0: sdhci: Sys addr: 0xa5b7b000 | Version: 0x00000002
[ 423.397583] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 423.404005] mmc0: sdhci: Argument: 0x00018000 | Trn mode: 0x00000023
[ 423.410426] mmc0: sdhci: Present: 0x01fd8008 | Host ctl: 0x00000030
[ 423.416849] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 423.423280] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 423.429696] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 423.436119] mmc0: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 423.442568] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[ 423.449210] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000b407
[ 423.455401] mmc0: sdhci: Cmd: 0x00000d1a | Max curr: 0x00ffffff
[ 423.461812] mmc0: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0xffffffff
[ 423.468236] mmc0: sdhci: Resp[2]: 0x329f5903 | Resp[3]: 0x00d0ffff
[ 423.474657] mmc0: sdhci: Host ctl2: 0x00000008
[ 423.479083] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xad467408
[ 423.485508] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 423.493061] mmc0: sdhci-esdhc-imx: cmd debug status: 0x2120
[ 423.498701] mmc0: sdhci-esdhc-imx: data debug status: 0x2200
[ 423.504429] mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
[ 423.510245] mmc0: sdhci-esdhc-imx: dma debug status: 0x2400
[ 423.515887] mmc0: sdhci-esdhc-imx: adma debug status: 0x2510
[ 423.521615] mmc0: sdhci-esdhc-imx: fifo debug status: 0x2680
[ 423.527344] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2750
[ 423.533593] mmc0: sdhci: ============================================
[ 423.540075] mmc0: running CQE recovery
[ 501.799092] mmc0: cqhci: timeout for tag 31, qcnt 24
[ 501.803698] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 501.810106] mmc0: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 501.816528] mmc0: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 501.822948] mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 501.829398] mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 501.835801] mmc0: cqhci: TDL base: 0x823c3000 | TDL up32: 0x00000000
[ 501.842221] mmc0: cqhci: Doorbell: 0xe01fffff | TCN: 0x00000000
[ 501.848666] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00008000
[ 501.855070] mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 501.861533] mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000900
[ 501.867916] mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 501.874333] mmc0: cqhci: Resp idx: 0x0000000d | Resp arg: 0x00000000
[ 501.880756] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 501.887180] mmc0: sdhci: Sys addr: 0xa5a15000 | Version: 0x00000002
[ 501.893600] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 501.900029] mmc0: sdhci: Argument: 0x00018000 | Trn mode: 0x00000023
[ 501.906450] mmc0: sdhci: Present: 0x01fd8008 | Host ctl: 0x00000030
[ 501.912875] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 501.919301] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 501.925717] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 501.932139] mmc0: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 501.938564] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[ 501.944994] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000b407
[ 501.951435] mmc0: sdhci: Cmd: 0x00000d1a | Max curr: 0x00ffffff
[ 501.957841] mmc0: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0xffffffff
[ 501.964284] mmc0: sdhci: Resp[2]: 0x329f5903 | Resp[3]: 0x00d0ffff
[ 501.970686] mmc0: sdhci: Host ctl2: 0x00000008
[ 501.975106] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xad463808
[ 501.981529] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 501.989083] mmc0: sdhci-esdhc-imx: cmd debug status: 0x2120
[ 501.994728] mmc0: sdhci-esdhc-imx: data debug status: 0x2200
[ 502.000451] mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
[ 502.006266] mmc0: sdhci-esdhc-imx: dma debug status: 0x2400
[ 502.011909] mmc0: sdhci-esdhc-imx: adma debug status: 0x2510
[ 502.017636] mmc0: sdhci-esdhc-imx: fifo debug status: 0x2680
[ 502.023365] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2750
[ 502.029650] mmc0: sdhci: ============================================
[ 502.036090] mmc0: running CQE recovery
[ 607.207068] mmc0: cqhci: timeout for tag 31, qcnt 21
[ 607.211666] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 607.218083] mmc0: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 607.224498] mmc0: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 607.230915] mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 607.237344] mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 607.243803] mmc0: cqhci: TDL base: 0x823c3000 | TDL up32: 0x00000000
[ 607.250194] mmc0: cqhci: Doorbell: 0x83ffc1fe | TCN: 0x00000000
[ 607.256635] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000001
[ 607.263039] mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 607.269485] mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000900
[ 607.275915] mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 607.282328] mmc0: cqhci: Resp idx: 0x0000000d | Resp arg: 0x00000000
[ 607.288731] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 607.295180] mmc0: sdhci: Sys addr: 0x9cefe000 | Version: 0x00000002
[ 607.301577] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 607.308021] mmc0: sdhci: Argument: 0x00018000 | Trn mode: 0x00000023
[ 607.314425] mmc0: sdhci: Present: 0x01fd8008 | Host ctl: 0x00000030
[ 607.320843] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 607.327263] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 607.333725] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 607.340117] mmc0: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 607.346575] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[ 607.352962] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000b407
[ 607.359405] mmc0: sdhci: Cmd: 0x00000d1a | Max curr: 0x00ffffff
[ 607.365818] mmc0: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0xffffffff
[ 607.372254] mmc0: sdhci: Resp[2]: 0x329f5903 | Resp[3]: 0x00d0ffff
[ 607.378692] mmc0: sdhci: Host ctl2: 0x00000008
[ 607.383103] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xad463408
[ 607.389505] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 607.397083] mmc0: sdhci-esdhc-imx: cmd debug status: 0x2120
[ 607.402704] mmc0: sdhci-esdhc-imx: data debug status: 0x2200
[ 607.408454] mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
[ 607.414245] mmc0: sdhci-esdhc-imx: dma debug status: 0x2400
[ 607.419880] mmc0: sdhci-esdhc-imx: adma debug status: 0x2510
[ 607.425607] mmc0: sdhci-esdhc-imx: fifo debug status: 0x2680
[ 607.431335] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2750
[ 607.437586] mmc0: sdhci: ============================================
[ 607.444076] mmc0: running CQE recovery
We noticed that all dumps occurred when CMD13 was sent and then emmc responsed timeout. We tried to use debugfs to find out reason and we noticed that only on the occasion that we execute “cat /sys/kernel/debug/mmc0/mmc0\:0001/status” to send a CMD13 to emmc which takes about 10 seconds to return and display result while in other cases we excute other command such as "cat /sys/kernel/debug/mmc0/mmc0\:0001/state" which returns immediatly.
Afterall, we closed CQHCI and solved the problem. It seems that CQHCI doesn't work well on emmc produced by specific manufacture. We would like to see new progress on this issue.
Best Regards,
Yan
Solved! Go to Solution.
maybe this patch fixed the problem:
https://github.com/nxp-auto-linux/linux/commit/185b9d9a592ca5eb97e67f9d43616d2c3de34ec8
but CQHCI is not stable, NXP suggest to close it. looking follow patch:
https://github.com/nxp-auto-linux/linux/commit/d856f42b65e5a976a49adfd9e95d1d81a27a9ca9
Hello, @yan4
I did not reproduce the issue on local RDB, but once met such issues on other NXP platforms. I suggest disabling the CQHCI temporally.
Thanks for @jun13_chen's inputs, may close the feature with the patch mentioned.
BR
Chenyin
Hello, @yan4
Thanks a lot for your valuable post, I could understand the status, and would check if it existed on refernece boards.
Thanks
BR
Chenyin
Hello,chenyin_h
Have you confirmed whether this problem exits on board previously mentioned or not? If you need some extra information, I would be glad to provide.
Best Regards
Yan
Hello, @yan4
Thanks for your post.
I noticed that the 6.1.68 kernel is used during your test, which seems not a formally supported version from BSP perspective.
May I know if you also met the issue while using the BSP(, for example, BSP41 with 5.15.x or 6.6.x kenel)?
BR
Chenyin
maybe this patch fixed the problem:
https://github.com/nxp-auto-linux/linux/commit/185b9d9a592ca5eb97e67f9d43616d2c3de34ec8
but CQHCI is not stable, NXP suggest to close it. looking follow patch:
https://github.com/nxp-auto-linux/linux/commit/d856f42b65e5a976a49adfd9e95d1d81a27a9ca9
Hello,jun13_chen
This patch can indeed solve the problem. Thanks a lot.