Hi NXP Teams
Recently we face a similar problem casued by CQHCI feature.
kernel version: 6.1.68
platform: S32G3
emmc model: FEMDME008G-A8A39
We used sysbench to perform pressure test on emmc and emmc keeped outputting dump information.
dump messages are below:
[ 232.419161] mmc0: cqhci: timeout for tag 4, qcnt 25
[ 232.423677] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 232.430083] mmc0: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 232.436501] mmc0: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 232.442924] mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 232.449346] mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 232.455799] mmc0: cqhci: TDL base: 0x823c3000 | TDL up32: 0x00000000
[ 232.462202] mmc0: cqhci: Doorbell: 0x81dfffff | TCN: 0x00000000
[ 232.468619] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00800000
[ 232.475067] mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 232.481469] mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000900
[ 232.487896] mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 232.494330] mmc0: cqhci: Resp idx: 0x0000000d | Resp arg: 0x00000000
[ 232.500739] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 232.507158] mmc0: sdhci: Sys addr: 0x9ca3e000 | Version: 0x00000002
[ 232.513578] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 232.520001] mmc0: sdhci: Argument: 0x00018000 | Trn mode: 0x00000023
[ 232.526424] mmc0: sdhci: Present: 0x01fd8008 | Host ctl: 0x00000030
[ 232.532847] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 232.539279] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 232.545694] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 232.552126] mmc0: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 232.558568] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[ 232.564973] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000b407
[ 232.571391] mmc0: sdhci: Cmd: 0x00000d1a | Max curr: 0x00ffffff
[ 232.577837] mmc0: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0xffffffff
[ 232.584242] mmc0: sdhci: Resp[2]: 0x329f5903 | Resp[3]: 0x00d0ffff
[ 232.590660] mmc0: sdhci: Host ctl2: 0x00000008
[ 232.595104] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xad464408
[ 232.601514] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 232.609066] mmc0: sdhci-esdhc-imx: cmd debug status: 0x2120
[ 232.614702] mmc0: sdhci-esdhc-imx: data debug status: 0x2200
[ 232.620431] mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
[ 232.626246] mmc0: sdhci-esdhc-imx: dma debug status: 0x2400
[ 232.631888] mmc0: sdhci-esdhc-imx: adma debug status: 0x2510
[ 232.637616] mmc0: sdhci-esdhc-imx: fifo debug status: 0x2680
[ 232.643345] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2750
[ 232.649595] mmc0: sdhci: ============================================
[ 232.656073] mmc0: running CQE recovery
[ 289.818104] mmc0: running CQE recovery
[ 423.303076] mmc0: cqhci: timeout for tag 31, qcnt 27
[ 423.307677] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 423.314085] mmc0: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 423.320504] mmc0: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 423.326927] mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 423.333381] mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 423.339786] mmc0: cqhci: TDL base: 0x823c3000 | TDL up32: 0x00000000
[ 423.346379] mmc0: cqhci: Doorbell: 0xff1fcfff | TCN: 0x00000000
[ 423.352628] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000400
[ 423.359047] mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 423.365465] mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000900
[ 423.371888] mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 423.378309] mmc0: cqhci: Resp idx: 0x0000000d | Resp arg: 0x00000000
[ 423.384736] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 423.391160] mmc0: sdhci: Sys addr: 0xa5b7b000 | Version: 0x00000002
[ 423.397583] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 423.404005] mmc0: sdhci: Argument: 0x00018000 | Trn mode: 0x00000023
[ 423.410426] mmc0: sdhci: Present: 0x01fd8008 | Host ctl: 0x00000030
[ 423.416849] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 423.423280] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 423.429696] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 423.436119] mmc0: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 423.442568] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[ 423.449210] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000b407
[ 423.455401] mmc0: sdhci: Cmd: 0x00000d1a | Max curr: 0x00ffffff
[ 423.461812] mmc0: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0xffffffff
[ 423.468236] mmc0: sdhci: Resp[2]: 0x329f5903 | Resp[3]: 0x00d0ffff
[ 423.474657] mmc0: sdhci: Host ctl2: 0x00000008
[ 423.479083] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xad467408
[ 423.485508] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 423.493061] mmc0: sdhci-esdhc-imx: cmd debug status: 0x2120
[ 423.498701] mmc0: sdhci-esdhc-imx: data debug status: 0x2200
[ 423.504429] mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
[ 423.510245] mmc0: sdhci-esdhc-imx: dma debug status: 0x2400
[ 423.515887] mmc0: sdhci-esdhc-imx: adma debug status: 0x2510
[ 423.521615] mmc0: sdhci-esdhc-imx: fifo debug status: 0x2680
[ 423.527344] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2750
[ 423.533593] mmc0: sdhci: ============================================
[ 423.540075] mmc0: running CQE recovery
[ 501.799092] mmc0: cqhci: timeout for tag 31, qcnt 24
[ 501.803698] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 501.810106] mmc0: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 501.816528] mmc0: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 501.822948] mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 501.829398] mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 501.835801] mmc0: cqhci: TDL base: 0x823c3000 | TDL up32: 0x00000000
[ 501.842221] mmc0: cqhci: Doorbell: 0xe01fffff | TCN: 0x00000000
[ 501.848666] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00008000
[ 501.855070] mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 501.861533] mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000900
[ 501.867916] mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 501.874333] mmc0: cqhci: Resp idx: 0x0000000d | Resp arg: 0x00000000
[ 501.880756] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 501.887180] mmc0: sdhci: Sys addr: 0xa5a15000 | Version: 0x00000002
[ 501.893600] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 501.900029] mmc0: sdhci: Argument: 0x00018000 | Trn mode: 0x00000023
[ 501.906450] mmc0: sdhci: Present: 0x01fd8008 | Host ctl: 0x00000030
[ 501.912875] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 501.919301] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 501.925717] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 501.932139] mmc0: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 501.938564] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[ 501.944994] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000b407
[ 501.951435] mmc0: sdhci: Cmd: 0x00000d1a | Max curr: 0x00ffffff
[ 501.957841] mmc0: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0xffffffff
[ 501.964284] mmc0: sdhci: Resp[2]: 0x329f5903 | Resp[3]: 0x00d0ffff
[ 501.970686] mmc0: sdhci: Host ctl2: 0x00000008
[ 501.975106] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xad463808
[ 501.981529] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 501.989083] mmc0: sdhci-esdhc-imx: cmd debug status: 0x2120
[ 501.994728] mmc0: sdhci-esdhc-imx: data debug status: 0x2200
[ 502.000451] mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
[ 502.006266] mmc0: sdhci-esdhc-imx: dma debug status: 0x2400
[ 502.011909] mmc0: sdhci-esdhc-imx: adma debug status: 0x2510
[ 502.017636] mmc0: sdhci-esdhc-imx: fifo debug status: 0x2680
[ 502.023365] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2750
[ 502.029650] mmc0: sdhci: ============================================
[ 502.036090] mmc0: running CQE recovery
[ 607.207068] mmc0: cqhci: timeout for tag 31, qcnt 21
[ 607.211666] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 607.218083] mmc0: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 607.224498] mmc0: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 607.230915] mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 607.237344] mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 607.243803] mmc0: cqhci: TDL base: 0x823c3000 | TDL up32: 0x00000000
[ 607.250194] mmc0: cqhci: Doorbell: 0x83ffc1fe | TCN: 0x00000000
[ 607.256635] mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000001
[ 607.263039] mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 607.269485] mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000900
[ 607.275915] mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 607.282328] mmc0: cqhci: Resp idx: 0x0000000d | Resp arg: 0x00000000
[ 607.288731] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 607.295180] mmc0: sdhci: Sys addr: 0x9cefe000 | Version: 0x00000002
[ 607.301577] mmc0: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000020
[ 607.308021] mmc0: sdhci: Argument: 0x00018000 | Trn mode: 0x00000023
[ 607.314425] mmc0: sdhci: Present: 0x01fd8008 | Host ctl: 0x00000030
[ 607.320843] mmc0: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 607.327263] mmc0: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 607.333725] mmc0: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 607.340117] mmc0: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 607.346575] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000502
[ 607.352962] mmc0: sdhci: Caps: 0x07eb0000 | Caps_1: 0x0000b407
[ 607.359405] mmc0: sdhci: Cmd: 0x00000d1a | Max curr: 0x00ffffff
[ 607.365818] mmc0: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0xffffffff
[ 607.372254] mmc0: sdhci: Resp[2]: 0x329f5903 | Resp[3]: 0x00d0ffff
[ 607.378692] mmc0: sdhci: Host ctl2: 0x00000008
[ 607.383103] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0xad463408
[ 607.389505] mmc0: sdhci-esdhc-imx: ========= ESDHC IMX DEBUG STATUS DUMP =========
[ 607.397083] mmc0: sdhci-esdhc-imx: cmd debug status: 0x2120
[ 607.402704] mmc0: sdhci-esdhc-imx: data debug status: 0x2200
[ 607.408454] mmc0: sdhci-esdhc-imx: trans debug status: 0x2300
[ 607.414245] mmc0: sdhci-esdhc-imx: dma debug status: 0x2400
[ 607.419880] mmc0: sdhci-esdhc-imx: adma debug status: 0x2510
[ 607.425607] mmc0: sdhci-esdhc-imx: fifo debug status: 0x2680
[ 607.431335] mmc0: sdhci-esdhc-imx: async fifo debug status: 0x2750
[ 607.437586] mmc0: sdhci: ============================================
[ 607.444076] mmc0: running CQE recovery
We noticed that all dumps occurred when CMD13 was sent and then emmc responsed timeout. We tried to use debugfs to find out reason and we noticed that only on the occasion that we execute “cat /sys/kernel/debug/mmc0/mmc0\:0001/status” to send a CMD13 to emmc which takes about 10 seconds to return and display result while in other cases we excute other command such as "cat /sys/kernel/debug/mmc0/mmc0\:0001/state" which returns immediatly.
Afterall, we closed CQHCI and solved the problem. It seems that CQHCI doesn't work well on emmc produced by specific manufacture. We would like to see new progress on this issue.
Best Regards,
Yan