Dear NXP,
We are using the i.MX8QXP processor on a custom board. This platform already works fine in a release based on your Linux kernel imx-5.4.70_2.3.0, and we have never seen this error there. However, after moving the platform to a release based on lf-5.15.71-2.2.0, we started running into reboot issues.
The i.MX8QXP platform throws the following unexpected kernel panic during the reboot sequence:
[ 24.478472] EXT4-fs (mmcblk0p3): re-mounted. Opts: (null). Quota mode: none.
[ 24.491917] systemd-shutdown[1]: All filesystems unmounted.
[ 24.497605] systemd-shutdown[1]: Deactivating swaps.
[ 24.502790] systemd-shutdown[1]: All swaps deactivated.
[ 24.508068] systemd-shutdown[1]: Detaching loop devices.
[ 24.518080] systemd-shutdown[1]: All loop devices detached.
[ 24.523705] systemd-shutdown[1]: Stopping MD devices.
[ 24.529172] systemd-shutdown[1]: All MD devices stopped.
[ 24.534522] systemd-shutdown[1]: Detaching DM devices.
[ 24.540005] systemd-shutdown[1]: All DM devices detached.
[ 24.545445] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
[ 24.663301] systemd-shutdown[1]: Syncing filesystems and block devices.
[ 24.670193] systemd-shutdown[1]: Rebooting.
[ 24.674413] kvm: exiting hardware virtualization
[ 24.700220] ci_hdrc ci_hdrc.0: remove, state 4
[ 24.704699] usb usb1: USB disconnect, device number 1
[ 24.709769] usb 1-1: USB disconnect, device number 2
[ 24.716052] ci_hdrc ci_hdrc.0: USB bus 1 deregistered
[ 29.920896] imx-scu scu: RPC send msg timeout
[ 35.040856] imx-scu scu: RPC send msg timeout
[ 35.045242] read temp sensor 355 failed, could be SS powered off, ret -110
[ 35.808882] 1v8_adc_vref: disabling
[ 40.160855] imx-scu scu: RPC send msg timeout
[ 45.280858] imx-scu scu: RPC send msg timeout
[ 45.285239] sdhc0: failed to power off resource 248 ret -110
[ 45.304882] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
[ 45.312372] Modules linked in:
[ 45.315432] CPU: 2 PID: 1 Comm: systemd-shutdow Not tainted 5.15.71-00026-gfc817b9c105a-dirty #146
[ 45.331284] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 45.338251] pc : esdhc_readl_le+0x10/0x190
[ 45.342359] lr : sdhci_send_command+0x500/0xe4c
[ 45.346903] sp : ffff800009adb810
[ 45.350220] x29: ffff800009adb810 x28: ffff000001515000 x27: 0000000000000020
[ 45.357369] x26: 0000000000200001 x25: 0000000000000000 x24: ffff000001515810
[ 45.364521] x23: 000000000000000b x22: ffff000000488000 x21: 0000000000000001
[ 45.371669] x20: ffff800009adbb00 x19: ffff000001515580 x18: 0000000000000030
[ 45.378819] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000000
[ 45.385969] x14: 0000000000000371 x13: 0000000000000001 x12: 0000000000000001
[ 45.393118] x11: 0000000000000000 x10: 00000000000009e0 x9 : ffff800009adb9a0
[ 45.400268] x8 : ffff000000488a40 x7 : 0000000000000000 x6 : 0000000000000001
[ 45.407418] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff000001515810
[ 45.414568] x2 : ffff000001515580 x1 : 0000000000000024 x0 : ffff80000b420024
[ 45.421721] Call trace:
[ 45.424171] esdhc_readl_le+0x10/0x190
[ 45.427930] sdhci_send_command_retry+0x40/0x130
[ 45.432551] sdhci_request+0x70/0xc4
[ 45.436130] __mmc_start_request+0x68/0x140
[ 45.440326] mmc_start_request+0x84/0xb0
[ 45.444253] mmc_wait_for_req+0x70/0x100
[ 45.448180] mmc_wait_for_cmd+0x68/0xa0
[ 45.452020] __mmc_switch+0x1f0/0x23c
[ 45.455686] mmc_switch+0x28/0x40
[ 45.459005] _mmc_flush_cache+0x54/0x80
[ 45.462844] _mmc_suspend+0x58/0x2ec
[ 45.466424] mmc_shutdown+0x30/0x60
[ 45.469916] mmc_bus_shutdown+0x40/0x80
[ 45.473765] device_shutdown+0x158/0x330
[ 45.477700] __do_sys_reboot+0x1f0/0x294
[ 45.481638] __arm64_sys_reboot+0x24/0x30
[ 45.485649] invoke_syscall+0x48/0x114
[ 45.489402] el0_svc_common.constprop.0+0x44/0xec
[ 45.494121] do_el0_svc+0x24/0x90
[ 45.497438] el0_svc+0x20/0x60
[ 45.500498] el0t_64_sync_handler+0xb0/0xb4
[ 45.504683] el0t_64_sync+0x1a0/0x1a4
[ 45.508358] Code: aa0003e2 d503233f f9400c00 8b21c000 (b9400000)
[ 45.514466] ---[ end trace f85a9a543cdbcdcc ]---
[ 45.519082] note: systemd-shutdow[1] exited with preempt_count 1
[ 45.525101] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 45.532772] Kernel Offset: disabled
[ 45.536254] CPU features: 0x00000001,20000846
[ 45.540617] Memory Limit: none
[ 45.543670] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
After debugging the issue, we found that if we disable the SET_RUNTIME_PM_OPS functions (sdhci_esdhc_runtime_suspend, sdhci_esdhc_runtime_resume) in the sdhci-esdhc-imx.c driver, we no longer see this kernel panic, but we get a different issue related to a timeout in the SCU:
[ 28.830478] EXT4-fs (mmcblk0p3): re-mounted. Opts: (null). Quota mode: none.
[ 28.843592] systemd-shutdown[1]: All filesystems unmounted.
[ 28.849997] systemd-shutdown[1]: Deactivating swaps.
[ 28.855255] systemd-shutdown[1]: All swaps deactivated.
[ 28.861312] systemd-shutdown[1]: Detaching loop devices.
[ 28.871291] systemd-shutdown[1]: All loop devices detached.
[ 28.876919] systemd-shutdown[1]: Stopping MD devices.
[ 28.882396] systemd-shutdown[1]: All MD devices stopped.
[ 28.887735] systemd-shutdown[1]: Detaching DM devices.
[ 28.893216] systemd-shutdown[1]: All DM devices detached.
[ 28.898649] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
[ 28.914598] systemd-shutdown[1]: Syncing filesystems and block devices.
[ 28.921514] systemd-shutdown[1]: Rebooting.
[ 28.925778] kvm: exiting hardware virtualization
[ 28.956844] ci_hdrc ci_hdrc.0: remove, state 4
[ 28.961331] usb usb1: USB disconnect, device number 1
[ 28.966399] usb 1-1: USB disconnect, device number 2
[ 28.972711] ci_hdrc ci_hdrc.0: USB bus 1 deregistered
[ 34.273499] imx-scu scu: RPC send msg timeout
[ 34.277906] imx8qxp-pinctrl scu:pinctrl: pin_config_set op failed for pin 9
[ 34.284975] sdhci-esdhc-imx 5b010000.mmc: Error applying setting, reverse things back
[ 35.809505] 1v8_adc_vref: disabling
[ 39.393470] imx-scu scu: RPC send msg timeout
[ 39.397849] read temp sensor 497 failed, could be SS powered off, ret -110
[ 44.513472] imx-scu scu: RPC send msg timeout
[ 49.633479] imx-scu scu: RPC send msg timeout
[ 49.637869] imx8qxp-pinctrl scu:pinctrl: pin_config_set op failed for pin 9
[ 49.644858] sdhci-esdhc-imx 5b010000.mmc: Error applying setting, reverse things back
[ 49.652713] sdhci-esdhc-imx 5b010000.mmc: failed to activate pinctrl state default
After several timeouts that delay the reboot process, the device does reboot, but only because the watchdog triggers.
We tested SCU firmware v1.11.0 and v1.15.0, and we get the same issue with both.
Q1: Could you help us to fix this issue?
According to our findings in the sdhci-esdhc-imx.c driver, which point to the RUNTIME_PM functions...
Q2: Is there a race condition in the shutdown process between the mmc clocks managed by the sdhci-esdhc-imx.c driver and the RUNTIME_PM functions?
Note that we do not initialize the M4 core, so we don't know why the SCU throws the error: "imx8qxp-pinctrl scu:pinctrl: pin_config_set op failed for pin 9"
Q3: Do you know why the SCU fails to set that pinctrl?
Thanks in advance,
Arturo
Hi @arturobuzarra,
Thank you for contacting NXP Support.
Usually, this error is related to an incorrect control of the pin.
Here I will attach a reference that could help you with this problem:
Solved: [iMX8QM-MEK] pin_config_set op faild for pin 9 - NXP Community
Unfortunately, it is difficult to help you with the debugging of your custom board, but I will try to do my best to help with any question.
Hi Brian,
Thanks for your reply, but unfortunately we are not using this pin for anything in the SCU.
This is our pinctrl configuration for usdhc1:
/* eMMC */
pinctrl_usdhc1: usdhc1grp {
    fsl,pins = <
        IMX8QXP_EMMC0_CLK_CONN_EMMC0_CLK 0x06000041
        IMX8QXP_EMMC0_CMD_CONN_EMMC0_CMD 0x00000021
        IMX8QXP_EMMC0_DATA0_CONN_EMMC0_DATA0 0x00000021
        IMX8QXP_EMMC0_DATA1_CONN_EMMC0_DATA1 0x00000021
        IMX8QXP_EMMC0_DATA2_CONN_EMMC0_DATA2 0x00000021
        IMX8QXP_EMMC0_DATA3_CONN_EMMC0_DATA3 0x00000021
        IMX8QXP_EMMC0_DATA4_CONN_EMMC0_DATA4 0x00000021
        IMX8QXP_EMMC0_DATA5_CONN_EMMC0_DATA5 0x00000021
        IMX8QXP_EMMC0_DATA6_CONN_EMMC0_DATA6 0x00000021
        IMX8QXP_EMMC0_DATA7_CONN_EMMC0_DATA7 0x00000021
        IMX8QXP_EMMC0_STROBE_CONN_EMMC0_STROBE 0x00000041
    >;
};
If I change the order of the pins, I get the same error but with a different pin ID, so it is not related to a specific pin (see src/scfw_export_mx8qx/platform/config/mx8qx/pads.h):
#define SC_P_EMMC0_CLK 9U /*!< CONN.EMMC0.CLK, CONN.NAND.READY_B, LSIO.GPIO4.IO07 */
#define SC_P_EMMC0_CMD 10U /*!< CONN.EMMC0.CMD, CONN.NAND.DQS, LSIO.GPIO4.IO08 */
#define SC_P_EMMC0_DATA0 11U /*!< CONN.EMMC0.DATA0, CONN.NAND.DATA00, LSIO.GPIO4.IO09 */
#define SC_P_EMMC0_DATA1 12U /*!< CONN.EMMC0.DATA1, CONN.NAND.DATA01, LSIO.GPIO4.IO10 */
#define SC_P_EMMC0_DATA2 13U /*!< CONN.EMMC0.DATA2, CONN.NAND.DATA02, LSIO.GPIO4.IO11 */
#define SC_P_EMMC0_DATA3 14U /*!< CONN.EMMC0.DATA3, CONN.NAND.DATA03, LSIO.GPIO4.IO12 */
#define SC_P_COMP_CTL_GPIO_1V8_3V3_SD1FIX0 15U /*!< */
#define SC_P_EMMC0_DATA4 16U /*!< CONN.EMMC0.DATA4, CONN.NAND.DATA04, CONN.EMMC0.WP, LSIO.GPIO4.IO13 */
#define SC_P_EMMC0_DATA5 17U /*!< CONN.EMMC0.DATA5, CONN.NAND.DATA05, CONN.EMMC0.VSELECT, LSIO.GPIO4.IO14 */
#define SC_P_EMMC0_DATA6 18U /*!< CONN.EMMC0.DATA6, CONN.NAND.DATA06, CONN.MLB.CLK, LSIO.GPIO4.IO15 */
#define SC_P_EMMC0_DATA7 19U /*!< CONN.EMMC0.DATA7, CONN.NAND.DATA07, CONN.MLB.SIG, LSIO.GPIO4.IO16 */
#define SC_P_EMMC0_STROBE 20U /*!< CONN.EMMC0.STROBE, CONN.NAND.CLE, CONN.MLB.DATA, LSIO.GPIO4.IO17 */
#define SC_P_EMMC0_RESET_B 21U /*!< CONN.EMMC0.RESET_B, CONN.NAND.WP_B, LSIO.GPIO4.IO18 */
In SCU FW we have the following board definition, where we don't reserve any pads for M4:
/*--------------------------------------------------------------------------*/
/* Configure the system (inc. additional resource partitions) */
/*--------------------------------------------------------------------------*/
void board_system_config(sc_bool_t early, sc_rm_pt_t pt_boot)
{
    sc_err_t err = SC_ERR_NONE;
    /* This function configures the system. It usually partitions
       resources according to the system design. It must be modified by
       customers. Partitions should then be specified using the mkimage
       -p option. */
    /* Note the configuration here is for NXP test purposes */
    sc_bool_t alt_config = SC_FALSE;
    sc_bool_t no_ap = SC_FALSE;
    sc_bool_t ddrtest = SC_FALSE;
    /* Get boot parameters. See the Boot Flags section for definition
       of these flags.*/
    (void) boot_get_data(NULL, NULL, NULL, NULL, NULL, NULL, &alt_config,
        NULL, &ddrtest, &no_ap, NULL);
    board_print(3, "board_system_config(%d, %d)\n", early, alt_config);
#if !defined(EMUL)
    if (ddrtest == SC_FALSE)
    {
        uint64_t ram = hwid_get_ramsize(cc8x_ramid);
        if (ram == 0) {
            /* if RAM size was not coded, use variant to obtain RAM size */
            if (cc8x_variant < ARRAY_SIZE(ccimx8x_variants_ram))
                ram = ccimx8x_variants_ram[cc8x_variant];
        }
        sc_rm_mr_t mr_temp;
        if (ram < SC_2GB) {
            /* Board has less than 2GB so fragment lower region and delete */
            BRD_ERR(rm_memreg_frag(pt_boot, &mr_temp, DDR_BASE0 + ram,
                DDR_BASE0_END));
            BRD_ERR(rm_memreg_free(pt_boot, mr_temp));
        }
        if (ram <= SC_2GB) {
            /* Board has 2GB memory or less so delete upper memory region */
            BRD_ERR(rm_find_memreg(pt_boot, &mr_temp, DDR_BASE1, DDR_BASE1));
            BRD_ERR(rm_memreg_free(pt_boot, mr_temp));
        }
        else {
            /* Fragment upper region and delete */
            BRD_ERR(rm_memreg_frag(pt_boot, &mr_temp, DDR_BASE1 + ram
                - SC_2GB, DDR_BASE1_END));
            BRD_ERR(rm_memreg_free(pt_boot, mr_temp));
        }
    }
#endif
    /* Name default partitions */
    PARTITION_NAME(SC_PT, "SCU");
    PARTITION_NAME(SECO_PT, "SECO");
    PARTITION_NAME(pt_boot, "BOOT");
}
So I don't have these pins defined on any node other than usdhc1.
Could you provide us with any guidance?
Thanks,
Arturo.