Query regarding S32G3 - BSP 34 kernel crash

kishanmishra · 12-06-2023

Information:

This concerns the QSPI NOR configuration on S32G399 hardware with BSP 34. Read/write operations from the boot-loader succeed. However, when the QSPI NOR is accessed from Linux through the 'mtd' driver, the kernel crashes. The test case is detailed below:

The observations were made with the 'dd' and 'hexdump' commands.
The 'hexdump' command always crashes the kernel.
The 'dd' command crashes the kernel when a large block size is used for reading, for example bs=1M.
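
To narrow down which offset and read size trigger the crash, the device can also be read in fixed-size chunks while logging each offset before the read is issued. Below is a minimal sketch in Python, assuming Python 3 is available on the target; `probe_reads` is a hypothetical helper name, not part of any BSP tool. Because the kernel abort kills the reading process, the last offset printed is the one that was being accessed.

```python
import os
import sys


def probe_reads(path, chunk_size=4096, limit=None, log=sys.stderr):
    """Read `path` sequentially in pieces of at most `chunk_size` bytes.

    The offset is logged (and flushed) *before* each read, so that if the
    kernel aborts inside the read, the last line printed names the offset
    that was being accessed. Returns the total number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while limit is None or total < limit:
            want = chunk_size if limit is None else min(chunk_size, limit - total)
            print(f"reading {want} bytes at offset 0x{total:08x}", file=log, flush=True)
            data = os.read(fd, want)
            if not data:  # end of device reached
                break
            total += len(data)
    finally:
        os.close(fd)
    return total
```

Calling `probe_reads("/dev/mtd7", 4096)` and then repeating with larger chunk sizes (for example `1024 * 1024`) would show whether the failure depends on the request size, on the offset, or on both.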

Query:

Is this an erratum or a known issue with BSP 34?
1. Does it also apply to the EVAL boards?
Please let me know whether the test case described above points to a kernel issue.
I would also like to hear the community's experience with such cases.

The kernel crash log is provided below for reference.

hexdump -Cv /dev/mtd7 > dumped.bin
[   18.490038]
[   18.490038] op->cmd.buswidth = 1
[   18.490059] op->addr.buswidth = 1
[   18.490062] op->dummy.buswidth = 1
[   18.490064] op->data.buswidth = 1
[   18.490070]
[   18.490070] op->cmd.buswidth = 1
[   18.490072] op->addr.buswidth = 1
[   18.490074] op->dummy.buswidth = 1
[   18.490075] op->data.buswidth = 1
[   18.490083] allocated size = 67108864, byte read = 4096
[   18.493157] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
[   18.493163] printk: console [ttyLF0]: printing thread stopped
[   18.536674] Modules linked in: llce_can llce_mailbox pfeng(O) llce_core fir_pci(O) can_isotp(O)
[   18.551257] CPU: 0 PID: 313 Comm: hexdump Tainted: G           O      5.10.120-rt70+g0b76731696c1 #1
[   18.560456] Hardware name: s32g3-zzz-zzzzzzzz (DT)
[   18.565317] pstate: 20000005 (nzCv daif -PAN -UAO -TCO BTYPE=--)
[   18.571392] pc : __memcpy_fromio+0x48/0xa0
[   18.575559] lr : s32gen1_exec_op+0x37c/0x3b0
[   18.579899] sp : ffffffc013e63740
[   18.583284] x29: ffffffc013e63740 x28: ffffffc013e63e30
[   18.588665] x27: 0000000000000005 x26: 0000000000000000
[   18.594047] x25: ffffffc013e63b30 x24: ffffff88034ed140
[   18.599428] x23: 00000000030f004c x22: 0000000000000000
[   18.604810] x21: ffffffc011025000 x20: ffffff8800dbc600
[   18.610192] x19: ffffff8800db8600 x18: 0000000000000020
[   18.615573] x17: 0000000000000000 x16: 0000000000000000
[   18.620955] x15: ffffff88034ed5a8 x14: ffffffffffffffff
[   18.626336] x13: ffffffc010d42536 x12: ffffffc010d4252f
[   18.631719] x11: ffffffc010ca85e8 x10: ffffffc010cb5fe8
[   18.637100] x9 : ffffffc013e63740 x8 : 000000000814b189
[   18.642482] x7 : 0024f47300000000 x6 : 0000000000001000
[   18.647863] x5 : 0000000030c8b7b8 x4 : ffffffc017800000
[   18.653244] x3 : ffffff8804585000 x2 : 0000000000001000
[   18.658626] x1 : ffffffc017800000 x0 : ffffff8804584000
[   18.664009] Call trace:
[   18.666526]  __memcpy_fromio+0x48/0xa0
[   18.670345]  spi_mem_exec_op+0x394/0x3f0
[   18.674338]  spi_mem_dirmap_read+0x14c/0x1a0
[   18.678677]  spi_nor_spimem_read_data+0xbc/0x144
[   18.683365]  spi_nor_read+0xdc/0x174
[   18.687010]  mtd_read_oob_std+0x78/0x84
[   18.690917]  mtd_read_oob+0x7c/0x134
[   18.694562]  mtd_read+0x48/0x7c
[   18.697773]  mtdchar_read+0xcc/0x290
[   18.701418]  vfs_read+0xac/0x1a0
[   18.704717]  ksys_read+0x6c/0xfc
[   18.708015]  __arm64_sys_read+0x20/0x30
[   18.711921]  el0_svc_common.constprop.0+0x78/0x1c4
[   18.716783]  do_el0_svc+0x24/0x8c
[   18.720168]  el0_svc+0x14/0x20
[   18.723293]  el0_sync_handler+0x1a4/0x1b0
[   18.727372]  el0_sync+0x180/0x1c0
[   18.730763] Code: aa0103e4 927df0c6 910020c6 8b060003 (f9400085)

Broadcast message from systemd-journald@s32g3-zzz-zzzzzzzz (Wed 2020-12-16 18:25:30 CET):
 
kernel[265]: [   18.493157] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
 
[   18.736924] ---[ end trace 6ea34f12c4d59fc2 ]---
[   19.737073] printk: enabled sync mode
Segmentation fault
root@s32g3-zzz-zzzzzzzz:~#
Broadcast message from systemd-journald@s32g3-zzz-zzzzzzzz (Wed 2020-12-16 18:25:31 CET):
 
kernel[265]: [   18.730763] Code: aa0103e4 927df0c6 910020c6 8b060003 (f9400085)
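
For readers of the log: the value 96000210 printed after "synchronous external abort" is the ESR_ELx syndrome register. The following is a minimal Python sketch that splits it into fields, based on the Armv8-A ESR layout; `decode_esr` and the field names are chosen here for illustration only.

```python
def decode_esr(esr):
    """Split an AArch64 ESR_ELx value into the fields relevant to a data abort."""
    iss = esr & 0x1FFFFFF            # instruction-specific syndrome, bits [24:0]
    return {
        "ec": (esr >> 26) & 0x3F,    # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,     # instruction length bit
        "ea": (iss >> 9) & 0x1,      # external abort type (implementation defined)
        "wnr": (iss >> 6) & 0x1,     # 0 = fault on a read, 1 = fault on a write
        "dfsc": iss & 0x3F,          # data fault status code, bits [5:0]
    }


fields = decode_esr(0x96000210)
print({name: hex(value) for name, value in fields.items()})
```

For 0x96000210 this gives EC 0x25 (data abort taken without a change of exception level), DFSC 0x10 (synchronous external abort, not on a translation table walk) and WnR 0 (a read). So the abort is a bus error reported on a load, which is consistent with the fault being raised inside __memcpy_fromio.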

eldorr

Hello!

I have a custom i.MX9-based board with an mt25ql02g QSPI NOR flash (256 MByte).
I see unstable SPI NOR flash reads that end in a segfault every time (very similar to what you describe), for example:

[root@imx93evk /mnt/swbank]# flashcp -A ./mxImage /dev/mtd3

<..... erase and write always work, but then during "verify" ...>

[ 1156.397511] Internal error: synchronous external abort: 0000000096000010 [#1] PREEMPT SMP
[ 1156.405704] Modules linked in:
[ 1156.408751] CPU: 1 PID: 760 Comm: flashcp Not tainted 6.1.55-g2cb8508dd2db #6
[ 1156.415872] Hardware name: NXP i.MX93 11X11 EVK board (DT)
[ 1156.421342] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1156.428291] pc : __memcpy_fromio+0x58/0xb0
[ 1156.432390] lr : nxp_fspi_exec_op+0x89c/0xe50
[ 1156.436741] sp : ffff80000aa8b670
[ 1156.440043] x29: ffff80000aa8b6b0 x28: 0000aaaade35f2d0 x27: ffff80000aa8bdc0
[ 1156.447167] x26: ffff00000c1d49c0 x25: ffff0000084dd880 x24: ffff0000084dd580
[ 1156.454291] x23: ffff0000084dd000 x22: 0000000008000000 x21: ffff0000084dd5e8
[ 1156.461415] x20: 0000000000000800 x19: ffff80000aa8ba10 x18: 0000000000000000
[ 1156.468539] x17: 0000000000000000 x16: 00000000000001a0 x15: 0000000000000000
[ 1156.475663] x14: 0000000000000000 x13: 0000000000000002 x12: ffff000027e019a0
[ 1156.482787] x11: 0000000000000004 x10: 000000000000001d x9 : 00000000ffffff00
[ 1156.489911] x8 : 000000000000320a x7 : 0000000000000001 x6 : 0000000000000800
[ 1156.497035] x5 : 00ffffffffffffff x4 : ffff80000cdf0000 x3 : ffff000008870800
[ 1156.504159] x2 : 0000000000000800 x1 : ffff80000cdf0000 x0 : ffff000008870000
[ 1156.511284] Call trace:
[ 1156.513720] __memcpy_fromio+0x58/0xb0
[ 1156.517462] spi_mem_exec_op+0x39c/0x3f0
[ 1156.521380] spi_mem_no_dirmap_read+0xa0/0xc0
[ 1156.525730] spi_mem_dirmap_read+0xd4/0x140
[ 1156.529908] spi_nor_read_data+0x114/0x180
[ 1156.533990] spi_nor_read+0xb4/0x160
[ 1156.537552] mtd_read_oob_std+0x78/0x90
[ 1156.541382] mtd_read_oob+0x8c/0x150
[ 1156.544944] mtd_read+0x68/0xb0
[ 1156.548073] mtdchar_read+0x224/0x2a0
[ 1156.551730] vfs_read+0xc4/0x2c0
[ 1156.554954] ksys_read+0x74/0x110
[ 1156.558256] __arm64_sys_read+0x1c/0x30
[ 1156.562078] invoke_syscall+0x48/0x110
[ 1156.565823] el0_svc_common.constprop.0+0x44/0xf0
[ 1156.570520] do_el0_svc+0x2c/0xd0
[ 1156.573822] el0_svc+0x2c/0x90
[ 1156.576872] el0t_64_sync_handler+0x114/0x120
[ 1156.581214] el0t_64_sync+0x18c/0x190
[ 1156.584868] Code: 927df0c6 910020c6 8b060003 d503201f (f9400085)
[ 1156.590950] ---[ end trace 0000000000000000 ]---
Segmentation fault
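
In the "Code:" line of the oops, the word in parentheses is the instruction at the faulting pc. As a cross-check, here is a small Python sketch that decodes it, assuming it is an A64 "LDR Xt, [Xn, #imm]" (64-bit, unsigned offset) encoding; `decode_ldr_unsigned_offset` is a hypothetical helper written for this post.

```python
def decode_ldr_unsigned_offset(insn):
    """Decode an A64 'LDR Xt, [Xn, #imm]' (64-bit, unsigned offset) word.

    Returns (rt, rn, byte_offset), or None if the word is not that encoding.
    """
    # Bits [31:22] are fixed to 0b1111100101 for this encoding.
    if (insn & 0xFFC00000) != 0xF9400000:
        return None
    imm12 = (insn >> 10) & 0xFFF
    rn = (insn >> 5) & 0x1F   # base register
    rt = insn & 0x1F          # destination register
    return rt, rn, imm12 * 8  # imm12 is scaled by the 8-byte access size


print(decode_ldr_unsigned_offset(0xF9400085))
```

The result is (5, 4, 0), i.e. `ldr x5, [x4]`: an 8-byte load from the address in x4. In the register dump x4 (ffff80000cdf0000) equals x1, which holds the source pointer when __memcpy_fromio is entered, so this looks like the very first load from the memory-mapped flash window, although the compiler is free to reuse registers. The same instruction word, f9400085, also appears in the S32G3 log earlier in this thread.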

The segfault appears to happen during the "read-and-verify" of the MTD content.
I do not have many data points yet, but the MTD partitions in the upper part of the flash (from about 128 MByte upwards) appear to be more error-prone: I have not been able to reproduce the error on mtd0 (U-Boot), mtd1 (env), or the first 96 MByte partition after them (mtd2). However, mtd3 and mtd4 always fail.

My .dts is patched as follows:

+    pinctrl_flexspi1: flexspi1grp {
+        fsl,pins = <
+            MX93_PAD_SD3_CLK__FLEXSPI1_A_SCLK 0x51e
+            MX93_PAD_SD3_CMD__FLEXSPI1_A_SS0_B 0x51e
+            MX93_PAD_SD3_DATA0__FLEXSPI1_A_DATA00 0x51e
+            MX93_PAD_SD3_DATA1__FLEXSPI1_A_DATA01 0x51e
+            MX93_PAD_SD3_DATA2__FLEXSPI1_A_DATA02 0x51e
+            MX93_PAD_SD3_DATA3__FLEXSPI1_A_DATA03 0x51e
+        >;

and

+&flexspi1 {
+    pinctrl-names = "default";
+    pinctrl-0 = <&pinctrl_flexspi1>;
+    assigned-clock-rates = <80000000>;
+    status = "okay";
+
+    flash: mt25ql02g@0 {
+        compatible = "jedec,spi-nor";
+        #address-cells = <1>;
+        #size-cells = <1>;
+        reg = <0>;
+        spi-max-frequency = <80000000>;
+        spi-tx-bus-width = <4>;
+        spi-rx-bus-width = <4>;
+
+        /* 2 MByte */
+        spl: partition@0 {
+            label = "uboot appl";
+            reg = <0x00000000 0x00200000>;
+        }; //end partition@00000000 (uboot_appl)
+
+        /* 64 KByte */
+        uboot_env: partition@200000 {
+            label = "uboot env";
+            reg = <0x00200000 0x00010000>;
+        }; //end partition@00200000 (uboot_env)
+
+        /* 96 MByte */
+        mx_image_1: partition@210000 {
+            label = "mx image 1";
+            reg = <0x00210000 0x06000000>;
+        }; //end partition@00210000 (mx_image_1)
+
+        /* 96 MByte */
+        mx_image_2: partition@6210000 {
+            label = "mx image 2";
+            reg = <0x06210000 0x06000000>;
+        }; //end partition@6210000 (mx_image_2)
+
+        /* 61 MByte */
+        fs: partition@C210000 {
+            label = "fs";
+            reg = <0x0C210000 0x03DF0000>;
+        }; //end partition@C210000 (fs)
+    }; //end mt25ql02g@0 (flash)
+};

QSPI NOR Flash detection (from dmesg):

[ 0.179377] spi-nor spi0.0: mt25ql02g (262144 Kbytes)
[ 0.179540] 5 fixed-partitions partitions found on MTD device 425e0000.spi
[ 0.179546] Creating 5 MTD partitions on "425e0000.spi":
[ 0.179552] 0x000000000000-0x000000200000 : "uboot appl"
[ 0.180579] 0x000000200000-0x000000210000 : "uboot env"
[ 0.181407] 0x000000210000-0x000006210000 : "mx image 1"
[ 0.182233] 0x000006210000-0x00000c210000 : "mx image 2"
[ 0.183095] 0x00000c210000-0x000010000000 : "fs"
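
To check the "128 MByte and upwards" impression against this layout, here is a small Python sketch that classifies each partition relative to a 128 MiB offset. The boundary value is only my working assumption taken from the observation above, not a documented limit; the partition ranges are copied from the dmesg output, and `classify` is a helper name made up for this post.

```python
MIB = 1024 * 1024
BOUNDARY = 128 * MIB  # assumed offset above which reads seem to fail

# (mtd index, label, start, end) copied from the dmesg partition list
PARTITIONS = [
    (0, "uboot appl", 0x00000000, 0x00200000),
    (1, "uboot env",  0x00200000, 0x00210000),
    (2, "mx image 1", 0x00210000, 0x06210000),
    (3, "mx image 2", 0x06210000, 0x0C210000),
    (4, "fs",         0x0C210000, 0x10000000),
]


def classify(start, end, boundary=BOUNDARY):
    """Say whether [start, end) lies below, above, or across `boundary`."""
    if end <= boundary:
        return "below"
    if start >= boundary:
        return "above"
    return "crosses"


for index, label, start, end in PARTITIONS:
    print(f"mtd{index} ({label}): {classify(start, end)}")

# Offset inside mtd3 at which the assumed boundary is reached
print(hex(BOUNDARY - 0x06210000))
```

With these numbers mtd0 to mtd2 lie entirely below 128 MiB, mtd3 crosses it at partition offset 0x1df0000, and mtd4 lies entirely above it, which matches the partitions that fail for me. If the assumption holds, reads of mtd3 below that offset should succeed while reads beyond it should fail.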

I have tried 133 MHz (max) and am now running at 80 MHz; the issue is the same.

So ... my question to you ... did you ever get to the bottom of this?

Thanks.