i.MX8 Kernel panic: unable to handle kernel paging request

7,618 Views
Tobi_Edu
Contributor I

Dear community,

I have a custom board with an i.MX8 DualX. U-Boot runs and the Linux kernel boots. I use Linux 5.10.72_2.2.0 built with Yocto, together with SCFW Porting Kit 1.11.0.

Unfortunately, the kernel always crashes, anywhere from several seconds to half an hour after boot, with varying panic messages. Most often the message looks like this:

Unable to handle kernel paging request at virtual address 000000000000698b
[ 20.370464] Mem abort info:
[ 20.373259] ESR = 0x96000004
[ 20.376319] EC = 0x25: DABT (current EL), IL = 32 bits
[ 20.381633] SET = 0, FnV = 0
[ 20.384692] EA = 0, S1PTW = 0
[ 20.387835] Data abort info:
[ 20.390712] ISV = 0, ISS = 0x00000004
[ 20.394552] CM = 0, WnR = 0
[ 20.397526] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000085c66000
[ 20.403970] [000000000000698b] pgd=0000000000000000, p4d=0000000000000000
[ 20.410775] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 20.416349] Modules linked in:
[ 20.419414] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.72-lts-5.10.y+ga68e31b63f86 #1
[ 20.427599] Hardware name: Freescale i.MX8DX MEK (DT)
[ 20.432660] pstate: 80000085 (Nzcv daIf -PAN -UAO -TCO BTYPE=--)
[ 20.438682] pc : calc_global_load+0x18c/0x210
[ 20.443041] lr : calc_global_load+0x178/0x210
[ 20.447398] sp : ffff800011d5bee0
[ 20.450717] x29: ffff800011d5bee0 x28: ffff800011b52380
[ 20.456042] x27: ffff800011b52380 x26: ffff800011d5c000
[ 20.461367] x25: ffff800011d58000 x24: ffff800011b49360
[ 20.466693] x23: ffff800011cee000 x22: ffff800011cee000
[ 20.472018] x21: ffff800011b46000 x20: ffff800011b46a00
[ 20.477344] x19: 00000004b75a0aee x18: 0000000000000000
[ 20.482669] x17: 0000000000000000 x16: 0000000000000000
[ 20.487995] x15: 0000000fee30533a x14: 00000000000215a2
[ 20.493320] x13: 00000000000007f5 x12: 00000000fffef377
[ 20.498645] x11: 00000000000060cb x10: ffff800011cc88e0
[ 20.503971] x9 : 00000000fffef85a x8 : 0000000000000042
[ 20.509296] x7 : ffff800011cc88c0 x6 : 00000000000000c7
[ 20.514621] x5 : 00000000003fa800 x4 : 000000000002ad29
[ 20.519947] x3 : 0000000000000000 x2 : 0000000000000800
[ 20.525272] x1 : 00000000000004e3 x0 : 0000000000000055
[ 20.530598] Call trace:
[ 20.533055] calc_global_load+0x18c/0x210
[ 20.537076] do_timer+0x20/0x30
[ 20.540222] tick_do_update_jiffies64.part.0+0x78/0x114
[ 20.545449] tick_irq_enter+0xf0/0x130
[ 20.549203] irq_enter_rcu+0x64/0x70
[ 20.552780] irq_enter+0x14/0x20
[ 20.556014] __handle_domain_irq+0x40/0xe0
[ 20.560114] gic_handle_irq+0xc0/0x140
[ 20.563867] el1_irq+0xcc/0x180
[ 20.567014] arch_cpu_idle+0x18/0x30
[ 20.570591] default_idle_call+0x24/0x6c
[ 20.574518] do_idle+0x230/0x2a0
[ 20.577749] cpu_startup_entry+0x24/0x70
[ 20.581675] rest_init+0xd8/0xe8
[ 20.584909] arch_call_rest_init+0x10/0x1c
[ 20.589007] start_kernel+0x4ac/0x4e4
[ 20.592682] Code: d2809c61 9b013129 f90004e9 d5033abf (b948c160)
[ 20.598788] ---[ end trace 5863192a640cb186 ]---
[ 20.603411] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[ 20.610290] SMP: stopping secondary CPUs
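
As a side note on the log above, the ESR value can be decoded by hand. The following is a minimal sketch, assuming the ARMv8-A ESR_EL1 layout for data aborts (EC in bits 31:26, IL in bit 25, WnR in bit 6 and DFSC in bits 5:0 of the ISS). The names `decode_esr` and `DFSC_NAMES` are made up for this example, and the table covers only the translation fault codes.

```python
# Minimal sketch: decode a few ESR_EL1 fields of a data abort.
# decode_esr and DFSC_NAMES are illustrative names, not kernel APIs.

DFSC_NAMES = {
    0b000100: "translation fault, level 0",
    0b000101: "translation fault, level 1",
    0b000110: "translation fault, level 2",
    0b000111: "translation fault, level 3",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR value into the fields printed in the oops."""
    iss = esr & 0x1FFFFFF          # instruction specific syndrome, bits 24:0
    dfsc = iss & 0x3F              # data fault status code, bits 5:0
    return {
        "ec": (esr >> 26) & 0x3F,  # exception class, bits 31:26
        "il": (esr >> 25) & 0x1,   # 1 means a 32-bit instruction
        "iss": iss,
        "wnr": (iss >> 6) & 0x1,   # 0 = read, 1 = write
        "fault": DFSC_NAMES.get(dfsc, "not in this table"),
    }


info = decode_esr(0x96000004)
print(hex(info["ec"]), info["il"], hex(info["iss"]), info["wnr"], info["fault"])
```

For the value in the log this yields EC 0x25, a read access (WnR = 0) and a level 0 translation fault, which matches the `pgd=0000000000000000` line.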

The "virtual address" is not always the same. The call trace also varies, but in most cases the last function is timer-related.
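
One way to see where the faulting address comes from is to decode the instruction word shown in parentheses on the `Code:` line, which is the instruction at `pc`. The sketch below assumes the A64 encoding of `LDR Wt, [Xn, #imm]` (unsigned offset, 32-bit). `decode_ldr_w_uimm` is a hypothetical helper written for this example and handles only that one instruction form.

```python
# Minimal sketch: decode the faulting instruction from the "Code:" line.
# decode_ldr_w_uimm is an illustrative helper, not an existing tool.

def decode_ldr_w_uimm(insn: int):
    """Return (rt, rn, byte_offset) for LDR Wt, [Xn, #imm] (unsigned offset)."""
    if insn & 0xFFC00000 != 0xB9400000:
        raise ValueError("not a 32-bit LDR (immediate, unsigned offset)")
    imm12 = (insn >> 10) & 0xFFF   # offset field, scaled by the access size
    rn = (insn >> 5) & 0x1F        # base register number
    rt = insn & 0x1F               # destination register number
    return rt, rn, imm12 * 4       # 4-byte access, so scale by 4


rt, rn, offset = decode_ldr_w_uimm(0xB948C160)
x11 = 0x60CB                       # value of x11 in the register dump
fault_addr = x11 + offset
print(f"ldr w{rt}, [x{rn}, #{offset:#x}] -> address {fault_addr:#x}")
```

With x11 = 0x60cb from the register dump, the computed address is 0x698b, the address reported in the first line of the oops. This suggests the base register held a small non-pointer value at the time of the load, but it does not by itself identify the cause.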

Sometimes, the panic message is different:

[ 20.901152] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000060
[ 20.909965] Mem abort info:
[ 20.912759] ESR = 0x96000004    ...

Two complete boot logs are attached.

I use 2 x 512 MB (1 GB) of DDR3L memory. The DDR stress test ran successfully for 2 hours, which is why I think a hardware issue is unlikely.

The memory node in the device tree is:

memory@80000000 {
   device_type = "memory";
   reg = <0x00000000 0x40000000>;
};

We checked the DCD file several times and did not find any wrong configurations.

RAM-Config in U-Boot is the following:

#define CONFIG_SYS_SDRAM_BASE 0x80000000
#define PHYS_SDRAM_1 0x80000000
#define PHYS_SDRAM_2 0x880000000

#define PHYS_SDRAM_1_SIZE 0x40000000  /* 1 GB */
#define PHYS_SDRAM_2_SIZE 0x00000000  /* 0 GB */

and

CONFIG_NR_DRAM_BANKS=4

The behaviour improved a little after enabling CONFIG_DEBUG_PAGEALLOC=y, I think.

What could be the problem? What else could I try?

Regards,

Tobi

16 Replies

7,321 Views
sam_raf
Contributor I

Hello Tobi_Edu,

Have you tried increasing VDD_A72 voltage by 50 mV?


7,557 Views
Sanket_Parekh
NXP TechSupport

Hi @Tobi_Edu 

Can you try this change?

memory@80000000 {
   device_type = "memory";
   reg = <0x0 0x80000000 0 0x40000000>;
};
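
To illustrate why the number of cells in `reg` matters: a device tree `reg` property is a flat list of 32-bit cells, grouped according to the parent node's `#address-cells` and `#size-cells`. The sketch below assumes both are 2, which is common for 64-bit i.MX8 device trees but is not shown in this thread. `parse_reg` is a hypothetical helper, not part of any device tree library.

```python
# Minimal sketch: group the 32-bit cells of a "reg" property into
# (base, size) pairs. parse_reg is an illustrative name only.
# Assumption: #address-cells = <2> and #size-cells = <2> in the parent node.

def parse_reg(cells, address_cells=2, size_cells=2):
    stride = address_cells + size_cells
    if len(cells) % stride != 0:
        raise ValueError(f"expected a multiple of {stride} cells, got {len(cells)}")

    def join(chunk):
        # Cells are big-endian: the first cell holds the upper 32 bits.
        value = 0
        for cell in chunk:
            value = (value << 32) | cell
        return value

    regions = []
    for i in range(0, len(cells), stride):
        entry = cells[i:i + stride]
        regions.append((join(entry[:address_cells]), join(entry[address_cells:])))
    return regions


regions = parse_reg([0x0, 0x80000000, 0x0, 0x40000000])
print([(hex(base), hex(size)) for base, size in regions])
```

Under that assumption the four-cell form describes 1 GB starting at 0x80000000, while the two-cell form `<0x00000000 0x40000000>` from the first post is rejected by this helper as incomplete.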

Thanks & Regards.

Sanket Parekh


7,554 Views
Tobi_Edu
Contributor I

Hi @Sanket_Parekh,

thanks for your fast reply.

Yes, I attempted that, but the kernel is still crashing. Please find attached two of the crash logs.

Best regards,

Tobi


7,551 Views
Sanket_Parekh
NXP TechSupport

Hi @Tobi_Edu 

Have you seen the same issue on the i.MX8 MEK board?

Thanks & Regards.

Sanket Parekh


7,548 Views
Tobi_Edu
Contributor I

Hi @Sanket_Parekh,

no, I don't have an i.MX8DX MEK, so I didn't try it with an MEK.

But I think this would not be useful, since the RAM configuration differs from the MEK's and I also have different peripheral hardware on different pins.

EDIT: Additionally, I found out that the memory node is automatically changed by U-Boot on kernel bootup. After booting, it is:

memory@80000000 {
   device_type = "memory";
   reg = <0x00 0x80200000 0x00 0x3fe00000>;
};
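
A quick arithmetic check of this fixed-up node against the U-Boot RAM configuration quoted earlier in the thread can be sketched as follows. It uses only values that appear in this thread; the variable names are chosen for illustration.

```python
# Minimal sketch: compare the memory node rewritten by U-Boot with the
# configured SDRAM window. All constants are copied from this thread.

SDRAM_BASE = 0x80000000   # PHYS_SDRAM_1
SDRAM_SIZE = 0x40000000   # PHYS_SDRAM_1_SIZE (1 GB)

fixed_base = 0x80200000   # base from the node after U-Boot's fix-up
fixed_size = 0x3FE00000   # size from the node after U-Boot's fix-up

skipped = fixed_base - SDRAM_BASE
same_end = (fixed_base + fixed_size) == (SDRAM_BASE + SDRAM_SIZE)

print(skipped // (1024 * 1024), "MiB left out at the start; same end address:", same_end)
```

So the kernel is handed the same 1 GB window minus its first 2 MiB, ending at 0xC0000000 in both cases. Why those 2 MiB are held back is not stated in this thread.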

Regards,

Tobi


7,544 Views
Sanket_Parekh
NXP TechSupport

Hi @Tobi_Edu 

What is the DDR drive strength in the DDR configuration file?

Thanks & Regards.

Sanket Parekh


7,541 Views
Tobi_Edu
Contributor I

Hi @Sanket_Parekh,

the drive strength is 40 Ohm (00 in RPA). But I also tried with 34 Ohm, which didn't change anything.

DDR stress test was successful with 40 Ohm.

Regards,

Tobi


7,519 Views
Sanket_Parekh
NXP TechSupport

Hi @Tobi_Edu 

DDR stress test was successful with 40 Ohm.

-> Have you done this with 2000MHz?

Have you performed Board level SI analysis?
 
Thanks & Regards.
Sanket Parekh

7,514 Views
Tobi_Edu
Contributor I

Hi @Sanket_Parekh ,

since I have DDR3L-RAM, I tested it only with 667 MHz and 800 MHz.

No, I have not performed a board level SI analysis.

Regards,

Tobi

 


7,500 Views
Sanket_Parekh
NXP TechSupport

Hi @Tobi_Edu 

Please try a DDR drive strength of 120 Ohm.

Thanks & Regards.

Sanket Parekh


7,497 Views
Tobi_Edu
Contributor I

Hi @Sanket_Parekh,

I already tried every drive strength available in the RPA, as well as several ODT configurations.

Changes in those configurations didn't change anything in the crash behaviour.

Regards,

Tobi


7,488 Views
Sanket_Parekh
NXP TechSupport

Hi @Tobi_Edu 

An SD card is connected to the MMC1 slot, right?

If so, can you please remove it and try to reproduce the issue?

Please share the log file.

Thanks & Regards.

Sanket Parekh


7,485 Views
Tobi_Edu
Contributor I

Hi @Sanket_Parekh,

thanks for your reply.

Yes, there is a micro SD card connected in slot MMC1, which is also the medium the device is booting from.

As image files, I use flash.bin, Image.bin, the dtb and a rootfs on a separate partition. How can I make the device boot without an SD card inserted (i.e. flash all those files to eMMC)? UUU won't work, will it?

Regards,

Tobi

 


7,478 Views
Sanket_Parekh
NXP TechSupport

Hi @Tobi_Edu 

Yes, please flash the binaries to eMMC. UUU should work as expected.

Thanks & Regards

Sanket Parekh


7,574 Views
Sanket_Parekh
NXP TechSupport

Hi @Tobi_Edu 

I hope you are doing well.

From the logs, it seems the kernel crash happened during login.

Have you flashed imx8dx-mek binaries on your custom board?

Thanks & Regards.

Sanket Parekh


7,573 Views
Tobi_Edu
Contributor I

Hello @Sanket_Parekh,

sometimes it crashes during login, sometimes I can log in and the kernel crashes some seconds or minutes later.

I use the sources of the imx8dx-mek as a base for my custom code, so I use the custom binaries, not the ones from the MEK.

Regards,

Tobi
