Hello,
We would like to know how to enable ramoops on the LS1046A with Linux kernel 5.4.3.
Here are my settings and test steps, but nothing shows up under "/sys/fs/pstore".
1. Enable pstore in the kernel
CONFIG_PSTORE=y
CONFIG_PSTORE_CONSOLE=y
CONFIG_PSTORE_RAM=y
2. Pass module parameters from the boot loader environment
"ramoops.mem_address=0xc0000000 ramoops.mem_size=0x20000"
and I can see the kernel messages below while booting:
[ 0.465329] ramoops: using module parameters
[ 1.020523] pstore: Registered ramoops as persistent store backend
[ 1.026735] ramoops: using 0x20000@0xc0000000, ecc: 0
3. Trigger a kernel panic via sysrq
# echo c > /proc/sysrq-trigger
4. Wait for the watchdog to reboot the system (it should be a warm boot)
5. Mount the pstore fs
# mount -t pstore pstore /sys/fs/pstore
But there are no ramoops files in the pstore folder.
Could anyone show me how to enable this feature?
Thanks.
Regards,
Leo
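A minimal sketch for checking the result of step 5, assuming the pstore filesystem is already mounted. pstore names records from the ramoops backend like dmesg-ramoops-0 or console-ramoops-0; the helper name list_ramoops_records and its directory argument are hypothetical and not part of the original post.

```shell
# List pstore records left by the ramoops backend.
# $1 is the pstore mount point; it defaults to /sys/fs/pstore as used above.
# (Hypothetical helper for illustration only.)
list_ramoops_records() {
    dir="${1:-/sys/fs/pstore}"
    found=0
    for f in "$dir"/*-ramoops-*; do
        # An unmatched glob stays literal, so make sure the file exists.
        [ -e "$f" ] || continue
        echo "${f##*/}"
        found=1
    done
    if [ "$found" -eq 0 ]; then
        echo "no ramoops records in $dir" >&2
        return 1
    fi
}
```

Run it as "list_ramoops_records /sys/fs/pstore" after the reboot. An empty result only shows that no record survived the reset; it does not say why.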
NXP LSDK supports warm reset only on the LX2160AQDS board.
See Section 5.2.2.1 in the LSDK 2012 User Guide:
https://www.nxp.com/docs/en/user-guide/LSDKUG_Rev20.12.pdf
Use kdump/kexec instead. See the following page:
Hi Pavel
Does this mean the LS1046A cannot use ramoops due to a hardware limitation?
If yes, I will try kdump instead.
Thanks.
Yes, use kdump instead of ramoops on the LS1046A board. NXP LSDK for the LS1046A does not support warm reset.
Hi Pavel
After passing the kernel arguments and using a relocatable kernel as below,
I can manually run the second kernel with kexec successfully.
But it does not automatically run the second kernel when a panic occurs.
Is something wrong with my kexec command?
- kernel arguments
crashkernel=256M
- kernel option (kdump with relocatable kernel)
CONFIG_KEXEC=y
CONFIG_SYSFS=y
CONFIG_DEBUG_INFO=y
CONFIG_CRASH_DUMP=y
CONFIG_PROC_VMCORE=y
CONFIG_RELOCATABLE=y
CONFIG_PROC_KCORE=y
- manually running the second kernel works
1. kexec -l /boot/image.bin --command-line="root=/dev/mmcblk0p2 maxcpus=1 bootmode=kdump"
2. kexec -e
[ 1714.583001] kexec_core: Starting new kernel
... boot ok
- automatically running the second kernel fails
1. kexec -p /boot/image.bin --command-line="root=/dev/mmcblk0p2 maxcpus=1 bootmode=kdump"
2. echo c > /proc/sysrq-trigger
[ 1400.106321] sysrq: Trigger a crash
[ 1400.109717] Kernel panic - not syncing: sysrq triggered crash
[ 1400.115458] CPU: 1 PID: 1213 Comm: sh Kdump: loaded Tainted: G O 5.4.3 #1
[ 1400.123540] Hardware name: LS1046A TANG Board (DT)
[ 1400.128321] Call trace:
... it is stuck here.
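The kernel options listed above can be checked against the configuration the running kernel was actually built with. This is only a sketch: the helper name check_kdump_config is hypothetical, and it assumes a plain-text config file is available, for example one extracted from /proc/config.gz on kernels built with CONFIG_IKCONFIG_PROC.

```shell
# Print every option from the list above that is not set to "y" in a
# kernel config file, and return non-zero if any is missing.
# $1 is the path to a plain-text kernel config. (Hypothetical helper.)
check_kdump_config() {
    cfg="$1"
    missing=0
    for opt in KEXEC SYSFS DEBUG_INFO CRASH_DUMP PROC_VMCORE RELOCATABLE PROC_KCORE; do
        # Match the exact line "CONFIG_<opt>=y" (lowercase y, as kconfig writes it).
        if ! grep -q "^CONFIG_${opt}=y\$" "$cfg"; then
            echo "CONFIG_${opt} is not set to y"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, "zcat /proc/config.gz > /tmp/config" followed by "check_kdump_config /tmp/config" would report options that did not make it into the build. A clean result rules out a missing option but does not explain the hang by itself.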
See Section 7.3 in the LSDK 2012 User Guide:
Hi Pavel
Thanks again.
I had seen and tried it before, but it gave me the same result.
I can manually load the second kernel with "kexec -l" and "kexec -e",
but it also gets stuck when I trigger a kernel panic via sysrq after loading the second kernel with "kexec -p".
# echo c > /proc/sysrq-trigger
[ 138.910837] sysrq: Trigger a crash
[ 138.914238] Kernel panic - not syncing: sysrq triggered crash
[ 138.919983] CPU: 3 PID: 1216 Comm: sh Kdump: loaded Tainted: G O 5.4.3 #1
[ 138.928069] Hardware name: LS1046A TANG Board (DT)
[ 138.932853] Call trace:
[ 138.935299] dump_backtrace+0x0/0x140
[ 138.938956] show_stack+0x14/0x20
[ 138.942269] dump_stack+0xb4/0xf8
[ 138.945580] panic+0x158/0x324
[ 138.948631] sysrq_handle_reboot+0x0/0x20
[ 138.952635] __handle_sysrq+0x88/0x180
[ 138.956378] write_sysrq_trigger+0x8c/0xb0
[ 138.960472] proc_reg_write+0x78/0xb0
[ 138.964129] __vfs_write+0x18/0x40
[ 138.967526] vfs_write+0xdc/0x1c8
[ 138.970834] ksys_write+0x68/0xf0
[ 138.974144] __arm64_sys_write+0x18/0x20
[ 138.978064] el0_svc_common.constprop.0+0x68/0x160
[ 138.982851] el0_svc_handler+0x20/0x80
[ 138.986595] el0_svc+0x8/0xc
[ 138.989478] SMP: stopping secondary CPUs
// It gets stuck here, with no message like "Starting crashdump kernel...".
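Before triggering the panic, the state of the crash kernel can be read back from sysfs. The "Kdump: loaded" tag in the trace above already suggests the image was loaded, so this sketch only rules out the simplest failure cases. The helper name crash_kernel_status is hypothetical; the directory argument exists only so the function can be tried against a test directory and defaults to /sys/kernel.

```shell
# Report whether a crash (panic) kernel is loaded and how much memory
# is reserved for it via crashkernel=. Returns zero only if both look sane.
# $1 is the sysfs directory holding kexec_crash_loaded and kexec_crash_size.
# (Hypothetical helper for illustration only.)
crash_kernel_status() {
    root="${1:-/sys/kernel}"
    # Fall back to 0 if a file is missing, e.g. on a kernel without kexec.
    loaded=$(cat "$root/kexec_crash_loaded" 2>/dev/null || echo 0)
    size=$(cat "$root/kexec_crash_size" 2>/dev/null || echo 0)
    echo "crash kernel loaded: $loaded, reserved bytes: $size"
    [ "$loaded" = "1" ] && [ "$size" -gt 0 ]
}
```

Calling "crash_kernel_status" right after "kexec -p" should report a loaded image and a non-zero reservation; if it does, the problem lies later, in the handover to the second kernel.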
Is there a problem with kexec testing using Section 7.3 of the LSDK 2012 User Guide?
Is there a problem passing the "Test Procedure" from this section?
In the "Test Procedure", I can successfully load the dump-capture kernel in step 3 with kexec (-l/-e).
But in step 4 the dump-capture kernel does not start automatically when a panic occurs in the first kernel.
It looks like it is not possible;
I have not found any information on using kexec automatically.
I found that it gets stuck in the "machine_kexec_mask_interrupts" function in machine_kexec.c
when kexec tries to launch the second kernel, whereas manually running the second kernel with kexec -e works without problems.
Could something be going wrong with disabling IRQs when a kernel panic occurs?