Kernel page faults then oops when performing rpm -qa

antonyking
Contributor I

I am seeing an issue with SDK 2.0 on an LS1043ARDB reference board running a fairly standard build (fsl-image-full + fsl-image-kernelitb). When I run rpm -qa, the kernel reports a lot of page faults, sometimes ending in an oops. This is the kind of kernel stack trace I see:

[   96.377478] BUG: Bad page map in process rpm  pte:a481a481a481a481 pmd:e00000e1a00fd3
[   96.385321] addr:0000ffffb1600000 vm_flags:00100071 anon_vma:ffff800072aab9c8 mapping:          (null) index:ffffb1600
[   96.396019] file:          (null) fault:          (null) mmap:          (null) readpage:          (null)
[   96.405497] CPU: 1 PID: 1796 Comm: rpm Tainted: G S              4.1.8-rt8+gbd51baf #1
[   96.413409] Hardware name: LS1043A RDB Board (DT)
[   96.418101] Call trace:
[   96.420543] [<ffff800000089808>] dump_backtrace+0x0/0x11c
[   96.425939] [<ffff800000089934>] show_stack+0x10/0x1c
[   96.430981] [<ffff8000007eaff8>] dump_stack+0x7c/0x98
[   96.436030] [<ffff800000160870>] print_bad_pte+0x164/0x200
[   96.441506] [<ffff800000161f18>] vm_normal_page+0x70/0xa0
[   96.446901] [<ffff800000162270>] unmap_single_vma+0x328/0x710
[   96.452807] [<ffff800000162e90>] unmap_vmas+0x54/0xc4
[   96.459258] [<ffff8000001680c0>] unmap_region+0x8c/0x140
[   96.465930] [<ffff80000016a204>] do_munmap+0x240/0x3b8
[   96.472427] [<ffff80000016a3bc>] vm_munmap+0x40/0x64
[   96.478773] [<ffff80000016b228>] SyS_munmap+0x20/0x34
[   96.485198] Disabling lock debugging due to kernel taint

I only ever see SyS_munmap in the stack dump.
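
For anyone wanting to check the same thing on their own logs, here is a rough Python sketch that counts the outermost frame of each call trace in saved dmesg output. The function name entry_points and the regular expression are my own assumptions about the trace format shown above; it is not an official tool and may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Matches frame lines such as:
#   [   96.478773] [<ffff80000016b228>] SyS_munmap+0x20/0x34
# The pattern is an assumption based on the 4.1-style trace format above.
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def entry_points(log_text):
    """Count the last-listed (outermost) frame of each call trace."""
    counts = Counter()
    last = None
    for line in log_text.splitlines():
        match = FRAME.search(line)
        if match:
            # Still inside a trace: remember the most recent frame.
            last = match.group(1)
        elif last is not None and line.strip():
            # First non-blank, non-frame line ends the current trace.
            counts[last] += 1
            last = None
    if last is not None:
        counts[last] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage: dmesg | python3 this_script.py
    for symbol, count in entry_points(sys.stdin.read()).most_common():
        print(count, symbol)
```

If every report really does come in through munmap, the output should be a single line naming SyS_munmap.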

This smells like a hardware configuration issue (perhaps running too fast), but I have not changed anything in this area, and the boards on which I have seen this fault were perfectly happy with the LS1043A 0.5 SDK.

Note that RPM makes a lot of mmap/mprotect/munmap system calls, so it is likely to be stressing the memory subsystem more than the other applications I have tried.
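
To exercise a similar map/unmap pattern without rpm, something like the following Python sketch might help. It repeatedly maps, touches, and unmaps private anonymous memory. It does not call mprotect, since Python's mmap module does not expose it. The function name churn_mappings, the iteration count, and the mapping size are illustrative assumptions and not what rpm actually does, and there is no guarantee it triggers the fault.

```python
import mmap


def churn_mappings(iterations=1000, size=2 * 1024 * 1024):
    """Map, touch, verify, and unmap anonymous memory in a loop.

    Returns the number of completed cycles. The defaults are arbitrary.
    """
    page = mmap.PAGESIZE
    completed = 0
    for i in range(iterations):
        # Private anonymous mapping, read/write (Unix only).
        region = mmap.mmap(
            -1,
            size,
            flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
            prot=mmap.PROT_READ | mmap.PROT_WRITE,
        )
        try:
            marker = i & 0xFF
            # Touch one byte per page so every page is actually faulted in.
            for offset in range(0, size, page):
                region[offset] = marker
            # Read the bytes back to catch obvious corruption.
            for offset in range(0, size, page):
                if region[offset] != marker:
                    raise RuntimeError(
                        "mismatch at offset %d in cycle %d" % (offset, i)
                    )
        finally:
            # close() unmaps the region (munmap under the hood).
            region.close()
        completed += 1
    return completed


if __name__ == "__main__":
    print("completed cycles:", churn_mappings())
```

Running it while watching dmesg for "Bad page map" messages would show whether plain map/unmap churn is enough to provoke the problem.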

Any tips on where to look or solutions gratefully received.

Platform Details:

  • LS1043ARDB
  • FCW unchanged since delivery of the board
  • Standard kernel with systemd enabled (following systemd guidance regarding kernel config options) and changed openvswitch to be a loadable module
  • Standard fsl-image-full file system but with systemd enabled instead of sysvinit (plus some additional packages added such as openvswitch, networkmanager, openvpn and some other odds and sods)
  • Kernel booting off SD card using SD card variant of U-Boot (run from SD card) along with firmware version fsl_fman_ucode_t2080_r1.1_108_4_5.bin (loaded from SD-card)

Note that I have performed a full rebuild of all binaries direct from source.

antonyking
Contributor I

This is caused by a bug in the Linux kernel version (4.1.8) shipped with version 2.0 of the Linux SDK and can be fixed by applying the following patch to the 4.1 kernel sources:
