I've recently tried the 3.0.35 kernel from L3.0.35_12.09.03_ER. It is very unstable no matter which kernel configuration I use, even with the kernel provided in the Debian package in L3.0.35_12.09.03_ER.
It crashes with errors like:
Unable to handle kernel paging request at virtual address ffffffff
pgd = ba2f4000
[ffffffff] *pgd=4fffe821, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 Not tainted (3.0.35 #1)
PC is at anon_vma_clone+0x4c/0x158
LR is at anon_vma_clone+0x3c/0x158
pc : [<800e0504>] lr : [<800e04f4>] psr: a00f0013
sp : bff8be78 ip : 0bfd7000 fp : 80ad8d40
r10: 00000000 r9 : ba142cf0 r8 : baa561c8
r7 : bff0e840 r6 : 00000000 r5 : ffffffff r4 : bfa31f78
r3 : bff8a000 r2 : 80a7fc90 r1 : 8003d004 r0 : bfa31f78
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c53c7d Table: 4a2f404a DAC: 00000015
Process init (pid: 1, stack limit = 0xbff8a2f0)
Stack: (0xbff8be78 to 0xbff8c000)
be60: 2ab98fff bff0e878
be80: 80ad8d20 ba142cb8 bff0e840 ba142cb8 bff0e754 80acdd20 00000001 bff0e738
bea0: bc0b07dc 800e0630 bff0e840 bfc29040 ba142cb8 bff0e754 80acdd20 8007138c
bec0: bffd2054 00000020 bff0e744 bff0e758 bfc28000 bff8a000 bfc2803c bfc2907c
bee0: bc0178cc ba050000 01200011 80acdd20 bff8bf08 00000000 bff8bfb0 bff8a000
bf00: 7eabd718 80072118 bfd82c80 80a97fe0 ba0501d8 ba0501d0 bff8a000 00000000
bf20: 00000000 ba050124 00000020 00000000 bffd2000 01200011 00000000 2ab91360
bf40: 00000000 800416c4 bff8a000 00000000 2ad08000 80072534 2ab913c8 00000000
bf60: 00000000 bff8bf80 800416c4 7eabd91c 00000008 00000000 7eabd91c 80081910
bf80: 7ffbfeff fffffffe 00000000 00000000 0000000b 2ab913c8 7eabd720 2ab91360
<0>bfa0: 00000078 80041540 00000000 2ab913c8 01200011 00000000 00000000 00000000
bfc0: 2ab913c8 7eabd720 2ab91360 00000078 2ab91820 00000001 00000001 2ad08000
bfe0: 00000078 7eabd718 2ac97edf 2ac41276 000f0030 01200011 00000000 00000000
[<800e0504>] (anon_vma_clone+0x4c/0x158) from [<800e0630>] (anon_vma_fork+0x20/)
[<800e0630>] (anon_vma_fork+0x20/0x130) from [<8007138c>] (dup_mm+0x1a8/0x4d8)
[<8007138c>] (dup_mm+0x1a8/0x4d8) from [<80072118>] (copy_process+0x9e4/0xdb8)
[<80072118>] (copy_process+0x9e4/0xdb8) from [<80072534>] (do_fork+0x48/0x2a4)
[<80072534>] (do_fork+0x48/0x2a4) from [<80041540>] (ret_fast_syscall+0x0/0x30)
Code: e2504000 11a0a006 0a000024 e5985004 (e5956000)
---[ end trace 8479eb65fc3380dc ]---
Virtual addresses are different each time.
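Since the addresses change between crashes, it can help to reduce each oops to the few fields that identify it (fault address, faulting symbol, process, call chain) and compare those across logs. Below is a minimal sketch of that, assuming the ARM oops layout quoted above; the function name `summarize_oops` and the regular expressions are illustrative, not part of any kernel tooling.

```python
import re

# Patterns follow the ARM oops text quoted above; other kernels or
# architectures may format these lines differently.
FAULT_RE = re.compile(r"Unable to handle kernel .*? at virtual address ([0-9a-f]+)")
PC_RE = re.compile(r"PC is at ([\w.]+)\+0x[0-9a-f]+")
PROC_RE = re.compile(r"Process (\S+) \(pid: (\d+)")
# Each backtrace line starts with "[<addr>] (symbol+0xoff/0xsize) from ..."
FRAME_RE = re.compile(r"^\[<[0-9a-f]+>\] \(([\w.]+)\+", re.MULTILINE)


def summarize_oops(text):
    """Extract the fields that identify one crash from an oops dump.

    Hypothetical helper: fields that cannot be found are returned as None.
    """
    fault = FAULT_RE.search(text)
    pc = PC_RE.search(text)
    proc = PROC_RE.search(text)
    return {
        "fault_address": fault.group(1) if fault else None,
        "pc_symbol": pc.group(1) if pc else None,
        "process": proc.group(1) if proc else None,
        "pid": int(proc.group(2)) if proc else None,
        # Innermost frame first, as printed by the kernel.
        "call_chain": FRAME_RE.findall(text),
    }
```

Running it over every log in a set of crashes and counting the distinct `(process, pc_symbol)` pairs gives a quick view of whether the failures cluster in one code path or are scattered.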
With kernel 3.0.15 and the same rootfs everything works perfectly. Has anybody experienced a similar issue?
Googling the issue suggests that faulty memory might be the cause, but I ran the memcheck program on the 3.0.15 kernel and it found nothing.
I would really appreciate any help.
Regards,
Marcin Miklas
PS. More crashes attached.
Original Attachment has been moved to: 3.0.35_crashes.zip
This 12.09.03 release is for the i.MX6DL Beta release. Could you please use the i.MX6Q GA release, which was released recently?
After running overnight, my i.MX6 board is still alive!
Loop 23/500:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : testing 175
root@freescale /home$ uname -a
Linux freescale 3.0.35-2039-g267e004 #1 SMP PREEMPT Sun Sep 23 07:23:09 CDT 2012 armv7l GNU/Linux
Do you think one night is enough for a test?
Do you have another board?
Could you please try the prebuilt rootfs, but not the Ubuntu one?
Let's work out what differs between our setups.
Someone mentioned there might be a userspace mismatch, so I would try what Daiane suggested: use the images for uboot, kernel and rootfs (non-Ubuntu) from the L3.0.35_12.09.01_GA_images_MX6Q.tar.gz package.
I already did. I've tried uboot + kernel + rootfs from L3.0.35_12.09.01_GA_images, and the effect is always the same - Unable to handle kernel paging request at virtual address - sometimes during booting, sometimes after some time. When booting completes without error and you leave the board untouched, it can last for hours. But when you try to run some applications or interact with them, the error happens almost immediately.
I've also written the kernel directly to the SD card at the 1 MB offset and got the same error.
To summarize:
I've tried kernels from L3.0.35_12.09.01 and L3.0.35_12.09.03, both precompiled by Freescale and compiled by me (using various configurations, both upgrading the working configuration from the 3.0.15 kernel and using imx6_defconfig).
I've tried uboot from L3.0.35_12.09.03 and L3.0.35_12.09.01.
I've tried booting the kernel from the rootfs using 6q_bootscript, booting from tftp, and booting from the 1 MB offset of the SD card.
I've tried several rootfs images: those provided by Freescale (rootfs.tar.gz and oneiric.tar.gz) and also my own, based on ubuntu_core_12.04.
I always get the same Unable to handle kernel paging request at virtual address ...
Daiane has the same revision of the board and for her everything works correctly.
There is surely some difference in the kernel. Perhaps some change makes use of something that is broken on my board? I think that for now I'll stay with the 3.0.15 kernel, unless someone has a completely different idea to try.
Thanks a lot all of you for ideas and testing.
Hi Marcin,
Sorry for all of the issues you're seeing. We're not generally seeing this issue elsewhere with any of the 3.0.35 kernels, so it's not clear whether you're doing something unique and exposing a bug (hardware or software) or whether perhaps there's an issue with your board.
The reason for checking the userspace is that there are sometimes ABI changes between the kernel and userspace (planned or inadvertent) which cause problems if the two don't match.
The first log in the file 3.0.35_crashes.zip above shows that the failure happened in process tutorial4_es20 (one of the Vivante OpenGL examples):
Process tutorial4_es20 (pid: 5063, stack limit = 0xb99822f0)
Have you seen the crashes without using OpenGL? Since the OpenGL stack and Vivante driver do return kernel memory to userspace, I wonder if something you're doing to test is exposing a bug.
The nature of the crashes (random locations, random processes) makes it appear that something is stepping on kernel data structures.
It sounds as if you're using (or have used) a stock kernel and U-Boot (or multiple of them) and seeing the issue, but I haven't seen a complete set of testing steps that we could follow. Daiane and Richard have tried, but it seems that they've been unsuccessful in reproducing your results.
I'd like to do the same, but it seems we're missing something here.
I've seen the crash without OpenGL also, sometimes even before mounting rootfs.
I suspect some hardware issue with my board. But the mystery is that it works great with 2.6.38 and 3.0.15 kernels.
Can someone suggest some kind of automated tests that I can run on the 3.0.15 kernel to check the board for hardware issues?
I've got the new Sabre Lite board rev D and it works like a charm with the 3.0.35 kernel.
Hi Hui,
What's the origin of this U-Boot version?
U-Boot 2009.08-00619-ge4ab626-dirty (Oct 19 2012 - 15:56:24)
I can't seem to find e4ab626 in the commit logs from git://git.freescale.com/imx/uboot-imx.git.
Well, your u-boot is magical. No crashes so far. But it seems that the OpenGL examples and accelerated video decoding don't work. I need to double-check that, though.
More testing shows that it still crashes, but less frequently.
A bad thing happened.
I tried to restore my version of uboot, so I ran upgradeu in the uboot console to upgrade uboot from the SD card as usual. Everything went OK: memory was erased, the new uboot was written and then read back to verify that it was written correctly, and the script asked me to reset. After the reset the board didn't talk to me any more.
Nothing on the serial console.
I switched to USB OTG mode, connected a micro USB cable, connected the other end to my laptop and reset the board, but no Freescale device is shown in lsusb and the unbricking instructions don't work. A Windows machine doesn't detect it either.
So I guess, my sabre lite died permanently.
Can you try both 0/1 and 1/0 positions for SW1 to see if you get the device to show up via USB?
Thanks for suggesting the 1/0 setting for SW1, I was able to bring the board back to life today. But since I'm 80% sure that I tried that setting on Friday as well, I suspect that the board has some hardware issue.
Hey, guy. Try one more time from scratch!
You can't kill your board just by changing the bootloader...
Try again.
When you used the 12.09.01 images, did you burn that uboot into the SPI NOR as well? And then cycle power? Because it sounds like something was mismatched before.
Hmm, where can I find the attachment?
Just in my previous reply. There is an attachment named: u-boot.bin.zip.
Thanks.
I will let the board run memtest overnight. But so far I got
root@freescale /home$ ./memtester 1024 200
memtester version 4.2.2 (32-bit)
Copyright (C) 2010 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 1024MB (1073741824 bytes)
got 657MB (689893376 bytes), trying mlock ...locked.
Loop 1/200:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
8-bit Writes : ok
16-bit Writes : ok
Loop 2/200:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : testing 6
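For overnight runs it is easier to let a script read the captured output than to scroll through hundreds of loops. The following is a rough sketch that parses memtester output of the shape pasted above; the helper names are made up for illustration, and since the thread shows no failing run, the script simply assumes that any status other than "ok" or a progress string ("testing ...", and by assumption "setting ...") deserves a closer look.

```python
import re

LOOP_RE = re.compile(r"^\s*Loop (\d+)(?:/(\d+))?:")
# A result line is "<test name> : <status>", e.g. "Stuck Address : ok".
RESULT_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9 -]*?)\s*:\s*(.+?)\s*$")


def parse_memtester(text):
    """Return {loop_number: {test_name: status}} from captured output."""
    loops = {}
    current = None
    for line in text.splitlines():
        loop = LOOP_RE.match(line)
        if loop:
            current = int(loop.group(1))
            loops[current] = {}
            continue
        result = RESULT_RE.match(line)
        # Lines before the first "Loop N" header (banner, sizes) are ignored.
        if current is not None and result:
            loops[current][result.group(1)] = result.group(2)
    return loops


def classify(loops):
    """Split non-"ok" results into suspect entries and tests still running."""
    suspect, running = [], []
    for number, tests in loops.items():
        for name, status in tests.items():
            if status == "ok":
                continue
            # "setting" as a progress prefix is an assumption, see lead-in.
            if status.startswith(("testing", "setting")):
                running.append((number, name))
            else:
                suspect.append((number, name, status))
    return suspect, running
```

An empty `suspect` list only says that memtester reported nothing; as this thread shows, the kernel can still oops while the memory test itself passes.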
My board is SABRE LITE 10-18-11.
I'm using the L3.0.35_12.09.03_ER_images_MX6Q prebuilt image.
I copied the no-padding uboot to 1K
I copied the uImage to 1M
And I placed the rootfs.ext2 content into an EXT4 partition on my SD card (my board is "booting" from SD3).
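For anyone who wants to script the two raw copies described above, here is a small sketch, assuming the offsets stated in this post (1 KiB for the no-padding uboot, 1 MiB for the uImage). The function names and the target path are hypothetical, and writing to the wrong device destroys data, so try it against an ordinary image file first.

```python
import os

KIB = 1024
MIB = 1024 * KIB

# Offsets from the post above: no-padding uboot at 1 KiB, uImage at 1 MiB.
UBOOT_OFFSET = 1 * KIB
KERNEL_OFFSET = 1 * MIB


def write_at_offset(target_path, payload, offset):
    """Write payload into an existing file or block device at a byte offset."""
    with open(target_path, "r+b") as target:
        target.seek(offset)
        target.write(payload)
        target.flush()
        os.fsync(target.fileno())


def read_back(target_path, offset, length):
    """Read length bytes starting at offset (may be served from the page cache)."""
    with open(target_path, "rb") as target:
        target.seek(offset)
        return target.read(length)


def install_images(target_path, uboot, kernel):
    """Place both images at their offsets and compare each with a read-back."""
    if UBOOT_OFFSET + len(uboot) > KERNEL_OFFSET:
        raise ValueError("uboot image would overlap the kernel at 1 MiB")
    for payload, offset in ((uboot, UBOOT_OFFSET), (kernel, KERNEL_OFFSET)):
        write_at_offset(target_path, payload, offset)
        if read_back(target_path, offset, len(payload)) != payload:
            raise IOError(f"read-back mismatch at offset {offset}")
```

The size check is there because a uboot image larger than 1023 KiB would be overwritten by the kernel, which is an easy way to end up with a mismatched pair on the card.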
How much time does it take to crash your kernel?
From 2 seconds to 5 hours (when memtester is running). In 99% of cases it is within the first couple of minutes.