Hello, I have a board that is almost identical to the i.MX25 PDK. The problem is that every once in a while I get a kernel panic. I really don't know what is causing it, and I don't know how to go about finding the reason. I don't know whether hardware issues can be ruled out, since the board works most of the time. The SDRAM and the NAND flash attached to the NFC are not exactly the same as on the PDK, so could it be a matter of timings? I have tried flashing the kernel/bootloader/rootfs originally generated by LTIB (without any modification whatsoever) and it still happens.
Here is the error output:
Starting the hotplug events dispatcher udevd
Synthesizing initial hotplug events
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c3bac000
[00000000] *pgd=83ba6031, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT
Modules linked in:
CPU: 0 Not tainted (2.6.31-207-g7286c01 #1)
PC is at file_free_rcu+0x1c/0x64
LR is at __rcu_process_callbacks+0x224/0x2b4
pc : [<c00b6f98>] lr : [<c0076120>] psr: 60000013
sp : c3b7ff48 ip : 20000093 fp : 0000000a
r10: 00000020 r9 : 00000008 r8 : c042169c
r7 : 00000001 r6 : 00000002 r5 : c38e4ce0 r4 : c3898540
r3 : 00000000 r2 : c380bd28 r1 : c0419900 r0 : c38852a0
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 0005317f Table: 83bac000 DAC: 00000015
Process path_id (pid: 998, stack limit = 0xc3b7e270)
Stack: (0xc3b7ff48 to 0xc3b80000)
ff40: c040048c c0076120 00000036 c3b7e000 00000102 c00761c4
ff60: 00000081 c004a6a0 c3b7e000 c03fbf7c c3b7e000 00000036 00000000 00360000
ff80: 00000001 00000000 c3b7e000 40024000 be8a465c c004b2fc c002706c c0027070
ffa0: 4013905c ffffffff fc400000 c0027bec 0000a2bd 40024894 05787df1 00000001
ffc0: 400248f0 00000000 00000000 00000003 af0fbe22 00000000 40024000 be8a465c
ffe0: be8a4690 be8a4598 400095b0 400090b8 20000010 ffffffff 804e5031 804e5431
[<c00b6f98>] (file_free_rcu+0x1c/0x64) from [<c0076120>] (__rcu_process_callbacks+0x224/0x2b4)
[<c0076120>] (__rcu_process_callbacks+0x224/0x2b4) from [<c00761c4>] (rcu_process_callbacks+0x14/0x38)
[<c00761c4>] (rcu_process_callbacks+0x14/0x38) from [<c004a6a0>] (__do_softirq+0xd4/0x1c8)
[<c004a6a0>] (__do_softirq+0xd4/0x1c8) from [<c004b2fc>] (irq_exit+0x44/0xa0)
[<c004b2fc>] (irq_exit+0x44/0xa0) from [<c0027070>] (_text+0x70/0x8c)
[<c0027070>] (_text+0x70/0x8c) from [<c0027bec>] (__irq_usr+0x4c/0x80)
Exception stack(0xc3b7ffb0 to 0xc3b7fff8)
ffa0: 0000a2bd 40024894 05787df1 00000001
ffc0: 400248f0 00000000 00000000 00000003 af0fbe22 00000000 40024000 be8a465c
ffe0: be8a4690 be8a4598 400095b0 400090b8 20000010 ffffffff
Code: e5903000 e3530000 ca000002 e3a03000 (e5833000)
Kernel panic - not syncing: Fatal exception in interrupt
Please help me....
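A note on reading the `Code:` line in the oops above: the word in parentheses, `e5833000`, is the instruction that faulted. The following is a minimal sketch in C, not a full disassembler, that decodes only the ARM load/store form with an immediate offset. The function name `decode_ldr_str` is made up for this illustration, and anything outside that one encoding is simply rejected.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Decode an ARM-mode LDR/STR with a 12-bit immediate offset.
 * Only the unconditional (cond = 0xE), pre-indexed, no-writeback form is
 * handled; everything else returns -1. Illustrative helper, not kernel code.
 */
int decode_ldr_str(uint32_t insn, char *out, size_t outlen)
{
    if ((insn >> 28) != 0xEu)              /* condition field must be "always" */
        return -1;
    if (((insn >> 25) & 0x7u) != 0x2u)     /* bits 27..25 = 010: LDR/STR, immediate */
        return -1;

    int pre       = (insn >> 24) & 1;      /* P: offset applied before access */
    int up        = (insn >> 23) & 1;      /* U: add (1) or subtract (0) offset */
    int byte      = (insn >> 22) & 1;      /* B: byte (1) or word (0) access */
    int writeback = (insn >> 21) & 1;      /* W: base register write-back */
    int load      = (insn >> 20) & 1;      /* L: LDR (1) or STR (0) */
    unsigned rn   = (insn >> 16) & 0xFu;   /* base register */
    unsigned rd   = (insn >> 12) & 0xFu;   /* source/destination register */
    unsigned imm  = insn & 0xFFFu;         /* 12-bit immediate offset */

    if (!pre || writeback)                 /* keep the sketch to the simple form */
        return -1;

    snprintf(out, outlen, "%s%s r%u, [r%u, #%s%u]",
             load ? "ldr" : "str", byte ? "b" : "",
             rd, rn, up ? "" : "-", imm);
    return 0;
}
```

Calling `decode_ldr_str(0xe5833000, buf, sizeof buf)` gives `str r3, [r3, #0]`. The register dump shows r3 = 00000000, so the store goes to address 0, which matches the fault address reported at the top of the oops. The decode alone does not say why the kernel ended up executing that store.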
Hi, have you solved this problem?
Hi,
Since the NAND and DDR2 parts are different, you have to adjust a few things compared to the default software configuration of the i.MX25 PDK.
DDR2 timings can differ slightly from part to part, even though most of the time they are standard, so you should check them.
That adjustment should indeed be made in u-boot; I guess it's in the DCD table.
For the NAND, there is typically nothing to add to u-boot beyond support for the part itself: its ID, size, bus width, and so on.
Timings are not critical for that slow interface.
It seems that your design uses two DDR2 devices. As the default Linux BSP is made for a single part, there are also things to adjust there. But I can help with that.
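One rough way to check whether marginal DDR2 timings are involved is to soak-test RAM from user space on the target. The sketch below is only an assumption of how such a check could look; the names `pattern_pass` and `ram_soak` are hypothetical. It exercises only the memory the allocator hands out and is no substitute for a memory test run from the bootloader.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Write an index-dependent pattern to every word, then read it all back.
 * Returns the number of words that did not read back as written.
 */
size_t pattern_pass(volatile uint32_t *buf, size_t words, uint32_t seed)
{
    size_t i, errors = 0;

    for (i = 0; i < words; i++)
        buf[i] = seed ^ ((uint32_t)i * 0x9e3779b1u);

    for (i = 0; i < words; i++)
        if (buf[i] != (seed ^ ((uint32_t)i * 0x9e3779b1u)))
            errors++;

    return errors;
}

/*
 * Allocate `bytes` of RAM and run `passes` rounds of two pattern passes.
 * Returns the total mismatch count, or (size_t)-1 if allocation fails.
 * Use a buffer much larger than the CPU data cache, otherwise the
 * read-back may be served from cache and never reach the SDRAM.
 */
size_t ram_soak(size_t bytes, unsigned passes)
{
    size_t words = bytes / sizeof(uint32_t);
    uint32_t *buf = malloc(words * sizeof(uint32_t));
    size_t errors = 0;
    unsigned p;

    if (buf == NULL)
        return (size_t)-1;

    for (p = 0; p < passes; p++) {
        errors += pattern_pass(buf, words, 0x00000000u + p);
        errors += pattern_pass(buf, words, 0xffffffffu - p);
    }

    free(buf);
    return errors;
}
```

A non-zero result from something like `ram_soak(16u << 20, 100)` would point toward the memory setup rather than the kernel. A zero result does not prove the timings are correct, since this only touches part of the RAM and only with simple patterns.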
Any thoughts?
Hi, thanks for your replies.
Krishna, I am not sure what environment variables you are referring to. Are you talking about u-boot's variables or Linux's variables? How could an environment variable affect me this way?
Fabio, I don't have an MX25PDK available right now, so I can't test this on one. However, I would expect it to work, as I have not modified any source code or kernel configuration (I haven't added any drivers, modules, etc.).
The differences between the two boards are mainly the Nand Flash and the SDRAM. In my board the models used are:
SDRAM: MT47H32M16 (two of them hanging from CS2 and CS3)
Nand Flash: HYNIX HY27UF084G2B-TPCB (hanging from the NFC)
Could the problem have to do with timing issues? I have noticed that the errors are much less likely to occur once the board has been running for a while. If this is the case, how should I fix it? Would it be a matter of modifying the configuration of the ESDRAM and NFC controllers in the u-boot code?
Is there any application note or document I could read?
Thanks a lot in advance.
Some suggestions:
1. Are you able to get this same crash on a MX25PDK?
2. When you say that your board is "almost identical" to the MX25PDK: have you taken care of the corresponding changes in the bootloader and kernel?
Please, could someone just tell me where to start? This board is supposed to be part of an industrial design that needs to be on 24/7, so the occasional crashes are totally unacceptable for us.
Please help....
PS: I turned on kernel debugging in the "Kernel hacking" section and removed the PHY driver support, as our board does not have the RMII transceiver soldered on. The system still crashes all the time, but now the board does not actually hang after the crash. After every "crash" I get the login prompt "freescale login:".