64MB flash problem in M5485EVB with Linux 2.6.25

jkimble · ‎06-02-2009

I've been having a very strange problem while trying to get 64MB of flash to work with a custom board based on the M5485EVB board (Intel P33 flash with CS lines tied together for 32-bit width). I had no problem with this on the 2.6.10 kernel, so I know the hardware is OK. U-Boot has no problem with it either, but the 2.6.25 kernel doesn't want to see 64MB. I can get it to work with 38MB but no more than that (yeah, weird!!).

In this case the only thing I've had to do in the kernel is specify a length of 0x03ffffff rather than 0x04000000 because the larger size causes the mtd physmap routine to fail with a FAULT 5. The smaller size seems to allow everything to boot up fine.
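For reference, here is a minimal sketch of the address arithmetic behind those two length values, assuming a 32-bit physical address space and the 0xfc000000 base shown in the boot log. It only shows that a full 0x04000000 window starting at that base ends exactly at the 4 GiB boundary; whether that wrap is what actually triggers the fault is not established anywhere in this thread. The helper name `window_end` is made up for illustration.

```python
ADDR_BITS = 32                 # assumption: 32-bit physical address space
MASK = (1 << ADDR_BITS) - 1    # 0xFFFFFFFF

def window_end(base, length):
    """Return (end address truncated to 32 bits, True if base+length overflowed)."""
    end = base + length
    return end & MASK, end > MASK

FLASH_BASE = 0xFC000000        # base from the "physmap platform flash device" log line

for length in (0x04000000, 0x03FFFFFF):
    end, wrapped = window_end(FLASH_BASE, length)
    print(f"len={length:#010x} end={end:#010x} wrapped={wrapped}")
```

With the full 64MB length the end address wraps to 0, while the 0x03ffffff length ends at 0xffffffff and stays inside the 32-bit range.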

Kernel boot is always OK when initializing partitions. I see:

---------------- snip ------------------------------------

Driver 'sd' needs updating - please use bus_type methods
physmap platform flash device: 03ffffff at fc000000
physmap-flash.0: Found 2 x16 devices at 0x0 in 32-bit bank
NOR chip too large to fit in mapping. Attempting to cope...
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Intel/Sharp Extended Query Table at 0x010A
Using buffer write method
Using auto-unlock on power-up/resume
cfi_cmdset_0001: Erase suspend on write enabled
Reducing visibility of 65536KiB chip to 65535KiB
2 cmdlinepart partitions found on MTD device physmap-flash.0
Creating 2 MTD partitions on "physmap-flash.0":
0x00000000-0x00400000 : "kernel"
0x00400000-0x03b00000 : "root"
DSPI: Coldfire master initialized

------------------- snip -------------------------------------

With an NFS kernel I'm able to specify a root partition of 60M in bootargs as in:

root=/dev/nfs rw nfsroot=137.237.242.53:/tftpboot/ltib ip=137.237.242.175:137.237.242.53:137.237.242.13:255.255.255.0:ColdFire:eth0ff mtdparts=physmap-flash.0:4M(kernel)ro,60M(root)
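To make the partition arithmetic explicit, here is a rough sketch that turns an mtdparts string into start/end offsets. It is an illustration only, not the kernel's cmdlinepart parser: it handles just the simple `size(name)[ro]` form used in this thread, and the function name `parse_mtdparts` is made up.

```python
import re

UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_mtdparts(spec):
    """Parse 'device:size(name)[ro],...' into (device, [(name, start, end)])."""
    device, parts = spec.split(":", 1)
    offset, layout = 0, []
    for part in parts.split(","):
        m = re.fullmatch(r"(\d+)([kKmMgG]?)\((\w+)\)(ro)?", part.strip())
        if m is None:
            raise ValueError(f"unsupported partition spec: {part!r}")
        size = int(m.group(1)) * UNITS[m.group(2).upper()]
        layout.append((m.group(3), offset, offset + size))
        offset += size
    return device, layout

MAPPED_LEN = 0x03FFFFFF  # length given to physmap in the boot log above

for spec in ("physmap-flash.0:4M(kernel)ro,55M(root)",
             "physmap-flash.0:4M(kernel)ro,60M(root)"):
    device, layout = parse_mtdparts(spec)
    for name, start, end in layout:
        note = " (past mapped length)" if end > MAPPED_LEN else ""
        print(f"{device}: 0x{start:08x}-0x{end:08x} : {name}{note}")
```

The 55M layout gives the same 0x00400000-0x03b00000 root range that appears in the boot log, and the 60M layout ends at 0x04000000, one byte past the 0x03ffffff length passed to physmap.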

But when I burn the kernel and rootfs (21M) to flash and try to boot with the following bootargs (any root partition bigger than 38M fails):

root=/dev/mtdblock1 rw rootfstype=jffs2 mtdparts=physmap-flash.0:4M(kernel)ro,55M(root)

I get the following messages and the kernel panics:

-------------- snip ---------------------------------------------------------

RPC: Registered tcp transport module.
Bad page state in process 'swapper'
page:003bba70 flags:0x0000000c mapping:00000002 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Stack from 0702daf0:
       <0> 0702db00<0> 002e687c<0> 0005118c<0> 00281bfc<0> 003bba88<0> 003bba70<0> 00051f52<0> 003bba70
       <0> 00000000<0> 000000d0<0> 00000000<0> 00000010<0> 002e6db0<0> 00000000<0> 00000000<0> 0702a000
       <0> fc480000<0> 002e6db4<0> 00000004<0> 00000000<0> 00000000<0> 00000000<0> 00000001<0> 00000000

;;;;

Totlen for ref at 071721f8 (0x00080000-0x575d5555) miscalculated as 0xffcd9759 instead of 57555555
No next ref. jeb->last_node is 071721f8
jeb->wasted_size 2e68a8, dirty_size 3bbac0, used_size 1, free_size 2e68a8
------------[ cut here ]------------
WARNING: at fs/jffs2/nodelist.c:766 __jffs2_ref_totlen+0x286/0x340()
Modules linked in:
Stack from 0702dab8:
<0> 0702dac8<0> 071721f8<0> 0002c910<0> 0027d944<0> 00291d4a<0> 000002fe<0> 0702dae9<0> 0027d91b

;;;;;

<0> 07134eae<0> 000c0000<0> 070255f5<0> 00000002<0> 00000000<0> 00000000<0> 00000000<0> 0000000a
Call Trace:
       <0> [<0011519a>]<0> [<001179d6>]<0> [<00117382>]<0> [<001179d6>]
       <0> [<001941ea>]<0> [<00194432>]<0> [<001179d6>]<0> [<00077844>]
       <0> [<0007f7f8>]<0> [<0006c73e>]<0> [<00117882>]<0> [<001179d6>]
       <0> [<00071112>]<0> [<00071210>]<0> [<00051af2>]<0> [<000859de>]
       <0> [<00085b86>]<0> [<00083d20>]<0> [<00052416>]<0> [<00083dde>]
       <0> [<00085c56>]<0> [<000412f2>]<0> [<000790ac>]<0> [<00079592>]
       <0> [<0013f7c2>]<0> [<00021228>]<0> [<000795a8>]<0> [<001735b4>]
       <0> [<0002d1a6>]<0> [<0004afd6>]<0> [<000212a2>]<0> [<000212da>]
Kernel panic - not syncing: Attempted to kill init!

------------------ snip -------------------------------------------------

This is very weird!! Anyone have any ideas about what might be causing this?

kwong · ‎07-24-2009

I'm in a similar boat. I've gotten the 2.6.25 kernel up and running on my board but when I try enabling initramfs/initrd support + SLUB, I get the same type of problems... here are the startup messages:

Linux version 2.6.25-svn1-dirty2 (ken@ubuntu) (gcc version 4.2.3 (Sourcery G++ Lite 4.2-125)) #11 Thu Jul 23 23:27:30 EDT 2009
starting up linux startmem 0x2b4000, endmem 0x8000000,         size 125MB
console [ttyS0] enabled
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 16312
Kernel command line: root=/dev/nfs
PID hash table entries: 512 (order: 9, 2048 bytes)
Console: colour dummy device 80x25
Dentry cache hash table entries: 16384 (order: 3, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 2, 32768 bytes)
Memory: 127496k/127536k available (1816k kernel code, 1616k data, 96k init)
SLUB: Genslabs=13, HWalign=16, Order=0-2, MinObjects=8, CPUs=1, Nodes=1
Mount-cache hash table entries: 1024
net_namespace: 152 bytes
NET: Registered protocol family 16
Linux/m68k PCI BIOS32 revision 0.05
ColdFire PCI Host Bridge (Rev. 0) detected:MEMBase d0000000,MEMLen 7ffffff,IOBase 0,IOLen ffff
arb_interrupt
init_coldfire_pci: MEMBase_phy d0000000, Virt d0000000, len 8010000
PCI: Probing PCI hardware
NET: Registered protocol family 23
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 0, 8192 bytes)
TCP established hash table entries: 4096 (order: 2, 32768 bytes)
TCP bind hash table entries: 4096 (order: 1, 16384 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
TCP reno registered
Bad page state in process 'swapper'
page:002cdb10 flags:0x00000400 mapping:00000000 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Stack from 0702df00:
       <0> 0702df10<0> 00000000<0> 00051c28<0> 00220455<0> 002622c4<0> 002cdb10<0> 000524bc<0> 002cdb10
       <0> 00000085<0> 00000000<0> 00000018<0> 002622c4<0> 00274085<0> 00052568<0> 002cdb10<0> 00000000
       <0> 00262f2e<0> 002cdb10<0> 00000000<0> 00000005<0> 00000001<0> 00000000<0> 00000000<0> 0021a077
       <0> 00262e40<0> 0002dc42<0> 0004ba72<0> 00041d8e<0> 00000000<0> 00000085<0> 00000000<0> 00000000
       <0> 00263894<0> 00274000<0> 00000085<0> 00000000<0> 00000000<0> 002729f8<0> 0002dc42<0> 0025e224
       <0> 00000000<0> 00000005<0> 00000001<0> 00000000<0> 00000001<0> 07f30f90<0> 0002146a<0> 0025e1a0
Call Trace:
       <0> [<000214a2>]
m547x_8x DMA: Initialize Multi-channel DMA API v1.0
Unable to handle kernel access at virtual address 30303030
Oops: 00000000
PC: [<00050cc6>] __rmqueue_smallest+0x72/0x120<0>
SR: 2700 SP: 0702dcd8 a2: 0702a000
d0: 00200200    d1: 00000003    d2: 00000000    d3: 00000002
d4: 00000000    d5: 00278cc8    a0: 00348cf1    a1: 00348d08
Process swapper (pid: 1, stackpage=0702c000)
Stack from 0702dcd8:
       <0> 00000003<0> 00000000<0> 00000002<0> 00000000<0> 00278cc8<0> 00348cf1<0> 00348d08<0> 0702a000
       <0> 00200200<0> ffffffff<0> 00000000<0> 30303030<0> 00000000<0> 480a2700<0> 00050cc6<0> 00000000
       <0> 00000000<0> 00000001<0> 00000000<0> 00278cc8<0> 0029f658<0> 00000000<0> 00278cec<0> 000515f6
       <0> 00278cc8<0> 00051616<0> 00278cc8<0> 00000000<0> 00000000<0> 00000000<0> 00000000<0> 00000001
       <0> 00000000<0> 00278cc8<0> 0029f658<0> 00000000<0> 00278cec<0> 000515f6<0> 00278cc8<0> 0005184c
       <0> 00278cc8<0> 00000000<0> 00000000<0> 00000000<0> 00002004<0> 00278cec<0> 00000000<0> 002791fc
Call Trace:
       <0> [<00052938>]<0> [<00052b8a>]<0> [<0005264c>]<0> [<0004bfe8>]
       <0> [<0006ab80>]<0> [<000fe3fa>]<0> [<0006b24e>]<0> [<00067196>]
       <0> [<00067196>]<0> [<00081070>]<0> [<00081c1e>]<0> [<00041d8e>]
       <0> [<000671cc>]<0> [<00067dbe>]<0> [<00067d0e>]<0> [<00070d6c>]
       <0> [<0004ba72>]<0> [<000709fe>]<0> [<00067d0e>]<0> [<0002dc42>]
       <0> [<0002146a>]<0> [<000214a2>]
Kernel panic - not syncing: Attempted to kill init!

I don't suppose you managed to figure out what the root cause was, did you?

jkimble · ‎07-24-2009

Unfortunately no. I've not resolved this yet. I've moved my flash base address (so it's not at the very end of addressable memory) and checked my initial FlexBus mask size (and configuration). I've tried everything I can think of. Results are always the same: I can get 38MB of flash but no more.

Again, I had no trouble with this on the 2.6.10 kernel. I can't imagine why I'm having to fight it with the later one.

truregs · ‎06-23-2009

Hi,

I have a similar problem with my board (which is limited to 32 MB of RAM), but the filesystem size limit seems to be around 16MB.

I chose to try another allocator scheme (SLAB is the default), and when I turned on SLUB, everything worked fine.

You may want to try that:

In the kernel configuration, under "General setup", go to "Choose SLAB allocator" and select "SLUB (Unqueued Allocator)".

As a (good) side effect, it appears to lower the number of lost UDP messages (as seen with iperf); UDP traffic is also a heavy user of the kernel allocator.
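A quick way to confirm which allocator a given build actually uses is to look at the generated .config. The sketch below assumes the usual CONFIG_SLAB / CONFIG_SLUB / CONFIG_SLOB symbols; the sample text is made up for illustration and is not taken from anyone's board configuration.

```python
def selected_allocators(config_text):
    """Return the allocator symbols that are set to 'y' in kernel .config text."""
    candidates = ("CONFIG_SLAB", "CONFIG_SLUB", "CONFIG_SLOB")
    enabled = []
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_SLAB is not set" are comments and are skipped.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key in candidates and value == "y":
            enabled.append(key)
    return enabled

# Hypothetical .config excerpt, for illustration only.
sample = """\
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
"""

print(selected_allocators(sample))
```

To check a real build, read the .config from the kernel build directory and pass its contents to the function instead of the sample string.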

jkimble · ‎06-23-2009

Interesting... My kernel was already configured for SLUB. However, when I changed it to SLAB I was then able to boot to the full file size capacity. Unfortunately, it also takes about 5 minutes to go through the "Empty flash at..." messages and mount the file system, and every file system operation I do seems to take forever.

I've traded one problem for a whole set of others. It's something to look at though. Maybe I've got other configurations that have to change to go from SLUB to SLAB. I don't even know what SLUB vs. SLAB is at the moment, but I appreciate the suggestion. Gives me something new to look at anyway.

fsl_linux_spt · ‎06-25-2009

That is really weird.

I would definitely stick with SLUB and not SLAB. The BSP changed to SLUB and saw a significant performance enhancement in the overall system. Not sure what the exact reason for that is.

When you boot via NFS, is the kernel able to access all of the root partition? Can you manually access the partition at that point?

Also, what does flinfo show in U-Boot for your flash? Does the flash mapping in U-Boot line up with the physical start address of flash and the NOR flash base address in the kernel?
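The kernel side of that comparison can be read straight from the boot log. Below is a small sketch that pulls the base and length out of the physmap probe line quoted in the first post and compares the base with a U-Boot value. The U-Boot base here is an assumption entered by hand (0xFC000000, the figure mentioned elsewhere in this thread); the sketch does not parse real flinfo output.

```python
import re

def parse_physmap_line(line):
    """Return (base, length) from a 'physmap platform flash device' log line."""
    m = re.search(
        r"physmap platform flash device: ([0-9a-fA-F]+) at ([0-9a-fA-F]+)", line
    )
    if m is None:
        raise ValueError("not a physmap probe line")
    return int(m.group(2), 16), int(m.group(1), 16)

LOG_LINE = "physmap platform flash device: 03ffffff at fc000000"
UBOOT_FLASH_BASE = 0xFC000000  # assumption: value typed in from U-Boot's flinfo

kernel_base, kernel_len = parse_physmap_line(LOG_LINE)
print(f"kernel base={kernel_base:#010x} length={kernel_len:#010x}")
print("bases match" if kernel_base == UBOOT_FLASH_BASE else "bases differ")
```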

jkimble · ‎06-25-2009

It's actually stranger than that. When I boot with the bootargs set to:

root=/dev/nfs rw nfsroot=137.237.242.29:/tftpboot/ltib ip=137.237.242.177:137.237.242.29:137.237.242.13:255.255.255.0:ColdFire:eth0ff mtdparts=physmap-flash.0:4M(kernel)ro,59M(root)

Things go swimmingly. Boots up, no issues. However when I burn things to flash and start it up with bootargs set to:

root=/dev/mtdblock1 rw rootfstype=jffs2 mtdparts=physmap-flash.0:4M(kernel)ro,59M(root)

I get all those errors and a kernel panic.

U-Boot (flinfo) shows the correct mapping (I had to change the base to 0xFC000000 because of the 64MB).

This is really weird because I had no major problems with the 2.6.10 kernel with 64MB at all. Our flash reports a later CFI version than 2.6.10 supported, but I was able to modify the kernel to accept it with no problems.

fsl_linux_spt · ‎06-26-2009

Have you tried posting this to the mtd mailing list?

http://www.linux-mtd.infradead.org/

jkimble · ‎06-26-2009

I have now. It just seems strange that going from 2.6.10 to 2.6.25 would raise this problem. It feels more like a kernel configuration issue than a bug in the flash subsystem.

If I learn anything more I'll repost here. Thanks,