NOTE: I'm responding to my own question to post direct responses that I received from Niels Roest, one of the developers at directfb.org. He declined to participate in the Freescale Coldfire forum but gave me permission to post his thoughts here:
/************* Niels Roest's response # 1 ****************/
Hi Allen,
just to give my thoughts on the matter.
Martin forwarded your mail, including the problem description at
http://forums.freescale.com/freescale/board/message?board.id=CFCOMM&thread.id=7192
Your analysis is correct; but I have no ready solution so we need to dig a bit.
in fusion.c, the offending "shared->refs = 1" explains the address of 0x20000004.
In answer to your question "Does anyone know the significance of address 0x20000000 in the mmap() call in fusion.c?": this is a reasonably arbitrary address, chosen to prevent the need to remap and recalculate addresses when passing them between processes. It has worked reasonably nicely in the past, only the sh4 chip and some mipses had a collision and were remapped, I believe, but only for the shared memory pools, not for this "fusion world" shared area of 2300 bytes.
What I would be interested in is your memory mapping right after the mmap.
If you put a sleep just before the write to "shared->refs", and dump /proc/<processnumber>/maps (or mmaps, or whatever), I am curious to see if there are any collisions at this address.
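An alternative to sleeping and running "cat" from a second shell is to have the process print its own memory map just before the write. The following is a minimal sketch, assuming Linux with /proc mounted; dump_self_maps() is a hypothetical helper for illustration and is not part of Fusion:

```c
#include <stdio.h>
#include <string.h>

/* Copy /proc/self/maps to `out`.
 * Returns the number of complete lines written, or -1 if the file
 * could not be opened (for example, /proc is not mounted). */
static int dump_self_maps(FILE *out)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char buf[512];
    int lines = 0;

    if (maps == NULL)
        return -1;

    while (fgets(buf, sizeof buf, maps) != NULL) {
        fputs(buf, out);
        /* Count a line only when its terminating newline was read. */
        if (strchr(buf, '\n') != NULL)
            lines++;
    }

    fclose(maps);
    return lines;
}
```

Calling dump_self_maps(stderr) right before the "shared->refs = 1" line should show the same information as "cat /proc/<pid>/maps", without needing a second shell.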
Greets
Niels
/************* My "follow-up" response # 1 ****************/
Hi Niels,
Thanks very much for your prompt response! ...
I added a debugging printf() and a 30 second sleep just before "shared->refs = 1" in "fusion.c"... Here are my results while running "dfbinfo"...
[NOTE: not posted because it looks the same as the output in my initial question/posting in this forum]
... and the output from calls to "ps" and "cat /proc/<pid>/maps" while "dfbinfo" was sleeping (just prior to the segfault):
root@freescale:~# ps
PID Uid VmSize Stat Command
1 root 840 S init
2 root SW< [kthreadd]
3 root SW< [ksoftirqd/0]
4 root SW< [watchdog/0]
5 root SW< [events/0]
6 root SW< [khelper]
36 root SW< [kblockd/0]
45 root SW< [kseriod]
63 root SW [pdflush]
64 root SW [pdflush]
65 root SW< [kswapd0]
119 root SW< [aio/0]
250 root SW< [mtdblockd]
258 root SW< [spi_coldfire]
271 root SW< [rpciod/0]
292 root 616 S /sbin/syslogd
295 root 592 S /sbin/klogd
317 root 712 S /usr/sbin/inetd
319 bin 440 S /sbin/portmap
323 root 688 S /usr/sbin/dropbear
329 root 1384 S -sh
332 root 968 S /usr/sbin/dropbear
333 root 968 S /usr/sbin/dropbear
335 root 1360 S -sh
336 root 1360 S -sh
355 root 1296 S dfbinput --dfb:debug
356 root 792 R ps
root@freescale:~#
root@freescale:~# cat /proc/355/maps
20000000-20002000 rw-s 00000000 00:0b 3975544 /dev/fusion/0
80000000-80004000 r-xp 00000000 00:0b 3917515 /usr/bin/dfbinput
80004000-80006000 rw-p 00002000 00:0b 3917515 /usr/bin/dfbinput
80006000-8001e000 r-xp 00000000 00:0b 3883884 /lib/ld-2.5.so
8001e000-80020000 rw-p 00016000 00:0b 3883884 /lib/ld-2.5.so
80020000-80022000 rw-p 80020000 00:00 0
80024000-800fe000 r-xp 00000000 00:0b 3917529 /usr/lib/libdirectfb-1.4.so.0.0.0
800fe000-80108000 rw-p 000d8000 00:0b 3917529 /usr/lib/libdirectfb-1.4.so.0.0.0
80108000-8010a000 rw-p 80108000 00:00 0
8010a000-8012c000 r-xp 00000000 00:0b 3917532 /usr/lib/libfusion-1.4.so.0.0.0
8012c000-80130000 rw-p 00020000 00:0b 3917532 /usr/lib/libfusion-1.4.so.0.0.0
80130000-80152000 r-xp 00000000 00:0b 3917526 /usr/lib/libdirect-1.4.so.0.0.0
80152000-80156000 rw-p 00020000 00:0b 3917526 /usr/lib/libdirect-1.4.so.0.0.0
80156000-80158000 rw-p 80156000 00:00 0
80158000-8015a000 r-xp 00000000 00:0b 3884005 /lib/libdl-2.5.so
8015a000-8015c000 rw-p 00000000 00:0b 3884005 /lib/libdl-2.5.so
8015c000-8016a000 r-xp 00000000 00:0b 3884029 /lib/libpthread-0.10.so
8016a000-8016e000 rw-p 0000c000 00:0b 3884029 /lib/libpthread-0.10.so
8016e000-801b0000 rw-p 8016e000 00:00 0
801b0000-802ac000 r-xp 00000000 00:0b 3883946 /lib/libc-2.5.so
802ac000-802b4000 rw-p 000fa000 00:0b 3883946 /lib/libc-2.5.so
802b4000-802b6000 rw-p 802b4000 00:00 0
802b6000-802c8000 r-xp 00000000 00:0b 3917925 /usr/lib/libz.so.1.2.3
802c8000-802ca000 rw-p 00010000 00:0b 3917925 /usr/lib/libz.so.1.2.3
802ca000-803ca000 rw-p 802ca000 00:00 0
803ca000-803dc000 r-xp 00000000 00:0b 3925656 /usr/lib/directfb-1.4-0/systems/libdirectfb_fbdev.so
803dc000-803e0000 rw-p 00010000 00:0b 3925656 /usr/lib/directfb-1.4-0/systems/libdirectfb_fbdev.so
bf8ee000-bf918000 rw-p bffd6000 00:00 0 [stack]
root@freescale:~#
Based on that, it looks to me like it should be working OK (i.e. - it doesn't look like virtual address 0x20000000 is conflicting with anything)... I suspect that virtual address 0x20000000 may "just not work" for some reason on the Coldfire's MMU... I've tried a few other addresses (without really understanding what I was doing) such as 0xA0000000, 0xC0000000, 0x40000000... If I remember correctly, the mmap() call returned "failure" in all those cases (so no segfault since Fusion didn't try to access the mapping)...
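One way to separate "the address range is unusable" from "the fusion device's mmap is broken" might be to probe the same addresses with a plain anonymous mapping. This is only a sketch under stated assumptions: probe_address() is a hypothetical helper, it uses an anonymous mapping rather than /dev/fusion/0, and it passes the address as a hint without MAP_FIXED so that nothing already mapped gets replaced:

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Request `len` bytes at `hint` as an anonymous read/write mapping.
 * Returns  1 if the kernel placed the mapping at `hint` and a write at
 *            offset 4 (where "shared->refs" faulted) reads back,
 *          0 if the kernel mapped the region somewhere else,
 *         -1 if mmap() itself failed. */
static int probe_address(uintptr_t hint, size_t len)
{
    void *p = mmap((void *)hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int result = 0;

    if (p == MAP_FAILED)
        return -1;

    if ((uintptr_t)p == hint) {
        volatile int *refs = (volatile int *)((char *)p + 4);
        *refs = 1;                 /* would fault here if the range were bad */
        result = (*refs == 1);
    }

    munmap(p, len);
    return result;
}
```

Running this for 0x20000000, 0x40000000, 0x90000000 and so on would show which ranges the kernel accepts at all. If an anonymous mapping at 0x20000000 can be written but the /dev/fusion/0 mapping at the same address cannot, that would suggest the problem is in the device mapping path rather than in the address itself.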
Thanks again for your help!
Allen
/************* Niels Roest's response # 2 ****************/
Hi Allen.
your "maps" looks exactly like mine, except that my node is called /dev/fusion0, and my shared size is 0x1000 instead of 0x2000 in your case. This is just PC, in my case.
I am starting to think that our implementation of the mmap is not working in your case.
It might be worth a try to see if you can get a closer look at fusion_mmap in fusiondev.c (fusion kernel module); put some printk's in there, you can always compare with your PC (and use e.g. X11 as directfb-system), which should work I hope.
Alternatively, you can try to map it to 0x9000.0000, which _should_ be working..
Not sure what else
I leave it to you to post it to a forum, though the Freescale guys might be the more logical approach here.
Would be interested in the solution.
hth
Niels
/************* My "follow-up" response # 2 ****************/
Thanks Niels!
I'm a bit confused as to why your character device node would be at /dev/fusion0 rather than /dev/fusion/0 ... Is that because you're working with "development code" and /dev/fusion0 is going to be the new convention?...
I understand that you see 0x1000 for the shared memory size as that's the PAGE_SIZE for x86 and most other architectures, while I believe ColdFire (m68k) has a PAGE_SIZE = 0x2000 (for reasons I don't understand yet)...
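The size difference in the two "maps" listings is consistent with plain page rounding: mappings are page-granular, so a shared area of roughly 2300 bytes (the figure Niels gave earlier) occupies one whole page on either system. A small sketch, where round_up_to_page() and current_page_size() are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round `len` up to a multiple of `page_size` (must be a power of two). */
static size_t round_up_to_page(size_t len, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (len + page_size - 1) & ~(page_size - 1);
}

/* Page size of the running system, or 0 if it cannot be determined. */
static size_t current_page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t)ps : 0;
}
```

With these, round_up_to_page(2300, 0x1000) gives 0x1000 and round_up_to_page(2300, 0x2000) gives 0x2000, matching the two listings. Printing current_page_size() on the target is a quick way to confirm which value the kernel actually uses.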
I did try changing the fixed address for the mmap() call to 0x90000000... But I still get a segfault attempting to access address 0x90000004...
So, yes, I believe I will soon be adding many printk's to the linux-fusion kernel module (and to the kernel) to increase my understanding of the code involved... And hopefully find a solution...
Let me know if you think of anything else I should be trying...
Thanks!
Allen
/************* Niels Roest's response # 3 ****************/
Hi Allen,
no sorry - our resident fusion guru is being ill at home.
I would do as you suggest - going the printk road.
/dev/fusion0 was renamed /dev/fusion/0 or the other way round, I wasn't around at that time.
The major number was also changed at that time, but I do not think you have such conflict since the "enter" ioctl has already completed successfully with sane output at that stage.
Greets
Niels
This was resolved with a patch but I don't recall the name of the patch.
You can find the resolution if you read the thread titled "MCF5485EVB, Linux 2.6.25 kernel: MMap broken..." started on 11/17/2008. Nabendu posts the changes to cf_pgtable.h to fix this. In my patch I don't think I added the CF_PAGE_DIRTY flag (but it's been a while..)
Thanks for your prompt response!
I probably should have mentioned that I am familiar with the forum thread and have already implemented the patch you mentioned... I work directly (although remotely) with jkimble and this was the first thing we both thought of when I saw this issue several days ago...
Just now I rechecked and rebuilt my kernel to make sure it contained Nabendu's change to "cf_pgtable.h"... and it does... and that patch does not help or change the issue I am seeing...
I'm guessing that you should be able to reproduce the issue using (mostly) the released Coldfire M547X_8X BSP and the "default" packages from LTIB's GPP... Just enable and build the DirectFB from LTIB (version 1.1.0)... Run "dfbinfo" on your target to see that DirectFB works in "Single Application Core" mode...
Then edit the .spec file for DirectFB to add "--enable-multi" to the ./configure line and rebuild the package... At this point you need to get the kernel module "linux-fusion" from here:
http://www.directfb.org/downloads/Core/linux-fusion/linux-fusion-7.0.1.tar.gz
Version 7.0.1 of the linux-fusion kernel module is the version that matches the DirectFB internal API for DirectFB v1.1.0 ... I just extracted the tarball directly into my kernel source tree in the directory ltib-xxxxx/rpm/BUILD/linux-2.6.25/drivers/char/fusion... and modified the ltib-xxxxx/rpm/BUILD/linux-2.6.25/drivers/char/Kconfig file so it would build "fusion.ko" along with the kernel... Then I let LTIB rebuild the kernel...
Then you will need to create a device node on your target like this:
mknod /dev/fusion/0 c 250 0
And load the module like this:
modprobe fusion
And then try to run "dfbinfo" again (now in Multi Application Core mode) and I believe you will see the issue...
I am trying to "cross post" these questions on the "directfb-users" mailing list but so far I have not even been able to join the list (i.e. - no confirmation e-mail to my join request yet)...
I still don't understand the significance of the Fusion library's call to mmap() with a fixed address of 0x20000000 + "some offset" (see the code segment from "fusion.c" in my original post)... I'm more familiar with calling mmap() with NULL as the first argument to let the kernel return whatever virtual address it wants to give the process...
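As far as I understand Niels' explanation, the point of a fixed address is that absolute pointers stored inside the shared area stay valid in every process. The sketch below only illustrates that general idea and is not how Fusion is implemented: struct region and both helpers are hypothetical, and copying the struct stands in for a second process mapping the same data at a different address.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical contents of a shared region. */
struct region {
    char  *name_abs;   /* absolute pointer: valid only at the original base */
    size_t name_off;   /* offset from the region base: valid at any base */
    char   name[32];
};

/* Fill in the region and record both ways of referring to `name`. */
static void init_region(struct region *r, const char *name)
{
    snprintf(r->name, sizeof r->name, "%s", name);
    r->name_abs = r->name;
    r->name_off = offsetof(struct region, name);
}

/* Resolve the name relative to wherever this copy of the region lives. */
static const char *name_via_offset(const struct region *r)
{
    return (const char *)r + r->name_off;
}
```

After "struct region a, b; init_region(&a, "fusion"); b = a;", name_via_offset(&b) points into b, while b.name_abs still points into a. Letting the kernel choose the address (NULL as the first mmap() argument) would therefore require offsets or some other address translation everywhere, which a fixed address avoids.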
I wonder if there is some "quirk" in the Coldfire kernel implementation of memory management that is conflicting with assumptions made in the DirectFB fusion or linux-fusion kernel module code...
This level and area of kernel code is really unfamiliar and cryptic to me so... Thanks for any help or ideas anyone can give!
Allen