Hi everyone,
I eventually figured out the problem. It was a truly nasty one!
Basically, the processor hit a double fault (a fault-on-fault, as the Freescale manual calls it).
In the "resume" function, when the context of the task the processor is switching to is restored, the processor takes an access exception because the page containing the stack was evicted from the TLB some time earlier. But in order to process the exception, the stack must be accessible so that the CPU can save the return address and the exception data. The CPU therefore enters a fault-on-fault condition, from which only a reset can recover it.
It looks like the Freescale people had already run into this problem when they ported the 2.6.10 kernel to ColdFire. Their solution was to set up some locked TLB lines covering a portion of RAM (8 MB) and to allocate all thread stacks in that portion. The trick was to declare this part of memory as the DMA zone and to pass the GFP_DMA flag when the stack for a new thread is allocated (by the alloc_thread_info function).
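To make the idea concrete, here is a small user-space model of "all stacks come from the locked region". It is only a sketch and not the code from the Freescale port: LOCKED_BASE, STACK_SIZE and both function names are hypothetical, and only the 8 MB figure comes from the description above. In the real kernel the same effect is obtained through the page allocator and GFP_DMA rather than a bump pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values: only the 8 MB size comes from the post. */
#define LOCKED_BASE 0x40000000u            /* hypothetical start of the locked region */
#define LOCKED_SIZE (8u * 1024u * 1024u)   /* 8 MB covered by locked TLB lines */
#define STACK_SIZE  (8u * 1024u)           /* hypothetical per-thread stack size */

/* Bump pointer: next unused address inside the locked region. */
uint32_t next_free = LOCKED_BASE;

/* Returns 1 if [addr, addr + len) lies entirely inside the locked region. */
int in_locked_zone(uint32_t addr, uint32_t len)
{
    return addr >= LOCKED_BASE &&
           len <= LOCKED_SIZE &&
           addr - LOCKED_BASE <= LOCKED_SIZE - len;
}

/* Hands out one stack from the locked region, or 0 once it is exhausted. */
uint32_t alloc_stack_locked(void)
{
    uint32_t addr = next_free;

    if (!in_locked_zone(addr, STACK_SIZE))
        return 0;
    next_free += STACK_SIZE;
    return addr;
}
```

Because every stack handed out this way sits under a locked TLB line, the exception frame can always be written and the fault-on-fault cannot occur on a stack access. The price is a hard limit on the number of threads: with these assumed sizes, 8 MB / 8 KB = 1024 stacks.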
The port of the 2.6.23 kernel to the MCF54455 used a simpler version of this solution: there they map 256 MB of RAM with locked TLB lines. This covers any reasonable amount of RAM that the processor is ever going to use, so they don't need to specify GFP_DMA when allocating stack memory. Unfortunately, this solution is not applicable to the MCF5475 because the maximum size of a TLB line is 1 MB (as opposed to 16 MB on the MCF54455). With 1 MB lines you can't cover more than 32 MB of RAM (even if you are willing to use the whole TLB), and the EVB carries 64 MB.
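The arithmetic behind this can be written down in a few lines. This is just a sketch: the function names are made up, and the figure of 32 usable lines on the MCF5475 is inferred from the 32 MB limit quoted above rather than taken from the manual.

```c
#include <assert.h>

/* Largest amount of RAM (in MB) that `lines` locked TLB lines can map. */
unsigned max_locked_coverage_mb(unsigned lines, unsigned line_mb)
{
    return lines * line_mb;
}

/* Number of locked lines needed to map `ram_mb` MB, rounded up. */
unsigned lines_needed(unsigned ram_mb, unsigned line_mb)
{
    assert(line_mb > 0);
    return (ram_mb + line_mb - 1) / line_mb;
}
```

For the MCF54455 case, lines_needed(256, 16) gives 16 lines. For the MCF5475, lines_needed(64, 1) gives 64 lines, while max_locked_coverage_mb(32, 1) is only 32 MB, so the 64 MB on the EVB cannot be fully covered.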
Regards,
Federico Ulivi
Embedded software developer
Sky-Technology srl
V. Gonin, 55
20147 Milano - Italia
www.skytechnology.it