>MCF5216
>512k of FLASH from 0x00000000 to 0x00080000. So the problem is happening at the boundary between the first and second 256k parts. That must mean something.
So it seems; however, this is not always the case. For example, sin() is called, resides in the upper 256k, and executes okay. Also, if the instruction is executed in single-step mode it runs fine, i.e. run to the problem, single-step the instruction, then execute at full speed, and it works fine.
> bloat.c
>It should make testing easier if you put the upper half of your test code in a "named section" (probably with a pragma of some sort) and then change the linker file to put that where you want it. You'll then be able to position that
>section at different addresses like 0x3FFFe, 0x40000, 0x40002, 0x40004 and so on to try and decode what the problem is.
Will try this.
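For my own notes, a minimal sketch of what I understand by a "named section", assuming GCC-style attribute syntax. CodeWarrior has its own section pragma, so the macro, the section name `.upper_test` and the function name are all placeholders I made up, and the linker command file would still need a matching entry to pin the section to the address under test.

```c
#include <stdint.h>

/* Hypothetical section name. The linker file must place ".upper_test"
   at the address being tested (0x3FFFE, 0x40000, 0x40002, ...). */
#if defined(__GNUC__) && defined(__ELF__)
#define UPPER_TEST __attribute__((section(".upper_test"), noinline))
#else
#define UPPER_TEST /* other toolchains: use their own section pragma */
#endif

/* Stand-in for the code that currently lands across the 256k boundary. */
UPPER_TEST uint32_t upper_half_test(uint32_t x)
{
    return (x * 3u) + 1u;
}
```

The idea is that only the linker file changes between runs, so the same object code can be slid across the boundary.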
> Adding calls at 75 bytes/call
>That has to be an even number. The ColdFire can't execute on odd boundaries. Function addresses are normally rounded up to the next 4-byte boundary, and sometimes to a 16-byte boundary.
Sorry, the function is 0x75 (117) bytes.
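To keep the address arithmetic straight, here is a small sketch of the rounding and of a check for whether a function spans the 256k boundary. The 0x00040000 boundary comes from the 512k/256k split discussed above; the helper names are my own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FLASH_BLOCK_BOUNDARY 0x00040000u  /* start of the upper 256k */

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0u && (align & (align - 1u)) == 0u);
    return (addr + align - 1u) & ~(align - 1u);
}

/* True if [start, start + size) has bytes on both sides of the boundary. */
static bool straddles_boundary(uint32_t start, uint32_t size)
{
    return size != 0u
        && start < FLASH_BLOCK_BOUNDARY
        && (start + size) > FLASH_BLOCK_BOUNDARY;
}
```

With 4-byte alignment, a 0x75-byte function occupies 0x78 bytes, so each added call should move the following code by 0x78.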
> Below is the ‘exception stack frame’ contents
> Don't decode it. Please capture the raw and complete stack frame (with the value of the stack pointer) on an instruction before the crash and then again after. There might be multiple stack frames from multiple exceptions, and the previous
> ones might give more information. Since the CPU is crashing and then entering the debugger, it may not be giving you the correct frame information as the entry into the debugger may not have been "clean".
Will do this, although I don't know how to achieve it, as it crashes inside the lib function.
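One approach I am considering is to copy the raw longwords at the stack pointer into a RAM buffer from the catch-all exception handler, without decoding anything. This is only a sketch: the struct, the function name and the 16-word depth are my own choices, and it assumes an assembly stub passes the stack pointer value in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_DUMP_WORDS 16u  /* arbitrary: room for several stacked frames */

typedef struct {
    uintptr_t sp;                     /* stack pointer value at capture time */
    uint32_t  raw[FRAME_DUMP_WORDS];  /* undecoded longwords starting at sp */
} frame_dump_t;

/* Copy the raw stack contents starting at sp into out, untouched. */
static void capture_frame(frame_dump_t *out, const volatile uint32_t *sp)
{
    size_t i;
    assert(out != NULL && sp != NULL);
    out->sp = (uintptr_t)sp;
    for (i = 0; i < FRAME_DUMP_WORDS; i++) {
        out->raw[i] = sp[i];
    }
}
```

Taking one dump just before the failing call and one in the handler would give the "before" and "after" frames asked for.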
>Suggestions.
>First, you may have a faulty part with hard or soft errors in its FLASH. Try another part.
Tried parts from different batches. Same problem.
>Check that all of the Power Supplies and Grounds are connected, noise free, bypassed and at the proper voltage.
Will do this today
>Read the Errata and make sure you don't have a very old part with a datecode prior to XXX0327.
Already ruled this out. Date code on our devices is XXX1122
>Make sure your code isn't in violation of SECF005. There must be a "NOP" after all writes to CACR.
We don't use cache. Cache is off.
>Check the value of FLASHBAR against the manual.
From mcf5282_lo.s
Initialize FLASHBAR: locate internal Flash and validate it
Initialize RAMBAR0: This is the FLASHBAR
Leaving bit 6 in the FLASHBAR register cleared can cause corrupted fetches from the device's internal flash, so set bit 6 here. See chip errata SEC004
    move.l  #(___FLASH_START + 0x161),d0
    movec   d0,RAMBAR0
From mcf5282_lo.s.list
0x00000010: 203c00000161 move.l #(___FLASH_START + 0x161),d0
0x00000016: 4e7b0c04 movec d0,RAMBAR0
From manual
0xC04 Flash Base Address Register 0
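As a sanity check on the value itself, 0x161 breaks down into bits 8, 6, 5 and 0. Bit 6 is the one the errata comment requires. I am assuming bit 0 is the valid bit, and what bits 8 and 5 mean should be checked against the FLASHBAR description in the manual. The helper below is just my own way of listing the bits.

```c
#include <assert.h>
#include <stdint.h>

#define FLASHBAR_INIT_FLAGS 0x161u  /* low bits from the move.l above */

/* Return 1 if bit n (0 = least significant) is set in value, else 0. */
static int bit_is_set(uint32_t value, unsigned n)
{
    assert(n < 32u);
    return (int)((value >> n) & 1u);
}
```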
>The CPU seems to be reading a bad value from the FLASH when it crosses that boundary. It may have been programmed/written wrong, so you should dump the Flash and compare the function that fails with the HEX or BIN file.
This is done when the flash programmer performs a verify, does it not? (integral flash programmer supplied with CW 6.4)
>Fill the FLASH with data patterns (all zeros, all ones, alternating bits) and verify that it reads back OK. Your problem looks like it might be a "stuck bit" that only causes a problem when an instruction is in the right location to be corrupted "just so".
Not done yet.
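When I get to it, the readback check could look something like the sketch below. It assumes the flash has already been programmed with a single 32-bit pattern (for example 0x00000000, 0xFFFFFFFF, 0x55555555 or 0xAAAAAAAA) and reports every bit position that ever read back wrong. The function name is mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* OR together every bit position that read back different from pattern.
   A result of 0 means the whole region matched. */
static uint32_t bad_bits(const volatile uint32_t *readback, size_t words,
                         uint32_t pattern)
{
    uint32_t diff = 0u;
    size_t i;
    assert(readback != NULL);
    for (i = 0; i < words; i++) {
        diff |= readback[i] ^ pattern;
    }
    return diff;
}
```

A bit that shows up in the result for one pattern but not its complement would point to a stuck bit rather than a generally bad programming cycle.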
>Make sure the Programmer has been told how fast your CPU clock is. When programming the FLASH it has to set CFMCLKD up to generate a valid clock. If this is out of spec the FLASH may be programmed badly and may be returning bad values as a result (it may be "half-programmed").
If this were the case, why would it only affect code written to the upper 256k? Also, I would not be able to single-step the instruction without it crashing. However, I will check it.
>If the FLASH looks OK when read from the debugger, write some code to read it to RAM and then compare with the expected values.
Not done yet
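A rough sketch of the compare I have in mind, assuming the expected bytes of the failing function are available as an array (for example pasted in from the HEX or BIN file). It returns the offset of the first differing byte so the bad address can be worked out. The function name is my own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the first byte where flash and expected differ,
   or -1 if the first len bytes are identical. */
static ptrdiff_t first_mismatch(const volatile uint8_t *flash,
                                const uint8_t *expected, size_t len)
{
    size_t i;
    assert(flash != NULL && expected != NULL);
    for (i = 0; i < len; i++) {
        if (flash[i] != expected[i]) {
            return (ptrdiff_t)i;
        }
    }
    return -1;
}
```

Running this from code executing at full speed, rather than through the debugger, should show whether the CPU itself sees different data.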
>If you're running with the Cache enabled, see if running with it off changes anything (single reads versus cache-line-fill burst-reads). Try invalidating the Cache before your test and see if it changes. Try all combinations of CACR[DISD] and CACR[DISI]. Run with the Cache Off on the FLASH but test with CACR[CEIB] on and off. Ditto CACR[DBWE] just in case.
Not using cache.
>Check the values written to CACR, ACR0 and ACR1. Are you mapping the Flash with one of the ACRs? Is it masking all or half of the Flash? Try changing the ACR to only map the upper or lower half and see if the symptoms change.
We don't use cache
>Change the CPU speed. Make it faster and slower and see if that changes anything. Make sure you don't have any interrupts or DMA enabled as that will complicate things. Make sure the Watchdog is disabled.
Not done yet
>Assuming none of the above helped, I suspect the CPU is fetching a bad instruction and then jumping somewhere stupid. It then gets an exception. If there's no proper exception handler it will keep getting errors until it finally gets a fatal error, or runs into something the debugger has trapped, or bombs the stack. By that time it has had so many faults it has wiped out any information that may help you. So you want to stop it ASAP. So you should add interrupt vectors for *ALL* interrupts (so that all 255 of them are covered). Point them to a small handler and put a breakpoint in there. That may stop it sooner. Fill the stack with a known pattern so you can see where the CPU has been.
I already trap all exceptions and have implemented this just as described above. However, initialising the stack to a set pattern is a good tip, thanks.
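For the stack pattern, I am thinking of something along these lines. It is a sketch only: the pattern value and function names are mine, and it assumes the stack region's lowest address and size are known from the linker file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu  /* arbitrary, easy to spot in a dump */

/* Paint the whole stack region before it is used (very early in startup). */
static void stack_paint(volatile uint32_t *lowest, size_t words)
{
    size_t i;
    assert(lowest != NULL);
    for (i = 0; i < words; i++) {
        lowest[i] = STACK_FILL_PATTERN;
    }
}

/* Count words, starting at the lowest address, that still hold the pattern.
   The stack grows downwards, so these are words the CPU never reached. */
static size_t stack_untouched_words(const volatile uint32_t *lowest,
                                    size_t words)
{
    size_t n = 0;
    assert(lowest != NULL);
    while (n < words && lowest[n] == STACK_FILL_PATTERN) {
        n++;
    }
    return n;
}
```

Checking the untouched count in the exception handler should show how deep the stack went before the crash.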
>If all else fails, program a PIT or DMA timer to give an IPL7 interrupt. Then use it as a "timed breakpoint". Start it in your code that is "heading for the boundary" and try to set the timer to get an interrupt exactly on the first instruction that it goes crazy on. The Stack Frame will show the PC where it was interrupted.
Will try this as a last resort.
Thanks for the advice. It will take some time for me to get round to all the suggestions. I'm going to start with a simple base project to see if I can replicate the issue on a simpler app, and take it from there. I will update this post with my findings.
Dave