> (The value of the fault bits is 0b0000 and the reference manual of the device marks this as 'Reserved'.)
Look closely at the paragraph before "Table 3-7. Fault Status Encodings". It says:
"There is a 4-bit fault status field, FS[3:0], at the top of the system stack. This field is defined for access and address errors only and written as zeros for all other exceptions."
So for an illegal instruction the FS field should be zero, which is what you're seeing. The 'Reserved' entry in the table only matters for access and address errors. So you CAN trust the stack frame.
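If it helps to check the frame by hand, here is a minimal sketch in C of pulling the fields out of the first longword of the exception frame. It assumes the usual ColdFire frame layout (format in bits 31:28, FS[3:2] in 27:26, vector in 25:18, FS[1:0] in 17:16, SR in 15:0); confirm that against the reference manual for your part. The type and function names are made up for illustration.

```c
#include <stdint.h>

/* Fields of the first longword of a ColdFire exception stack frame.
 * Assumed layout (verify against your device's reference manual):
 *   [31:28] format, [27:26] FS[3:2], [25:18] vector,
 *   [17:16] FS[1:0], [15:0] status register.
 * The longword that follows it on the stack is the saved PC. */
typedef struct {
    unsigned format;        /* frame format field */
    unsigned fault_status;  /* FS[3:0], reassembled from its two halves */
    unsigned vector;        /* exception vector number */
    unsigned sr;            /* status register at the time of the exception */
} frame_info;               /* hypothetical name */

static frame_info decode_frame_word(uint32_t w)
{
    frame_info f;
    f.format       = (w >> 28) & 0xFu;
    /* FS is split: upper two bits at 27:26, lower two bits at 17:16. */
    f.fault_status = (((w >> 26) & 0x3u) << 2) | ((w >> 16) & 0x3u);
    f.vector       = (w >> 18) & 0xFFu;
    f.sr           = w & 0xFFFFu;
    return f;
}
```

For example, a frame word of 0x40102700 would decode as format 4, vector 4 (illegal instruction), FS = 0 and SR = 0x2700, which is the "FS is zero" case described in the manual text quoted above.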
In fact, try to decode further back up the stack than you're doing now. You may see a consistent pattern where the earlier stack frames are always the same, and it is only the "last frame" that seems to be random. That may help you zero in on the real cause. Examine all of the CPU registers too. The Address Registers should contain pointers to the data structures that were last being accessed, and that might tell you which function the code was in.
> the exception occurs sometimes points to a FLASH address and
> sometimes to a RAM address (very inconsistent).
If you're getting illegal instructions from RAM then ... I was going to say it is unlikely to be that Errata item as that should only affect FLASH fetches, but then I realised you're probably not executing from RAM and the CPU is probably going "off the rails" long before the access that caused the trap you're seeing. A CPU can "bounce around" randomly for a long time before it does something truly illegal and finally stops.
Before suspecting the CPU, consider that it might be a software bug. The usual cause is stack corruption causing the CPU to return somewhere stupid when it does an interrupt or function return.
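One cheap way to look for a stack problem is to "paint" the stack with a known pattern at startup and later check how much of the pattern survives. The sketch below is only an illustration of that idea in C: the function names and the fill value are my own, the stack is assumed to grow down towards its base address, and on a real target the base and size would come from your linker file. It only catches the stack overflowing its region, not a buffer overrun that tramples a return address in the middle of the stack.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* arbitrary, easy-to-spot pattern */

/* Fill the stack region with the pattern. Call this once at startup,
 * before the region is in use as a stack. */
static void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count the words at the low end of the region that still hold the
 * pattern. With a stack that grows down towards 'base', this is the
 * amount that has never been used. A result of 0 means the stack has
 * reached (and probably gone past) the bottom of its region. */
static size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

Call stack_unused_words() from the exception handler or look at the region from the debugger after the crash. If the result is zero or close to it, the stack is the first place to look.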
Workaround One is easy to implement. I'd do that first. You only have to set FLASHBAR[6] to "1". You may even be able to set this from the debugger. If you set that and the problem goes away, you've proved what the problem is. If the problem remains, start looking for software bugs corrupting the stack.
Tom