
Wrong interrupt handler being called after period of normal operation (MCF5235)

Question asked by sanhadrin on Mar 30, 2013
Latest reply on Apr 1, 2013 by TomE

Hi all,

 

I'm having an issue when using the CAN controller: after a period of normal operation, the program throws an "illegal instruction" exception. After some troubleshooting, I was able to determine that it was trying to call an interrupt handler for interrupt source 63 on controller 0 (an unused interrupt). Unfortunately, I'm somewhat at a loss to determine why this is happening. From within the source 63 interrupt handler, SWIACK0 reports the interrupt source as 36 (PIT0), but PIT0 has its own handler, which is vectored without issue up to that point. I checked the vector table, thinking that corrupted function pointers might be the cause, but there's no corruption present. I'm not quite sure where to look next to troubleshoot the issue.

 

One key thing I noticed is that if the interrupt handler for the CAN interrupt sources (one handler handles all 18 sources) is set to level 7, the issue never happens, but if it is set to level 6 at the highest priority, it does. This seems to suggest that CAN interrupts not being handled in a timely fashion cause this behaviour, but I would like to be able to determine the source of the issue with the available information if at all possible.

 

Thanks for any help you can give. Below is an account posted at Stack Overflow that has unfortunately gone unanswered so far.

 

At a certain point in my C application (running on bare metal, in supervisor mode) when using the CAN controller via a third-party library, an Illegal Instruction fault was occurring, which is caught in an ISR; by that point, the program counter, fault, and return address in the exception stack frame available to the ISR were already 0. When I first encountered it, I was able to back up the stack a bit, and saw a stack trace like this:

Thread [1] <main> (Suspended : Step)

0x0

  0x41f42200

  ...

  timerInterrupt() at timer.c:1,175 0x2432ec

  0x41902210

  ...

  main() at main.c:1,433 0x211a44

I ran the application several times from a known state that could reproduce this issue quickly, usually down to the exact same stack trace and saved instruction at the point where the interrupt/exception occurred before the jump to 0x0. Through testing, I noticed that the jump would only happen on the instruction following interrupts being re-enabled after being disabled, or in a section of code where interrupts weren't masked. So, I figured that this must be a user interrupt causing the issue, though I wasn't sure why it would appear to try to call a handler that wasn't set when the interrupt wasn't enabled in the mask. I'm not 100% sure of the meaning of the addresses in the IPSBAR range that precede an ISR being called, but since they're the same for each call of that ISR, I figured I could use them to indicate the source of the last interrupt/exception.

So, I added a default interrupt handler to all interrupt vectors on interrupt controller 0 before the normal handlers were added and ran the application again - and lo and behold, a breakpoint set in the default handler was hit when that suspected interrupt was fired (e.g., the stack looked like this):
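The "default handler first, real handlers second" setup described above can be sketched roughly as follows. This is only an illustration under assumptions: `vector_table` is a plain array standing in for the RAM vector table that VBR points at, the names `install_handler`, `install_intc0_defaults`, `default_interrupt`, and `timer_interrupt_stub` are hypothetical, and INTC0 sources are assumed to occupy vectors 64 to 127. On the real target the handlers would also need the compiler's interrupt-function attribute or an assembly wrapper ending in RTE.

```c
#include <assert.h>
#include <stdint.h>

#define VECTOR_COUNT    256u  /* size of the ColdFire exception vector table */
#define INTC0_FIRST_VEC 64u   /* assumption: INTC0 source n uses vector 64 + n */
#define INTC0_LAST_VEC  127u

typedef void (*isr_t)(void);

/* Stand-in for the RAM vector table (hypothetical; not the real table). */
isr_t vector_table[VECTOR_COUNT];

/* Counts entries into the catch-all handler; a breakpoint goes here. */
volatile uint32_t default_hits;

void default_interrupt(void)
{
    default_hits++;
}

/* Placeholder for a real handler such as the PIT0 timer ISR. */
void timer_interrupt_stub(void)
{
}

/* Write one entry of the table. */
void install_handler(unsigned vector, isr_t handler)
{
    assert(vector < VECTOR_COUNT);
    assert(handler != 0);
    vector_table[vector] = handler;
}

/* Point every INTC0 vector at the catch-all before real handlers go in. */
void install_intc0_defaults(void)
{
    for (unsigned v = INTC0_FIRST_VEC; v <= INTC0_LAST_VEC; v++) {
        install_handler(v, default_interrupt);
    }
}
```

Calling `install_intc0_defaults()` and then `install_handler(100, timer_interrupt_stub)` leaves every unassigned INTC0 vector pointing at the catch-all, so a stray vector lands on a breakpoint instead of jumping to 0x0.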

Thread [1] <main> (Suspended : Step)

__DefaultInterrupt() at interrupts.c 0x41f42200

...

timerInterrupt() at timer.c:1,175 0x2432ec

0x41902210

...

main() at main.c:1,433 0x211a44

Observing the value of SWIACK0 in that function, I saw that the reported interrupt was 100 (user interrupt 36, the PIT0 interrupt). Well, that already has an ISR (timerInterrupt() in the stack above). I next checked the area of RAM where the ISR function pointers are stored to see if the timer interrupt handler's function pointer was corrupted, but there was no change between the time all interrupt handlers were set and the time the breakpoint in the default handler was hit.
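The relationship between the value 100 and source 36 can be written out as a small helper. This is a sketch that assumes the usual ColdFire mapping in which interrupt controller 0 uses vectors 64 to 127 (vector = 64 + source) and interrupt controller 1 uses vectors 128 to 191; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int      controller; /* 0 or 1 */
    unsigned source;     /* 0..63 within that controller */
} irq_origin_t;

/*
 * Translate a vector number (as read from an IACK register) into
 * controller and source. Returns 0 on success, -1 if the vector is
 * not in the assumed user-interrupt range 64..191.
 */
int decode_iack_vector(uint8_t vector, irq_origin_t *out)
{
    assert(out != 0);

    if (vector >= 64u && vector <= 127u) {
        out->controller = 0;
        out->source = vector - 64u;
        return 0;
    }
    if (vector >= 128u && vector <= 191u) {
        out->controller = 1;
        out->source = vector - 128u;
        return 0;
    }
    return -1; /* core exception vector or outside the assumed range */
}
```

Under that assumption, a value of 100 decodes to controller 0, source 36, and source 63 on controller 0 corresponds to vector 127.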

I also noticed that if I set the interrupt level of the interrupt handler for the CAN controller to 7 (the same handler services all 18 FlexCAN interrupt sources), the issue doesn't occur. I'm not sure what to make of it just yet, but it does point to either the CAN library or the controller as the culprit.
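Since the level and priority settings seem to matter here, one thing that may be worth checking is whether any two enabled sources share the same level/priority pair. The sketch below assumes an ICR layout with the level in bits 5:3 and the priority in bits 2:0, and assumes the interrupt controller expects unique level/priority combinations across sources; both points should be verified against the reference manual. The function names and the `icr` array (a copy of the ICR values indexed by source number) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define INTC_SOURCES 64u

/* Pack level and priority into the assumed ICR layout (IL 5:3, IP 2:0). */
uint8_t icr_encode(unsigned level, unsigned priority)
{
    assert(level <= 7u && priority <= 7u);
    return (uint8_t)((level << 3) | priority);
}

/*
 * Count sources whose level/priority pair repeats that of a
 * lower-numbered source. Sources at level 0 are treated as disabled
 * and skipped; index 0 is ignored.
 */
unsigned icr_count_duplicates(const uint8_t icr[INTC_SOURCES])
{
    unsigned duplicates = 0;

    for (unsigned i = 1; i < INTC_SOURCES; i++) {
        uint8_t cur = icr[i] & 0x3Fu;
        if ((cur >> 3) == 0u) {
            continue; /* level 0: source not enabled */
        }
        for (unsigned j = 1; j < i; j++) {
            if ((icr[j] & 0x3Fu) == cur) {
                duplicates++;
                break; /* count each source at most once */
            }
        }
    }
    return duplicates;
}
```

With 18 sources all set to the same level and priority, this check would report 17 duplicates, which at least makes the sharing explicit when comparing the level 6 and level 7 configurations.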

EDIT: I wasn't sure at this point exactly which ISR was handling the interrupt, but I've added individual handlers to the initially suspected interrupt sources, and it's always interrupt source 63 - which is an unused interrupt, according to the documentation, and the last one on interrupt controller 0.

EDIT 2: It occurred to me that the active interrupt source in SWIACK0 may actually be correct, and that there might be another issue, such as the vector base register (VBR) getting rewritten. Unfortunately, I'm not sure how to read it back, as it's a write-only value. I initially thought that the interrupt source for PIT0 was in SWIACK0 because the default interrupt handler was getting called from within the timer interrupt handler, but PIT0 is also reported when the timer interrupt isn't in the stack. The reference manual indicates that the on-chip debug module can be used to read back control registers and therefore VBR, but I don't see any information in the debug manual on how to do this.

To make a rambling story short, I want to find out the source of the jump to hyperspace, or what information I can use to get it.

  • What's the meaning of the addresses in the IPSBAR range getting pushed onto the stack?
  • Since those addresses seem to be completely tied to their source, is there a way to use a value in the stack (e.g., 0x41f42200 in the first example) to determine the source of the interrupt/exception that pushed it onto the stack?
  • Am I going about this completely wrong? I'm more than happy to abandon any and all of this line of thinking.
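One possible reading of those stack values, offered as an assumption rather than a confirmed answer: they may not be addresses at all, but the first longword of a ColdFire exception stack frame, which packs the frame format (bits 31:28), fault status (bits 27:26 and 17:16), vector number (bits 25:18), and the saved status register (bits 15:0). A minimal decoder under that assumption, with hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned format;       /* bits 31:28 */
    unsigned fault_status; /* FS[3:2] from bits 27:26, FS[1:0] from bits 17:16 */
    unsigned vector;       /* bits 25:18 */
    uint16_t sr;           /* bits 15:0, status register at exception time */
} cf_frame_word_t;

/* Split the assumed format/vector longword into its fields. */
void cf_decode_frame_word(uint32_t word, cf_frame_word_t *out)
{
    assert(out != 0);

    out->format       = (word >> 28) & 0xFu;
    out->fault_status = (((word >> 26) & 0x3u) << 2) | ((word >> 16) & 0x3u);
    out->vector       = (word >> 18) & 0xFFu;
    out->sr           = (uint16_t)(word & 0xFFFFu);
}
```

Under this reading, 0x41902210 (the value sitting under timerInterrupt() in both traces) decodes to format 4, vector 100, and SR 0x2210, which is consistent with PIT0 being vector 100. The same function can be applied to 0x41f42200 to see which vector the unexpected frame records.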

Thanks for any help or insight, and I'll update this with more (concise) information when I can rub two brain cells together to do it.
