MC68332 TPU interrupts

VCP · ‎05-07-2023

a) The following TPU module configuration is used in the project:

*M332_TPU_TMCR = 0x1e4c; /* TCR1 time base is selected = 250nsec,
TCR2 = 4usec, emulation mode */
*M332_TPU_TICR = 0x0370; /* Interrupt level is 3, interrupt base vector is 0x70 (TPU base vector) */
*M332_TPU_CIER = 0x0;
*M332_TPU_CFSR0 = 0xeeee; /* Configure TPU Channels 12-15 as QOM */
*M332_TPU_CFSR1 = 0xeee0; /* Configure TPU Channels 9-11 as QOM */
*M332_TPU_CFSR2 = 0x0;
*M332_TPU_CFSR3 = 0x0;
*M332_TPU_HSQR0 = 0x0;
*M332_TPU_HSQR1 = 0x0;
*M332_TPU_HSRR0 = 0x5554; /* QOM output level zero */
*M332_TPU_HSRR1 = 0x0;
*M332_TPU_CPR0 = 0x0;
*M332_TPU_CPR1 = 0x0;
*M332_TPU_CISR = 0x0;

b) The status register is configured as below

"MOVE.W #0x2100,SR"

c) QOM FUNCTIONALITY 'e' selected for CHANNEL 9, 10 and 12

d) Interrupts enabled for CHANNEL 9, 10 and 12 as below

M332_TPU_CIER := "or"(M332_TPU_CIER,16#1600#);
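
As a cross-check of the values above, here is a minimal C sketch of how the CIER mask and the per-channel function codes can be derived. The helper names `tpu_cier_mask` and `tpu_channel_function` are hypothetical and not part of the project code. The sketch assumes the usual MC68332 layout: one CIER bit per channel, and four 4-bit function codes per CFSR register with CFSR0 holding channels 15 to 12.

```c
#include <stdint.h>

/* CIER has one enable bit per TPU channel: bit n enables channel n. */
uint16_t tpu_cier_mask(const uint8_t *channels, unsigned count)
{
    uint16_t mask = 0;
    for (unsigned i = 0; i < count; ++i) {
        mask |= (uint16_t)(1u << (channels[i] & 15u));
    }
    return mask;
}

/* Each CFSR holds four 4-bit function codes: CFSR0 = channels 15..12,
 * CFSR1 = 11..8, CFSR2 = 7..4, CFSR3 = 3..0, with the highest-numbered
 * channel in the top nibble. cfsr[0] is the value written to CFSR0. */
uint8_t tpu_channel_function(const uint16_t cfsr[4], unsigned channel)
{
    unsigned reg = 3u - ((channel & 15u) / 4u);
    unsigned shift = (channel % 4u) * 4u;
    return (uint8_t)((cfsr[reg] >> shift) & 0xFu);
}
```

For channels 9, 10 and 12 the mask works out to 0x1600, which matches the value OR-ed into CIER above, and the CFSR values above give function code 0xE for channels 9 to 15 and 0 for channel 8.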

Code stops execution after a few minutes when TPU interrupts are enabled and the ISRs are executed.

Can anybody please verify and confirm whether the above TPU configuration is correct?

VCP · ‎05-16-2023

1. We have assigned unique IARB values for only SIM, QSM and TPU modules.

2. Regarding "Make sure that your code has "locks" between all variables shared between the ISRs and the mainline code": what does "locks" mean?

3. We have enabled interrupts for 5 TPU channels and assigned the same priority level to all 5 channels.

4. We are running the code from external flash memory. We infer that code execution has stopped because the TPU channels suddenly stop driving their outputs in QOM mode.

5. We have written exception handler routines for all CPU exceptions, but execution has not entered any of them.

6. We have not used any hardware or software watchdog.

TomE · ‎05-16-2023

> what does "locks" mean?

https://en.wikipedia.org/wiki/Race_condition#Example

You will have variables and structures shared between the mainline code and the interrupt service routines. There has to be some form of mutex or "lock" to protect accesses. For a simple OS on a uniprocessor the normal way is to disable all interrupts around accesses in the mainline.

The Coldfire and M68K are a little more complicated as they have multiple interrupt levels in the CPU (and not just in an interrupt controller). "Best practice" here is to have one function that disables interrupts and returns the current level, and a second one that restores the level. Your compiler library should provide a function like this, or you should write one yourself. Something like "asm_set_ipl()" being used here in our FlexCAN code:

    #define CAN_IPL 5
    uint16_t old_ipl = asm_set_ipl( CAN_IPL );
    // Done the way the manual says
    p->regs->msg_buf[RR_RESP_OBJ].ctrl_flags_len = 0x0800; // tx idle
    flexcan_id_to_devid( &p->regs->msg_buf[RR_RESP_OBJ], p_msg->id );
    p->regs->msg_buf[RR_RESP_OBJ].d0123 = *(uint32_t*)&p_msg->d[0];
    p->regs->msg_buf[RR_RESP_OBJ].d4567 = *(uint32_t*)&p_msg->d[4];
    p->regs->msg_buf[RR_RESP_OBJ].ctrl_flags_len = 0x0A00 | p_msg->len;
    volatile int timer = p->regs->timer; // unlock
    asm_set_ipl( old_ipl );

In most places in our code the above would be "old_ipl = asm_set_ipl(7)" to completely disable interrupts, but in this case we're only disabling to the level that the CAN interrupts run at, allowing higher ones to run.
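
For anyone who has to write such a function themselves, here is a minimal sketch of the idea. It is not the actual asm_set_ipl() used above: the name `set_ipl`, the GCC inline assembly and the host-side simulation of SR are all assumptions made for illustration. The interrupt priority mask lives in SR bits 10 to 8, and the code must run in supervisor mode to touch it.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__m68k__)
/* Target build: read and write the real status register (supervisor only). */
static inline uint16_t read_sr(void)
{
    uint16_t sr;
    __asm__ volatile ("move.w %%sr,%0" : "=d" (sr));
    return sr;
}
static inline void write_sr(uint16_t sr)
{
    __asm__ volatile ("move.w %0,%%sr" : : "d" (sr) : "cc", "memory");
}
#else
/* Host build: simulate SR so the logic can be exercised off-target. */
static uint16_t simulated_sr = 0x2100;
static inline uint16_t read_sr(void) { return simulated_sr; }
static inline void write_sr(uint16_t sr) { simulated_sr = sr; }
#endif

/* Set the interrupt priority mask (SR bits 10..8) to new_ipl and
 * return the previous mask so the caller can restore it later. */
uint16_t set_ipl(uint16_t new_ipl)
{
    uint16_t sr = read_sr();
    write_sr((uint16_t)((sr & ~0x0700u) | ((new_ipl & 7u) << 8)));
    return (uint16_t)((sr >> 8) & 7u);
}
```

The mainline would then wrap every access to a variable shared with an ISR in `old = set_ipl(level); ... set_ipl(old);`, where `level` is at least the level the ISR runs at.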

If none of your exception handlers are running, then the code is probably "stuck" somewhere in an infinite loop. I'd suggest you write a timer-based interrupt handler running at IPL7 (the one you can't mask out). Have that run periodically and have it print the PC where it was interrupted from. You can usually do this in C by the following, which works on the Coldfire (but probably needs a different offset for M68K with your compiler):

static void soft_wdt_callback( void )
{
    uint32_t stack[1];
    // The magic 9 gives the return address of the next frame up.
    DEBUG( 1, "Got soft WDT during %p", (void *)stack[9] );
}

Basically, hack up a soft watchdog that can tell you where the code is stuck. Or use a debug pod to do the same thing if you have one.

You should be able to find any "wait forever" loops in your code (waiting for something that should happen, but somehow hasn't) and then add some timer or counter based timeouts in them, so they can detect when this happens, tell you about it and recover from it.
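
A bounded polling loop could be sketched as below. The function name `wait_for_bits` and the poll-count limit are assumptions for illustration; on real hardware a free-running timer would give a more predictable timeout than a loop counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Poll a 16-bit register until all bits in mask are set, giving up
 * after max_polls reads. Returns true on success, false on timeout. */
bool wait_for_bits(volatile const uint16_t *reg, uint16_t mask, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; ++i) {
        if ((*reg & mask) == mask) {
            return true;
        }
    }
    return false; /* the caller should report this and recover */
}
```

Replacing each `while (!(reg & bit)) {}` with a call like this, and logging every `false` return, shows which wait is the one that never completes.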

Tom

VCP · ‎05-18-2023

Whether configuring 8 TPU channels to service request 'F' and assigning high priority levels to all 8 TPU channels (which are accumulating pulses in the range of 100 microseconds to 200 microseconds) will lead to any issue in code execution

TomE · ‎05-24-2023

> Whether...

What are you saying there? It isn't a question. If you think the way you've configured it is causing problems, then CHANGE the way you've configured it and see if the problem changes or goes away.

Whatever you do, don't use CPU IPL 7. That will cause all sorts of problems.

Tom

VCP · ‎06-07-2023

Thank you for the suggestion. We have observed that after attaching exception handlers to bus error, address error and others, control enters the exception handler when the code is executed. How can we get information about which part of the code caused this exception? Also, can we suppress all these exceptions?

TomE · ‎06-08-2023

Which exception? I assume you wrote different exception handlers for each interrupt so you'd know which one failed? If not, do that first.

What "debug mechanism" are you using to know that you have got an exception? Are you printing, flashing LEDs, putting something on a screen or what? I can't advise you of exactly what to do next as I don't know what tools and ports you have available. Please detail these. I'll assume you at least have a serial port you can print messages to. If you don't have one of those, BUILD one. Get someone to make a connection to a spare serial port and use that for debugging.

What are you using to develop this code? Are you using CodeWarrior, a GNU compiler set or something different? How are you loading the code into the device?

If you're using CodeWarrior with a debug pod, then it should show you exactly where your CPU is, why it failed and how it got there. IDEs (Integrated Development Environments) have been doing this for decades. If you're using one of these you should learn how to use it for this.

I'm "old school", not by choice, but by the choices of the companies I've worked for. That means compiling the code with a GNU tool chain and compiler, generating ELF then maybe HEX files, loading that into the thing I'm programming, then maybe using a debug pod and GDB to test and debug it. Many times I have to use print statements.

The whole point of exceptions is to let you write code so the device can do something sensible with the error. Maybe recover from it, maybe report it, maybe log it. Desktop computers get exceptions all the time. If you have a program on a desktop (or phone) that crashes, that causes an exception, the operating system catches that, cleans up, tells you about the problem then keeps working normally. Embedded systems usually don't do this, but they should do something with the exceptions to help with the problem you're currently having and ones like it.

So what to do when the CPU has had an exception? The designers of the CPU went to a lot of trouble to make this easy for us. The CPU saves details on what happened (on the stack), and they documented this. For this CPU you should find and read "M68000PM/AD Rev. 1 Programmers Reference Manual (Includes CPU32 Instructions)". Search NXP for "M68000PM". You should also have "MC68332UM.pdf". The latter one is the obvious place to start, but that doesn't detail the exception frames. For that you're meant to be reading Section 6 in the CPU32 manual "CPU32RM.pdf".

This stuff dates from 1995, so you can't expect to find YouTube videos, or for ChatGPT to tell you how to do this :-).

OK, so assuming you have a way to print to a serial port or something once your code has hit one of these exceptions, you want to "dump the stack frames". In the exception handler (that you're writing), print addresses and data (16 bit words) in hex from the current value of the stack pointer all the way back to where the stack was originally initialised. Also dump all the registers (8 data and 8 address) as they may contain pointers that are wrong (null pointer errors and so on).

With practice you should be able to "read that by eye" and recognise addresses for data and for the functions that the code was running. You need a MAP file from the compiler (giving function and data addresses) or preferably a complete annotated dump of the program. For GNU that's "objdump -S" (disassembly with source intermixed), but you'll have to find the equivalent for what you're using.

First you find the "Format and Vector Offset" word in the stack dump, and then you decode the exception frame using the format code and the details in "M68000PM". That will tell you what the instruction was that blew up, and maybe how. That might be enough to show the mistake.
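
As a rough sketch of that first decoding step, the start of the frame can be picked apart as below. This assumes the CPU32 layout in which the frame begins with the saved SR word, then the 32-bit PC, then the Format and Vector Offset word (format in the top four bits, vector offset in the low twelve); check it against the manual for your part. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint16_t sr;            /* status register at the time of the exception */
    uint32_t pc;            /* stacked program counter */
    uint8_t  format;        /* frame format code, bits 15..12 */
    uint16_t vector_offset; /* bits 11..0 */
    uint16_t vector_number; /* vector offset divided by 4 */
} exc_frame_head;

/* sp points at the first 16-bit word of the exception frame,
 * i.e. the stack pointer value on entry to the handler. */
exc_frame_head decode_frame_head(const uint16_t *sp)
{
    exc_frame_head h;
    h.sr = sp[0];
    h.pc = ((uint32_t)sp[1] << 16) | sp[2];
    h.format = (uint8_t)(sp[3] >> 12);
    h.vector_offset = (uint16_t)(sp[3] & 0x0FFFu);
    h.vector_number = (uint16_t)(h.vector_offset / 4u);
    return h;
}
```

For example, the stacked words 0x2100, 0x0004, 0x1234, 0xC008 decode to SR 0x2100, PC 0x00041234, format 0xC and vector offset 0x008, which is vector 2 (bus error). The format code then tells you how many more words belong to the frame.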

Then you can decode the rest of the calling stack. The format depends on your tool chain's "ABI". Different ones put different things (link registers and so on) in there. You should be able to recognise code addresses in there to find how the code got to where it failed.

The worst that can happen is that the stack pointer got corrupted and isn't pointing to where the stack was. You should write code to recognise that and tell you that's what happened. You can't "dump the stack from the stack pointer" if that happened, but you might be able to dump where the stack was meant to be.
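
A plausibility check of that kind might be sketched as follows. The function name and the idea of passing in the stack bounds (normally taken from linker symbols) are assumptions; the check only tests that the stack pointer is word-aligned and lies inside the region reserved for the stack.

```c
#include <stdbool.h>
#include <stdint.h>

/* stack_limit is the lowest address reserved for the stack and
 * stack_top is the initial stack pointer (the stack grows downwards). */
bool sp_is_plausible(uintptr_t sp, uintptr_t stack_limit, uintptr_t stack_top)
{
    if ((sp & 1u) != 0) {
        return false; /* an odd stack pointer is never valid here */
    }
    return sp >= stack_limit && sp <= stack_top;
}
```

If the check fails, dump from `stack_limit` to `stack_top` instead of from the current stack pointer.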

You probably shouldn't call any existing "print" routines to dump the stack as the system may not be stable enough for that. Especially if you have interrupt driven serial port code, you can't reliably call that from an exception handler. You may have to write a small custom "dump memory in hex to the serial port a byte at a time" function for that. That's beginner-level programming, but Google can probably find you an example to save some time.
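
To save a search, here is one possible sketch of such a dump routine. All names are hypothetical, and `put_char` stands for whatever polled (non-interrupt) "send one byte to the UART" routine you write for your hardware; nothing here depends on a C library print function.

```c
#include <stddef.h>
#include <stdint.h>

/* Write `digits` uppercase hex digits of `value`; returns the next free position. */
static char *put_hex(char *out, uint32_t value, unsigned digits)
{
    static const char hex[] = "0123456789ABCDEF";
    while (digits > 0) {
        --digits;
        *out++ = hex[(value >> (digits * 4u)) & 0xFu];
    }
    return out;
}

/* Format "AAAAAAAA: WWWW WWWW ...\r\n" for up to 8 words into `line`
 * (at least 52 bytes). Returns the length, not counting the NUL. */
size_t format_dump_line(char *line, uint32_t addr, const uint16_t *words, unsigned count)
{
    char *p = put_hex(line, addr, 8);
    *p++ = ':';
    for (unsigned i = 0; i < count && i < 8; ++i) {
        *p++ = ' ';
        p = put_hex(p, words[i], 4);
    }
    *p++ = '\r';
    *p++ = '\n';
    *p = '\0';
    return (size_t)(p - line);
}

/* Dump the words in [start, end), one character at a time. */
void dump_words(void (*put_char)(char), const uint16_t *start, const uint16_t *end)
{
    char line[52];
    while (start < end) {
        size_t remaining = (size_t)(end - start);
        unsigned n = remaining < 8 ? (unsigned)remaining : 8u;
        format_dump_line(line, (uint32_t)(uintptr_t)start, start, n);
        for (const char *c = line; *c != '\0'; ++c) {
            put_char(*c);
        }
        start += n;
    }
}
```

Calling `dump_words(uart_put, sp, stack_top)` from the exception handler, with `uart_put` being your own polled output routine, prints the stack as address-prefixed rows of 16-bit words.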

Have you read the "Engineering Bulletins" here, specifically "Generating Interrupts on the TPU" and the "Using Bus Error Stack Frames..." ones? The latter one shows how to read and understand Bus Error Exception Frames in order to find program faults. You should also read the Errata documents.

https://www.nxp.com/products/processors-and-microcontrollers/legacy-mpu-mcus/32-bit-coldfire-mcus-mp...

Tom

TomE · ‎05-12-2023

This looks to be more of a problem with the interrupt service routines rather than something simple with the TPU setup. Make sure you have unique IARB values. Getting that wrong can cause intermittent failures. Make sure that your code has "locks" between all variables shared between the ISRs and the mainline code. Make sure your code can handle interrupts interrupting other interrupts (if you're using different levels).

What sort of "stops execution" is happening? Do you have a debug pod that can trap exceptions and see where the code is? Do you have all CPU exceptions trapped, and indicating somehow that they have happened? Do you have a hardware or software watchdog?

Tom