Interrupts can completely stall main loop even when interrupt load is only 50%

simmania
Contributor IV

I had a problem that cost me weeks to track down. I would like to share my findings, as other developers may run into the same issue.

At the end of this post you will find the listing of a very simple program that I used to demonstrate the issue. I'm using the MIMXRT1010-EVK board with the MCUXpresso IDE for development.

The code includes an interrupt routine that is called by a PIT timer at a rate of 333 kHz. Yes, that is an unusually high rate, but it is needed to show the problem easily.
The execution time of the interrupt routine (the interrupt load) depends on a global variable that is set by the main loop.
While the USER_BUTTON (SW4) on the board is pressed, the interrupt routine falls back to the default minimal load.
During the interrupt routine the IRQPIN is set high, so an oscilloscope shows when the routine is executing.
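
To put the 333 kHz rate into CPU terms, here is a minimal sketch of the cycle budget per timer period. The 3 microsecond period comes from the listing; the 500 MHz core clock is my assumption about the board's clock setup, and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Assumption: 500 MHz core clock; substitute the value from your clock setup. */
#define ASSUMED_CORE_CLOCK_HZ 500000000u

/* CPU cycles available in one timer period of period_us microseconds. */
static uint32_t cycles_per_period(uint32_t core_hz, uint32_t period_us)
{
    return (uint32_t)(((uint64_t)core_hz * period_us) / 1000000u);
}

/* Cycles left for the main loop when the ISR uses load_percent of each period. */
static uint32_t main_loop_cycles(uint32_t period_cycles, uint32_t load_percent)
{
    uint32_t isr_cycles = (uint32_t)(((uint64_t)period_cycles * load_percent) / 100u);
    return period_cycles - isr_cycles;
}
```

With these assumed numbers, cycles_per_period(ASSUMED_CORE_CLOCK_HZ, 3u) gives 1500 cycles per period, and a 50% interrupt load leaves 750 cycles per period for the main loop.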

The main loop does the following:

loop:
- wait for a key press (digit 1 to 9)
- pulse the MAINPIN as an oscilloscope trigger
- increase the interrupt load (by writing the global variable) according to the pressed digit
- flash the LED a few times while toggling the MAINPIN at a high rate
- clear the global variable, which returns the interrupt load to minimal

I connected an oscilloscope to display the status of the IRQPIN (yellow top trace) and MAINPIN (blue bottom trace). The oscilloscope triggers on the blue bottom trace and the trigger point is at the 4th division from the left.

I attached a picture including some photos from the oscilloscope.

Photo 1 shows what we would expect:
At first the load of the interrupt routine is low. The yellow trace is high only for a short time on each interrupt call.
Then we press the '3' key. The blue trace shows the trigger pulse and starts to toggle immediately after it.
The load of the interrupt routine increases to about 50%, so there is plenty of time left for the main loop.
Of course there is no toggling of the blue trace during the interrupts (yellow trace high). All perfectly as expected.

But the first time a key is pressed after board reset, things are a bit different. See photo 2.
When the key is pressed, we see the initial pulse on the blue trace again. Then it takes some time before the interrupt load is increased and the blue trace starts to toggle.
I'm pretty sure this is because the code for the main loop is not yet in the cache and has to be read from external memory first.
Not a problem, just a small delay in the main loop. The next time the key is pressed it will be as shown in photo 1.

But sometimes the main loop does not continue after the key press! See photo 3. After the key press, we see the initial trigger pulse on the blue trace and the interrupt load increases. But then there is no more activity from the main loop! It stalls!
If we now press the USER_BUTTON, the interrupt load switches to minimal and the main loop starts executing again (the blue trace is toggling). This is shown in photo 4.

So what is going on here? This is my speculation:
After the main loop has increased the interrupt load, it needs to read from external memory because the next code to execute is not in the cache. But before this read has finished, a new interrupt occurs, and this probably aborts the read.
After the interrupt the main loop wants to execute again and restarts the external memory read. But again, before it has finished, a new interrupt occurs and aborts the read.
Maybe an NXP expert can comment on whether this is indeed what is going on.
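
The speculated mechanism can be modelled on a PC with a small sketch. This is only a toy model of my guess, not a description of how the hardware actually behaves: each period the ISR is busy for a while, the main loop gets the rest, and a pending fetch either restarts after every interrupt or keeps its progress. All names and the cycle counts used with it are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model: returns the number of timer periods until a fetch that needs
 * fetch_cycles of main-loop time completes, or -1 if it does not complete
 * within max_periods.
 *   restart_on_irq = true  : every interrupt discards the fetch progress
 *   restart_on_irq = false : progress is kept across interrupts
 */
static int32_t periods_until_fetch_done(uint32_t period_cycles,
                                        uint32_t isr_cycles,
                                        uint32_t fetch_cycles,
                                        bool restart_on_irq,
                                        uint32_t max_periods)
{
    if (isr_cycles >= period_cycles) {
        return -1; /* ISR uses the whole period, main loop never runs */
    }
    uint32_t gap = period_cycles - isr_cycles; /* main-loop time per period */
    uint32_t progress = 0;

    for (uint32_t n = 1; n <= max_periods; n++) {
        if (restart_on_irq) {
            progress = 0; /* the interrupt aborted the previous attempt */
        }
        progress += gap;
        if (progress >= fetch_cycles) {
            return (int32_t)n;
        }
    }
    return -1;
}
```

With an assumed period of 1500 cycles, 750 cycles of ISR time (50% load) and a fetch that needs 1000 cycles, the restarting model never finishes (returns -1), while the model that keeps its progress finishes in the second period. With only 100 cycles of ISR time both finish in the first period, which matches the main loop resuming once the load drops to minimal.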

Takeaway:
I think the important takeaway is that when code is executed from external memory, you should make sure that there is, often enough, sufficient time between interrupts for the external memory read to complete. In my situation I need more than 1000 CPU clocks between interrupts for the main loop to be able to continue.

Notes:
- using higher interrupt loads (pressing higher number keys) makes the stalling more likely
- the effect can change (and even disappear?) when the code is changed, even in places that would normally have no effect. I think this is because such changes shift where the main loop sits within the cache lines.
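
To illustrate the second note, here is a small sketch that counts how many cache lines a piece of code touches depending on its start address. It assumes the 32-byte cache line size of the Cortex-M7; the function name and the addresses in the usage note are made up for illustration.

```c
#include <stdint.h>

/* Assumption: 32-byte cache lines, as on the Cortex-M7 L1 caches. */
#define CACHE_LINE_BYTES 32u

/* Number of cache lines touched by size_bytes of code starting at start_addr. */
static uint32_t cache_lines_spanned(uint32_t start_addr, uint32_t size_bytes)
{
    if (size_bytes == 0u) {
        return 0u;
    }
    /* 64-bit arithmetic avoids wrap-around near the top of the address space. */
    uint64_t first = (uint64_t)start_addr / CACHE_LINE_BYTES;
    uint64_t last = ((uint64_t)start_addr + size_bytes - 1u) / CACHE_LINE_BYTES;
    return (uint32_t)(last - first + 1u);
}
```

For example, 8 bytes of code at the illustrative address 0x60002018 fit in one line, but the same 8 bytes at 0x6000201C straddle two lines. A change elsewhere that shifts the code by only 4 bytes could therefore add a line fill from external memory.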

I would like to know from an NXP expert whether these findings are correct and whether there is a good way to deal with this. Of course we could put the main loop in internal RAM, but that is not always possible because of its limited size. I would also like to know what the minimum time between interrupts needs to be for reliable and robust execution.
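
For reference, this is roughly what moving a hot loop to internal RAM could look like, using the same __RAMFUNC(RAM2) macro as the interrupt routine in the listing below. It is a sketch under assumptions: the function name and the register-pointer interface are hypothetical, RAM2 must exist in the project's memory configuration, and the empty fallback macro is only there so the code also compiles outside MCUXpresso.

```c
#include <stdint.h>

/* Use the MCUXpresso section macros when the header is available. */
#if defined(__has_include)
#  if __has_include(<cr_section_macros.h>)
#    include <cr_section_macros.h>
#  endif
#endif
#ifndef __RAMFUNC
#define __RAMFUNC(bank) /* fallback: no special placement */
#endif

/*
 * Hypothetical hot loop placed in RAM: sets and clears the bits in mask
 * at *reg, count times. It calls no other functions, so no instruction
 * fetch from external memory is needed while it runs.
 */
__RAMFUNC(RAM2) void toggle_burst(volatile uint32_t *reg, uint32_t mask, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        *reg |= mask;  /* pin high */
        *reg &= ~mask; /* pin low */
    }
}
```

Note that a function placed in RAM only avoids external memory fetches if everything it calls is also in RAM or inlined. Any helper that is not inlined and still lives in flash brings the fetches back.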

#include "fsl_debug_console.h"
#include "pin_mux.h"
#include "clock_config.h"
#include "board.h"
#include "fsl_pit.h"
#include <cr_section_macros.h>

#define DEMO_PIT_BASEADDR PIT
#define DEMO_PIT_CHANNEL  kPIT_Chnl_0
#define PIT_LED_HANDLER   PIT_IRQHandler
#define PIT_IRQ_ID        PIT_IRQn
/* Get source clock for PIT driver */
#define PIT_SOURCE_CLOCK CLOCK_GetFreq(kCLOCK_OscClk)
#define LED_INIT()       USER_LED_INIT(LOGIC_LED_OFF)

volatile uint32_t g_delay = 0;
volatile uint32_t g_testVal = 0;

int main(void)
{
    pit_config_t pitConfig;

    BOARD_ConfigMPU();
    BOARD_InitBootPins();
    BOARD_InitBootClocks();
    BOARD_InitDebugConsole();

    CLOCK_EnableClock(kCLOCK_Gpio1);
    CLOCK_SetMux(kCLOCK_PerclkMux, 1U);
    CLOCK_SetDiv(kCLOCK_PerclkDiv, 0U);
    LED_INIT();

    PIT_GetDefaultConfig(&pitConfig);
    PIT_Init(DEMO_PIT_BASEADDR, &pitConfig);
    PIT_SetTimerPeriod(DEMO_PIT_BASEADDR, DEMO_PIT_CHANNEL, USEC_TO_COUNT(3, PIT_SOURCE_CLOCK));
    PIT_EnableInterrupts(DEMO_PIT_BASEADDR, DEMO_PIT_CHANNEL, kPIT_TimerInterruptEnable);

    EnableIRQ(PIT_IRQ_ID);

    /* more or less '*' chars may influence the behaviour */
    PRINTF("\r\n\r\n*************************************************\n\r");

    /* Start PIT timer */
    PIT_StartTimer(DEMO_PIT_BASEADDR, DEMO_PIT_CHANNEL);

    uint32_t dummy = 0; /* initialised: incrementing an indeterminate value is undefined behaviour */

	while(true)
	{
		char ch;
		PRINTF("Hit key 1..9: ");
		ch = GETCHAR();
		PUTCHAR(ch);
		PUTCHAR('\n');
		PUTCHAR('\r');

		switch (ch) {
			case '1':
			case '2':
			case '3':
			case '4':
			case '5':
			case '6':
			case '7':
			case '8':
			case '9':
				{
					// generate a first pulse on the MAINPIN to trigger the scope
					GPIO_PinWrite(BOARD_INITPINS_MAINPIN_GPIO, BOARD_INITPINS_MAINPIN_GPIO_PIN, 1U);
					GPIO_PinWrite(BOARD_INITPINS_MAINPIN_GPIO, BOARD_INITPINS_MAINPIN_GPIO_PIN, 0U);
					// set the interrupt routine load according to the pressed key
					g_delay = (ch-'0')*7;
					// toggle the LED 5 times and also toggle the MAINPIN at a high rate
					for (int i=0 ; i<5 ; i++) {
						GPIO_PinWrite(BOARD_USER_LED_GPIO, BOARD_USER_LED_GPIO_PIN, 1U);
						for (int j=0 ; j<200000 ; j++) {
							GPIO_PinWrite(BOARD_INITPINS_MAINPIN_GPIO, BOARD_INITPINS_MAINPIN_GPIO_PIN, 1U);
							dummy++;
							dummy++;
							GPIO_PinWrite(BOARD_INITPINS_MAINPIN_GPIO, BOARD_INITPINS_MAINPIN_GPIO_PIN, 0U);
							dummy++;
							dummy++;
						}
						GPIO_PinWrite(BOARD_USER_LED_GPIO, BOARD_USER_LED_GPIO_PIN, 0U);
						for (int j=0 ; j<200000 ; j++) {
							GPIO_PinWrite(BOARD_INITPINS_MAINPIN_GPIO, BOARD_INITPINS_MAINPIN_GPIO_PIN, 1U);
							dummy++;
							dummy++;
							GPIO_PinWrite(BOARD_INITPINS_MAINPIN_GPIO, BOARD_INITPINS_MAINPIN_GPIO_PIN, 0U);
							dummy++;
							dummy++;
						}
					}
					g_delay = 0;
				}
				break;
		}
	}

}

__RAMFUNC(RAM2) void PIT_LED_HANDLER(void)
{
	GPIO_PinWrite(BOARD_INITPINS_IRQPIN_GPIO, BOARD_INITPINS_IRQPIN_GPIO_PIN, 1U);

    /* Clear interrupt flag.*/
    PIT_ClearStatusFlags(DEMO_PIT_BASEADDR, DEMO_PIT_CHANNEL, kPIT_TimerFlag);

    /* dummy load if requested */
    if (GPIO_PinRead(BOARD_INITPINS_USER_BUTTON_GPIO, BOARD_INITPINS_USER_BUTTON_GPIO_PIN)) {
    	if (g_delay) {
			for (uint32_t i=0 ; i<g_delay ; i++) g_testVal++;
    	}
    }

    SDK_ISR_EXIT_BARRIER;

    GPIO_PinWrite(BOARD_INITPINS_IRQPIN_GPIO, BOARD_INITPINS_IRQPIN_GPIO_PIN, 0U);
}

diego_charles
NXP TechSupport

Hi @simmania 

Thank you for reaching out. 

This topic is interesting. I will need some time to reproduce it on my bench and see if I can confirm your observations.

Diego

simmania
Contributor IV

I am really looking forward to your findings!