This is a fuzzy topic, so I think you will need to try a few different things to find a solution.
Switching the clock to the 12 MHz IRC source:
// Switch BASE_M4_CLOCK from whatever clock to IRC
LPC_CGU->BASE_M4_CLK = (0x01 << 11) | (0x01 << 24) ; // Autoblock Enable, Set clock source to IRC
LPC_CGU->PLL1_CTRL |= 1; // Power down PLL1
LPC_CGU->PLL0USB_CTRL |= 1; // Power down PLL0USB
LPC_CGU->PLL0AUDIO_CTRL |= 1; // Power down PLL0AUDIO
// Further preparations
LPC_CREG->FLASHCFGA |= 0x0000F000; // Set maximum wait states for internal flash
LPC_CREG->FLASHCFGB |= 0x0000F000;
This can be done on the fly; the switchover point between the high frequency and the 12 MHz clock is controlled internally. Doing this under debugger control may not work (as you can see).
What's really interesting is the behaviour with delay = 300. I can't explain it. In the past I have seen strange effects from compilers; sometimes, at a high optimization level such as O3, code was optimized away completely.
It's also very important where the code resides when it's executed. A loop is sometimes so short that it fits completely into the buffer of the flash accelerator, which you can think of as a cache. I have sometimes surrounded such a delay with GPIO on/off commands to see on an oscilloscope how long the loop really takes.
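A minimal sketch of that GPIO measurement, written so it also builds on a PC. The names probe_pin_write, mock_gpio_pin and mock_gpio_edges are made up for illustration; on the real board probe_pin_write would write the set/clear register of a spare pin, and the edge counter would not exist.

```c
#include <stdint.h>

/* Hypothetical stand-in for a GPIO output register. On real hardware this
   would be the memory-mapped register of a spare pin watched on the scope. */
static volatile uint32_t mock_gpio_pin;
static uint32_t mock_gpio_edges; /* counts level changes, host testing only */

static void probe_pin_write(uint32_t level)
{
    if (mock_gpio_pin != level) {
        mock_gpio_edges++;
    }
    mock_gpio_pin = level;
}

/* Raise the pin, run the delay loop, lower the pin again.
   The pulse width on the oscilloscope is the real duration of the loop. */
static void timed_delay(uint32_t delay)
{
    volatile uint32_t d; /* volatile so the empty loop is not removed */

    probe_pin_write(1u);
    for (d = 0u; d < delay; d++) {
    }
    probe_pin_write(0u);
}
```

Calling timed_delay(300) once at each optimization level and comparing the pulse widths should show whether the compiler or the flash accelerator changes the timing.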
You should do a test without code optimization and maybe also with a different delay implementation:
for (d = 0; d < delay; d++) // no semicolon here, otherwise the loop body is empty
{
__NOP(); // intrinsic for ARM CC, this can't be optimized away
__NOP();
}
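For a GCC-style compiler the same loop could look roughly like the sketch below. The function name delay_nops and its return value are my own additions for illustration; the return value only exists so the iteration count can be checked in a quick test.

```c
#include <stdint.h>

/* Delay loop built from volatile inline assembly. The compiler must emit
   every "nop", so the loop is kept even at high optimization levels. */
static uint32_t delay_nops(uint32_t delay)
{
    uint32_t d;

    for (d = 0u; d < delay; d++) {
        __asm volatile ("nop");
        __asm volatile ("nop");
    }
    return d; /* number of iterations actually executed */
}
```

This only guarantees that the instructions are executed, not how long they take; the duration still depends on the clock and on where the code runs from.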
Before you write the reset bit, let the M4 core and the bus system finish everything:
__ISB();
__DSB();
// Write reset bit
For MCUXpresso with the GCC compiler the instructions look this way:
__asm volatile ("nop\n");
__asm volatile ("isb\n");
__asm volatile ("dsb\n");
If you perform the chip reset from the M0APP core, then you should stall the M4, maybe like this:
1) Issue an interrupt to the Cortex-M4 side
2) In the ISR on the M4 side, execute the ISB and the DSB and then go into a while(1);
3) Wait long enough on the M0 side to let the M4 do this, and then write the CORE_RST bit in the reset control register
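The three steps above can be sketched as follows. Note that this is a variation, not exactly what I described: instead of a fixed wait, the M0 polls a shared memory word that the M4 sets in its ISR. The mailbox word, M4_PARKED_MAGIC, all function names and the reset_reg/reset_mask parameters are assumptions for illustration; take the real register address and the CORE_RST bit position from the user manual.

```c
#include <stdint.h>

#define M4_PARKED_MAGIC 0xA5A5A5A5u /* hypothetical handshake value */

/* M4 side: call this in the ISR after the ISB/DSB, right before while(1). */
static void m4_announce_parked(volatile uint32_t *mailbox)
{
    *mailbox = M4_PARKED_MAGIC;
}

/* M0 side: poll the shared word, give up after max_polls attempts.
   Returns 1 if the M4 reported that it is parked, 0 on timeout. */
static int m0_wait_for_m4(volatile uint32_t *mailbox, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if (*mailbox == M4_PARKED_MAGIC) {
            return 1;
        }
    }
    return 0;
}

/* M0 side: write the reset bit only after the M4 has been parked.
   reset_reg and reset_mask are placeholders supplied by the caller. */
static int m0_reset_when_m4_parked(volatile uint32_t *mailbox,
                                   volatile uint32_t *reset_reg,
                                   uint32_t reset_mask,
                                   uint32_t max_polls)
{
    if (!m0_wait_for_m4(mailbox, max_polls)) {
        return 0; /* M4 never answered, reset not written */
    }
    *reset_reg = reset_mask;
    return 1;
}
```

What to do on a timeout is left to the caller here; whether to reset anyway or to report an error depends on the application.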
In general you should think carefully about what can happen to the other side if you perform a system kill action from one core. The two cores execute asynchronously but share the same bus resources. In principle it would be better to delegate this chip reset task to the M4 core, because you can keep the M0APP in reset state while you're working on the wind back and the CORE_RST.
Regards,
Bernhard.