lpc1769: long time to return from interrupt

lpcware · ‎06-15-2016

Content originally posted in LPCWare by PhysicsGuy on Fri Mar 01 13:38:34 MST 2013
Dear forum users,

this is my first post here.

I have a little problem with Timer0 interrupt handling on an LPCXpresso board with an LPC1769 on it.
My interrupt handler does the following. I set the timer such that it counts down from a certain value and raises an interrupt when it reaches zero. In the handler, I reset the interrupt, load a new value into the timer and do some other stuff. In one application, this other stuff is just loading a word from memory and outputting that word on the pins of GPIO0. This works nicely, and if I run -O3 optimized code I can achieve a minimum time between events of 1 microsecond, which is just about good enough. I can also somewhat understand it, because it is twice the time it seems to take the LPC to switch from main code to the ISR. Does anyone know by now where this 500 ns latency comes from? I have followed several threads about this and no one seems to have resolved this issue.

In another application, instead of just clocking out one value, I program a 16-bit DAC through GPIO0. Allowing for the time delays required by the part, this takes about one microsecond, as measured by monitoring a GPIO pin that goes high upon entering the ISR and low at the end of the ISR. This works OK when I set the timer value such that the time between timer interrupts is about 4 microseconds or longer, but if I set the timer such that the period is less than ~2.6 microseconds, the period becomes unstable and fluctuates around 2.6 microseconds. I would have guessed that the limit should be 2 microseconds (the minimum period in my digital example plus the time it takes to do the extra work to program the DAC). So where is the extra 600 ns coming from?

To specify the system further, I have no other IRQs enabled. The main loop polls the Ethernet MAC for incoming data, which is not there, because the part is on a separate LAN without anything sending on it (no IP over this LAN, only Ethernet packets under my control).

Thanks in advance for any hints.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by PhysicsGuy on Wed Mar 06 05:47:06 MST 2013
Hi guys,

thanks for all your comments, they have been very helpful.
I have found some ways to shave off some time in the handler.
However, the several 100 ns that I was missing actually turn
out to be a different problem. The schematic I was given and
the actual board I was measuring on (which contains the
lpcxpresso as piggy board) do not match, so I was actually
measuring on the wrong pin. Duh!
Let's just say it was good that it happened because I learned something. So thanks again for all your advice!

lpcware · ‎06-15-2016

Content originally posted in LPCWare by micrio on Sun Mar 03 10:05:24 MST 2013
You both are right but there is a subtle distinction here.

If your interrupt handler can fit all its local variables in the few registers
that are automatically saved, R0-R3, R12, LR, PC and PSR, then the
interrupt's overhead is minimized.   The system will automatically
push those registers, there is no way to minimize that.   This interrupt
handler will execute at maximum speed and have the minimum
entry and exit overhead.

If you use more local variables but still within the limit imposed by the
number of registers then the interrupt handler will execute at maximum speed.
However, there will be more entry and exit overhead because these
additional registers must be protected by pushing them.   This interrupt
handler will execute at maximum speed but will have less than optimum
entry and exit overhead.

If you use more local variables than will fit in the registers then the
routine will execute slower.   Access to data held in registers is faster
than data held on the stack or RAM.   These variables on the stack
do not increase the entry and exit overhead.   This interrupt handler
will execute more slowly and will have less than optimum
entry and exit overhead.

The fastest interrupt handler will attempt to keep all local variables in
the few automatically pushed registers.   Otherwise there is a performance
price to pay.
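
As a rough illustration of the first case, here is a minimal sketch of a leaf handler with only two short-lived locals. Everything in it is hypothetical: `sample_table`, `sample_index` and `mock_port` are stand-ins invented for this example (on real hardware `mock_port` would be a GPIO output register), and whether the compiler really stays within the automatically saved registers can only be confirmed by reading the generated assembly.

```c
#include <stdint.h>

#define TABLE_LEN 4u  /* power of two so the index can wrap with a mask */

/* Hypothetical data prepared by the main loop (names are assumptions). */
uint32_t sample_table[TABLE_LEN] = { 0x11u, 0x22u, 0x33u, 0x44u };
uint32_t sample_index;

/* Stand-in for a GPIO output register so the sketch compiles on a host. */
volatile uint32_t mock_port;

/* Leaf handler: calls no other function and uses two locals, so a
   compiler has a good chance of keeping everything in R0-R3 and R12. */
void lean_handler_sketch(void)
{
    uint32_t i = sample_index;        /* local 1: current position   */
    uint32_t word = sample_table[i];  /* local 2: value to output    */

    mock_port = word;                               /* output the word   */
    sample_index = (i + 1u) & (TABLE_LEN - 1u);     /* advance and wrap  */
}
```

If the assembly for a handler like this starts with a push of any of R4-R11, the handler has fallen into the second case described above.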

Pete.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by frame on Sun Mar 03 08:25:23 MST 2013

Quote:
Apparently a certain amount of registers are pushed automatically, but if my handler
(which I wrote in C) uses more than those, it could be that it is pushing more stuff. This could be happening,
because I'm declaring a couple of variables in the handler, that the compiler could put in registers.
This could be the cause of the additional delay.

No.
An interrupt happens asynchronously, and interrupts the main code. That context is saved.
If you are interested in those details, fetch the appropriate documents from the ARM infocenter webpage,
especially concerning the ARM ABI. These documents also explain the parameter passing in registers and on the stack.

Variables declared in interrupt routines do not get pushed/popped upon interrupt entry or exit.
As an interrupt handler does (by definition) not take any parameters, nothing enters or leaves the handler this way.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by cfb on Sat Mar 02 17:17:38 MST 2013

Quote: PhysicsGuy
This could be happening, because I'm declaring a couple of variables in the handler, that the compiler could put in registers. This could be the cause of the additional delay.

I have just been working on exactly those sorts of optimisations in our Oberon compiler. In our case the best results are achieved when there are no local variables and the interrupt handler is a leaf procedure (i.e. doesn't call any other procedures itself). The C compiler uses registers differently. Look at the assembler code that is generated by your application to see exactly what is happening. R0-R3, R12, LR, PC and PSR are automatically saved and restored - you can't do anything to stop that from happening. If R4-R11 are used they will be saved as well. If you can avoid their use you should see some improvement.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by PhysicsGuy on Sat Mar 02 16:52:40 MST 2013
Thanks guys, for the quick response. I skimmed through the book you mentioned and found the bit about how the low latency is achieved in the Cortex-M3 core. Apparently a certain amount of registers are pushed automatically, but if my handler (which I wrote in C) uses more than those, it could be that it is pushing more stuff. This could be happening, because I'm declaring a couple of variables in the handler, that the compiler could put in registers. This could be the cause of the additional delay.

When I'm back behind my desk, I will try to figure out what the compiler actually makes out of my handler.

lpcware · ‎06-15-2016

Content originally posted in LPCWare by frame on Sat Mar 02 08:56:43 MST 2013
The Cortex M interrupt latency is 12 cycles at minimum. The only exception is the M4F,
when stacking of floating point registers is enabled.
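
To put the 12 cycles into the units used in the question, here is a small helper that converts a cycle count into nanoseconds. The core clock is not stated in this thread, so the 100 MHz and 120 MHz figures below are assumptions, and `cycles_to_ns` is a name made up for this sketch.

```c
#include <stdint.h>

/* Convert a number of core clock cycles to nanoseconds, rounded to the
   nearest nanosecond. core_clock_hz is an assumed value supplied by the
   caller; the thread does not state the actual clock. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t core_clock_hz)
{
    uint64_t scaled = (uint64_t)cycles * 1000000000ULL;  /* cycles * 1e9 */
    return (uint32_t)((scaled + core_clock_hz / 2u) / core_clock_hz);
}
```

With an assumed 100 MHz core clock, `cycles_to_ns(12, 100000000)` gives 120, and with 120 MHz it gives 100. Either way the bare entry latency is well below the 500 ns reported, which at 100 MHz would correspond to 50 cycles.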

Possibly you could have a case of an auto-tailchaining interrupt.
The general recommendation is not to reset the pending flag as the last thing in the interrupt routine,
but to do this earlier. Otherwise this instruction might not yet have made it through the pipeline upon
interrupt exit, causing the interrupt to fire again.
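
A minimal sketch of that ordering, written against a mock register block so it compiles on a host. The struct, the register names `IR` and `MR0`, the write-1-to-clear behaviour and the variables `next_period` and `next_word` are all assumptions for illustration; on real hardware the vendor header's timer definitions would be used instead.

```c
#include <stdint.h>

/* Hypothetical stand-in for a timer register block (assumption). */
typedef struct {
    volatile uint32_t IR;   /* interrupt flags, assumed write-1-to-clear */
    volatile uint32_t MR0;  /* match/reload value for the next period    */
} mock_timer_t;

mock_timer_t mock_tim0;
volatile uint32_t mock_gpio_out;   /* stand-in for a GPIO output register */
uint32_t next_period = 100u;       /* hypothetical values set by main()   */
uint32_t next_word = 0x1234u;

void timer0_handler_sketch(void)
{
    mock_tim0.IR = 1u;             /* 1. acknowledge the interrupt FIRST  */
    mock_tim0.MR0 = next_period;   /* 2. program the next period          */
    mock_gpio_out = next_word;     /* 3. do the actual work               */

    /* 4. Optional: reading the flag register back is a commonly used way
          to make sure the clearing write has completed before returning. */
    (void)mock_tim0.IR;
}
```

The point is only the order: the acknowledge happens at the top, so the write has the rest of the handler to complete before the exception return.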

lpcware · ‎06-15-2016

Content originally posted in LPCWare by cfb on Fri Mar 01 15:48:31 MST 2013
If you want to get a good understanding of what is going on I recommend that you read Joseph Yiu's book "The Definitive Guide to the ARM Cortex-M3". It includes three chapters (~45 pages) where exceptions, the NVIC and interrupt behaviour are discussed. The section which describes some of the issues related to interrupt latency is the best part of a page long.