Callback Overhead in KL03 running at 48 MHz

filipdossche
Contributor III

Hi,

I have been trying to get an application to work which uses CMP0 to generate interrupt calls to the associated callback.

It is a bare-metal application, so no OS whatsoever. The only things I am using are the code generated by Processor Expert to set up the CMP0 component and the definition of the callback function.

I need to be able to measure the time between successive falling and rising edges with a 1 μs resolution. The shortest period between two edges is about 6 μs, the longest about 100 μs.

The 1 μs resolution is trivial, I use a TPM module set up as a free running counter counting at 1 MHz and read the current value using the "TPM_DRV_CounterRead" function.

I have checked that it works: when I apply a 1 kHz signal to my signal input, it duly measures 500 μs (+/- 1) between each rising and falling edge of the signal.
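
Editor's sketch of the interval calculation described above, assuming a 16-bit free-running up-counter that wraps at 0xFFFF. The helper name `elapsed_ticks` is hypothetical and not part of any NXP driver:

```c
#include <stdint.h>

/* Ticks elapsed between two reads of a free-running 16-bit up-counter.
 * Unsigned subtraction is modulo 2^16, so a single wrap between the
 * two reads is handled without a special case. */
static inline uint16_t elapsed_ticks(uint16_t prev, uint16_t now)
{
    return (uint16_t)(now - prev);
}
```

With a 1 MHz counter clock the result is directly in microseconds, provided the two edges are less than one full counter period (about 65 ms) apart.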

However, when I try to measure the much faster signals I need, the minimum measured difference is about 20 μs.

Other than determining the difference between two edges, my callback function does little else, so I guess that is not the problem.

Therefore I think there is a pretty large amount of overhead in calling the callback function. Does anyone have an idea how much overhead there is on a KL03 running at 48 MHz?

Any help very much appreciated.

filipdossche
Contributor III

Hi,

Thanks for all the info. I'll write my own ISR function and it will probably help a lot.

While testing this I found another weird thing. I have simplified reading the TPM counter value, just to make sure there is no overhead in that either.

I ended up with this bit of test code:

static uint32_t NewCount = 0, PrevCount = 0;

void CheckTimeDifference( void )
{
   NewCount = TPM0_CNT;  // get the latest microsecond counter value
   PrevCount = NewCount;
   NewCount = TPM0_CNT;  // get the latest microsecond counter value
   NewCount = TPM0_CNT;  // get the latest microsecond counter value
   NewCount = TPM0_CNT;  // get the latest microsecond counter value
   NewCount = TPM0_CNT;  // get the latest microsecond counter value
   NewCount = TPM0_CNT;  // get the latest microsecond counter value
   PrevCount = NewCount;
}

When I put a breakpoint on the last line (the final assignment, before it executes), PrevCount holds the first counter value and NewCount holds the last value read.

The intermediate read operations are there only to evaluate the execution speed.

I know for sure that the core clock is at 48 MHz and the bus/flash clock is at 24 MHz.

I am equally sure that my TPM0 counter increments once per microsecond; I have done plenty of testing to be certain.

I would expect these few single-cycle read and store operations (at 48/24 MHz) to take well below 1 microsecond to execute, but strangely enough they always end up taking 3 microseconds.

So I think something is slowing down the processor, but I have no idea what it might be.

xiangjun_rong
NXP TechSupport

Hi, Filip,

Regarding the void CheckTimeDifference( void ) function: the result depends on the TPM counter clock setting and on the assembly code generated for NewCount = TPM0_CNT;. If you set CMOD to 01 (the field called CLKS on FTM) and PS to 000 in the TPM0_SC register, the 48 MHz clock drives both the core and the TPM counter, with no prescaling. The C statement NewCount = TPM0_CNT is not a single instruction in assembly; it compiles to multiple assembly instructions, which you can see in the debugger.

Hope it can help you explain "something is slowing down the processor".
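
Editor's sketch of why one C statement is more than one instruction. The register is replaced by an ordinary variable so the code builds on a host PC; `fake_tpm0_cnt`, `TPM0_CNT_SKETCH` and `read_counter_once` are hypothetical names, and the instruction sequence in the comment is an assumption about typical unoptimized Thumb output, not a listing from this project:

```c
#include <stdint.h>

/* Stand-in for the memory-mapped counter register. On the target,
 * the device header maps TPM0_CNT to a fixed peripheral address. */
static volatile uint32_t fake_tpm0_cnt;
#define TPM0_CNT_SKETCH (*(volatile uint32_t *)&fake_tpm0_cnt)

static uint32_t NewCount;

/* One C statement, several machine steps. A compiler typically emits
 * something like the following (assumed; check your own disassembly):
 *   ldr r3, =<address of TPM0_CNT>   ; literal-pool load from flash
 *   ldr r3, [r3]                     ; read the peripheral register
 *   ldr r2, =<address of NewCount>   ; second literal-pool load
 *   str r3, [r2]                     ; store the value to SRAM
 */
void read_counter_once(void)
{
    NewCount = TPM0_CNT_SKETCH;
}
```

Each of those steps can cost more than one core clock once flash and peripheral bus access times are included, which is why the disassembly is the place to start counting.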

BR

XiangJun Rong

egoodii
Senior Contributor III

Whenever you are trying to count instructions, it is necessary to look at the raw assembly result of such a routine.  Can you give us that?

xiangjun_rong
NXP TechSupport

Hi, Filip,

You are using the KL03. I have checked the KL03 web page, and it seems that the KL03 does not have a DMA module.

In that case, I do not suggest using the callback function to read the TPM_CnV register when a capture event happens: the callback path is lengthy, so it takes a long time. I suggest you write the interrupt service routine yourself instead. In the ISR of the capture event, you only need to clear the TPM status flag and read the TPM_CnV register; the code is simple and takes less time.

Below is the code I developed for the K40, which uses FTM; you can refer to it.

// Assumed declarations (not shown in the original post); the FTM counter is 16 bits wide.
static volatile uint16_t temp, tempPrev, diff;
static volatile uint32_t var0;

void singleCapture(void)
{
    SIM_SCGC6 |= 0x03000000;   // enable FTM0 and FTM1 module clocks
    SIM_SCGC5 |= 0x3E00;       // enable port A/B/C/D/E clocks
    FTM0_SC = 0x00;            // no clock selected while configuring
    FTM0_C0SC |= 0x04;         // capture on rising edge only
    FTM0_COMBINE = 0x00;       // clear
    FTM0_C0SC |= 0x40;         // enable CH0 interrupt
    FTM0_SC |= 0x08;           // CLKS = 01: count from the system clock
    // in the ISR of the capture interrupt, read FTM0_C0V to get the capture value
}

void FTM0_ISR(void)
{
    if (FTM0_STATUS & 0x01)    // CH0 capture event pending
    {
        tempPrev = temp;
        temp = FTM0_C0V;       // read the captured count from the CH0 value register
        diff = temp - tempPrev;
        FTM0_STATUS &= 0xFE;   // clear the CH0 event flag
        // read Hall sensor logic
        // the FTM0_CH0 channel is multiplexed with GPIOC1, read GPIOC1 logic
        var0 = GPIOC_PDIR;
        GPIOC_PTOR = 0x80;     // toggle port C bit 7
        asm("nop");            // set a breakpoint here
    }
}

You have to configure the pin assignment.

Hope it can help you.

BR

XiangJun Rong

filipdossche
Contributor III

Hi Xiangjun,

I implemented it just like you suggested and it makes a world of difference.

I am still not getting the full required performance, though: instead of a minimum time of 20 microseconds between CMP0 interrupts, I am now getting close to 5 microseconds.

As soon as I add some extra functionality to the ISR the minimum interval gets bigger again. To speed things up I have eliminated every sort of abstraction and I am reading peripheral registers directly.
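
Editor's sketch of one common way to keep extra functionality out of the ISR: the ISR only timestamps the edge and queues it, and the main loop does the rest. This is not the poster's code. The registers are replaced by plain variables so it builds on a host PC, and all names (`mock_tpm_cnt`, `mock_cmp_flags`, `cmp0_isr_sketch`, `next_stamp`) are hypothetical; the real flag-clear sequence for CMP0 must be taken from the reference manual.

```c
#include <stdint.h>
#include <stdbool.h>

/* Host stand-ins for hardware registers (hypothetical names). */
static volatile uint32_t mock_tpm_cnt;   /* stands in for TPM0_CNT */
static volatile uint8_t  mock_cmp_flags; /* stands in for the CMP0 edge flags */

#define BUF_SIZE 16u                     /* power of two, so index masking works */
static volatile uint16_t stamps[BUF_SIZE];
static volatile uint8_t  head;           /* written only by the ISR */
static volatile uint8_t  tail;           /* written only by the main loop */

/* Keep the ISR to three steps: timestamp first, acknowledge, store. */
void cmp0_isr_sketch(void)
{
    uint16_t now = (uint16_t)mock_tpm_cnt;   /* read before anything else */
    mock_cmp_flags = 0u;                     /* placeholder for the real flag clear */
    uint8_t next = (uint8_t)((head + 1u) & (BUF_SIZE - 1u));
    if (next != tail) {                      /* drop the sample if the buffer is full */
        stamps[head] = now;
        head = next;
    }
}

/* Called from the main loop: fetches the next timestamp if one is queued. */
bool next_stamp(uint16_t *out)
{
    if (tail == head) {
        return false;
    }
    *out = stamps[tail];
    tail = (uint8_t)((tail + 1u) & (BUF_SIZE - 1u));
    return true;
}
```

The main loop can then subtract consecutive timestamps (modulo 2^16) at its own pace, so the time spent inside the ISR no longer grows with the processing that is added.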

If I could find a solution for the processor performance issue I think I could get something that actually works.

In any case, thanks for the info; at least I am partially there.

mjbcswitzerland
Specialist V

Filip

There are about 40 clocks of overhead for the device to handle an interrupt, so I would expect about +/-2us resolution for sampling to be possible.

However, it is generally recommended to use DMA to capture the samples to a buffer; you will then achieve the hardware's resolution without any potential interrupt response time issues.
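
Editor's sketch of the arithmetic behind these figures, assuming a 48 MHz core clock as stated in the thread. The helper names are hypothetical:

```c
#include <stdint.h>

/* Convert a CPU cycle count to nanoseconds at a given core clock.
 * The 64-bit intermediate avoids overflow for large cycle counts. */
static inline uint32_t cycles_to_ns(uint32_t cycles, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / core_hz);
}

/* CPU cycles available in an interval given in microseconds. */
static inline uint32_t us_to_cycles(uint32_t us, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)us * core_hz) / 1000000u);
}
```

At 48 MHz, 40 clocks come to roughly 0.83 us, and the shortest 6 us edge spacing from the original question leaves 288 core clocks per edge for interrupt entry, the handler body and exit combined.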

Regards

Mark

egoodii
Senior Contributor III

I don't KNOW that the part you mention HAS the connection of CMP0 to the TPM 'capture' input, but I expect Mark is EXPECTING you to be using that function, and hooking the capture events to DMA for 'fast servicing'. You might get some insight into using "dual-edge capture from a comparator feed" in my post on QEI and 'index capture' (although there the repetition rate was not a concern, so no DMA, and the index pass through the comparator was 'undesirable but unavoidable'):

QEI inputs, and index capture