Get tick count (CPU cycle count) with SysTick

concerned12345
Contributor III


Hello,

I want to get timing information from my code to see how long certain algorithms take, and I do not want to do this through the debugger. I have been looking at SysTick and am trying to read the CVR register with the macro SYST_CVR, but I keep getting zero as a result. From what I have read, I am under the impression that by default this register increments every clock cycle. Please let me know whether I need to set up any of the other registers and, if so, which ones and with what values.

Also, if there is another way to get accurate CPU clock cycle counts, please let me know.

I am running a K70 on a TWR-K70F120M rev. B tower board.

printf( "CVR: 0x%X\n", (unsigned int)SYST_CVR );     //Always prints: CVR: 0x0

concerned12345
Contributor III

OK, I figured this out. I wasn't taking into account that the counter counts down, not up. I had the RVR register set to 0x0, and since the CVR register counts down from the RVR value to zero and then reloads RVR (which in my case was zero), I would always read zero.

Proper method:

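SYST_RVR = 0x00FFFFFF;     //SysTick is a 24-bit counter, so this is the largest usable reload value.

SYST_CVR = 0;     //Any write clears the current value, so counting restarts from RVR.

SYST_CSR = 0x7;     //ENABLE | TICKINT | CLKSOURCE (core clock). TICKINT raises the SysTick exception on every wrap; use 0x5 if no handler is installed.

unsigned int nStart = SYST_CVR;

//Do something

unsigned int nStop = SYST_CVR;

unsigned int nChange = (nStart - nStop) & 0x00FFFFFF;     //Start value is normally larger because CVR counts down; the mask covers one wrap through zero.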
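
One caveat with the subtraction above: SysTick is a 24-bit counter on Cortex-M cores, so if the counter passes through zero and reloads between the two reads, the stop value ends up larger than the start value. Below is a minimal sketch of a wrap-tolerant difference. It assumes the reload value is the 24-bit maximum (0x00FFFFFF) and that at most one reload happens between the reads. The names SYSTICK_MASK and systick_elapsed are made up for illustration and do not come from any Freescale header.

```c
#include <stdint.h>

/* SysTick counts with 24 bits; assumes SYST_RVR was set to this maximum. */
#define SYSTICK_MASK 0x00FFFFFFu

/* Hypothetical helper: cycles elapsed between two SYST_CVR reads.
 * The counter counts down, so the start value is normally the larger one.
 * Masking the difference to 24 bits keeps the result correct across a
 * single reload (wrap from 0 back to 0x00FFFFFF). */
static uint32_t systick_elapsed(uint32_t start, uint32_t stop)
{
    return (start - stop) & SYSTICK_MASK;
}
```

On the target this would be called as systick_elapsed(nStart, nStop) with both values read from SYST_CVR. Intervals longer than 2^24 cycles (roughly 140 ms at 120 MHz) cannot be measured this way without also counting reloads.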

vipindas
Contributor I

Hi,

I am using a K60 TWR board, and I have a delay function like this:

void delay()
{
  unsigned int i, n;
  for(i=0;i<1000;i++)
  {
      asm("nop");
  }
}

When I checked the number of CPU cycles it takes, it showed around 3000, which is theoretically correct (the code is optimized for speed, so 1000 nop + 1000 add + 1000 cmp).

Since my CPU runs at 120 MHz, if I set a GPIO before calling this delay and reset it afterwards, I expect

theoretically -> 3000 CPU cycles x 0.0083 microseconds = 25 microseconds

but in practice (by setting and resetting the GPIO) I see around 143 microseconds.

Any idea why there is this difference? Is my core really running at 120 MHz, or is my calculation wrong somewhere?
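
The arithmetic in this post can be checked with a few lines of C. This is only a sketch using the numbers quoted above (3000 cycles, 120 MHz, 143 microseconds measured). The function names are hypothetical and the code runs on any host, not just the K60.

```c
#include <stdint.h>

/* Convert a cycle count to microseconds for a given core clock in Hz. */
static double cycles_to_us(uint32_t cycles, uint32_t core_hz)
{
    return (double)cycles * 1e6 / (double)core_hz;
}

/* Core clock in MHz that would make `cycles` cycles last `measured_us`.
 * Cycles per microsecond is numerically the same as MHz. */
static double implied_core_mhz(uint32_t cycles, double measured_us)
{
    return (double)cycles / measured_us;
}
```

With these numbers, cycles_to_us(3000, 120000000) gives 25 microseconds, matching the estimate above, while implied_core_mhz(3000, 143.0) comes out near 21 MHz. So either the core is running much slower than 120 MHz, or the loop takes far more than 3000 cycles.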

thanks in advance

egoodii
Senior Contributor III

Show us the assembly output for this routine, and tell us whether you are running it from RAM or FLASH and, if FLASH, whether you have the hardware speed-ups enabled.

vipindas
Contributor I

Hi,

Please find the routine below. I am running from RAM.

A question: if I run from flash, how do I enable the hardware speed-ups?

C code

while(1) {
  toggle_led();
  delay();
}

Assembly routine

          delay:
1fff02b6:   nop
1fff02b8:   subs r0,r4,#1
1fff02ba:   nop
1fff02bc:   subs r4,r0,#1
1fff02be:   bne delay (0x1fff02b6)      ; 0x1fff02b6
1fff02c0:   b main+0x38 (0x1fff0294)  ; 0x1fff0294
1fff02c2:   nop
          exit:

delay routine

void delay()
{
  unsigned int i, n;
  for(i=0;i<1000;i++)
  {
      asm("nop");
  }
}

egoodii
Senior Contributor III

Flash memory controls are in the FMC. For the K60, you have to watch out for 'old' mask 0M33 parts: several FMC errata like 2448, 2647 and 2671 mean you have to turn off the pre-fetch, local cache and similar 'speed-up' features. However, since you mention 120MHz, I think we can assume you are NOT on that mask.

I can't say what the details are on your particular silicon, but I imagine that at 120MHz there might be an SRAM wait-state involved. You show 5 instructions in your loop, but I don't see where register R4 is initialized. You should also be aware that in pipelined architectures like ARM, 'branch' instructions have a cost 'far in excess of the obvious', probably 3 to 5 clocks for a 'branch taken' to re-seed the pipeline. All that being said, I could see each loop taking 15 clocks. Suffice it to say that with these modern, fast processors it becomes much more difficult to just count instructions and predict performance.

I do agree, though, that you should check your CLOCK_DIV contents, then work back from the 'bus clock' through some known division (to a baud rate, for example) to confirm that your core is indeed running at 120MHz.
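
To put a number next to the per-loop estimate above, here is a rough sketch that works back from the measured GPIO time to cycles per loop pass. It assumes the 143 microsecond measurement and the 120 MHz core clock quoted earlier in the thread are both right. The function name is made up for illustration.

```c
#include <stdint.h>

/* Cycles per loop pass implied by a measured run time.
 * measured_us : wall-clock time of the whole loop, in microseconds
 * core_hz     : assumed core clock, in Hz
 * passes      : number of times the loop body actually executed */
static double cycles_per_iteration(double measured_us, uint32_t core_hz,
                                   uint32_t passes)
{
    double total_cycles = measured_us * (double)core_hz / 1e6;
    return total_cycles / (double)passes;
}
```

For 143 microseconds at 120 MHz the total is 17160 cycles, which is about 17 cycles per pass over 1000 passes, in the same range as the 15-clock estimate. The posted disassembly has two nop/subs pairs per branch, so if the compiler unrolled the loop to 500 passes, each pass would instead cost about 34 cycles.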
