Cortex-M0+ performance 48MHz vs 96MHz



jh0
Contributor IV

Hi,

Let's say the CPU is running at 48 MHz and executing instructions from RAM. With the KL27 things are clear: the CPU and RAM run on the same clock. Executing str R1, [R2] takes 2 CPU cycles. If the destination is a port, it takes 3 CPU cycles.

My question is what happens if the KL27 is replaced with a KL28. If the KL28 CPU is running at 96 MHz, does the RAM run on the same clock as the CPU? How many CPU cycles will str R1, [R2] take? What if the destination is a port? Looking purely at execution of the same code from RAM, will the KL28 be 2 times faster than the KL27? And can the KL28 reach 96 MHz at 1.7 V?

Regards,

Josip

4 Replies

jingpan
NXP TechSupport

Hi,

Why do you say that str R1, [R2] takes 3 cycles if the destination is a port? Do you mean the APB bus bridge needs an additional cycle?
I guess you want to toggle GPIO as fast as possible, is that right?
1. Does the RAM run on the same clock as the CPU?
Yes, they are both in the DIVCORE clock domain.
2. How many CPU cycles will str R1, [R2] take?
It should be the same as on the KL27.
3. Looking purely at execution of the same code from RAM, will the KL28 be 2 times faster than the KL27?
Yes.
4. Can the KL28 reach 96 MHz at 1.7 V?
I think the best way to toggle GPIO is to use the zero wait state interface (IOPORT) for maximum pin performance. Through the single-cycle I/O port, str needs only 1 cycle.
And if you compare the KL27 and KL28 datasheets, you will find that the KL28 port rise/fall time is much faster than the KL27's.

Regards,

Jing

jh0
Contributor IV

Thank you for clearing things up.

Each of these port-related instructions takes 3 CPU cycles:

; ldr R3, =0400FF0C0h

str R4, [R3, #PDDR] ; 3 cycles, Dir In
str R2, [R3, #PDOR] ; 3 cycles, Data Out

ldr R2, [R3, #PDIR] ; 3 cycles, Data In

str R5, [R3, #PDDR] ; 3 cycles, Dir Out

I didn't know about IOPORT. There is only a short paragraph in the family reference manual (39.5.3), without any examples. So I only need to replace (for example) 400F_F0C0h with F800_00C0h, and the str instruction will then take fewer than 3 cycles?

Edit: OK, I found it (39.4 FGPIO memory map and register definition) in RM.

jingpan
NXP TechSupport

Hi,

But if you look at DDI0484C_cortex_m0p_r0p1_trm.pdf, load and store instructions take 2 cycles. Where does the additional cycle come from?

Regards,

Jing

jh0
Contributor IV

It takes 2 CPU cycles if the ldr source or str destination is RAM. If the source or destination is a port (0400FF0C0h), then it takes one more CPU cycle. I guess this is because the bus is running at half the CPU clock.

Thank you for pointing me to IOPORT (FGPIO). This resolved all my problems.
