Force variable to Register


2,853 Views
gbaars
Contributor I
Is it possible to declare a variable (in C) assigned
not to RAM but to a processor register so it
runs faster?
0 Kudos
12 Replies

775 Views
gbaars
Contributor I
The 38 MIPS was measured with a loop
of 1-cycle instructions and a branch.
Averaged over the entire instruction set,
the AVR needs far fewer cycles per instruction
than the MCF51, so even running at
50 MHz its performance will be less than
that of a 20 MHz tiny AVR.
 
0 Kudos

775 Views
RichTestardi
Senior Contributor II
Hi,
 
OK, I'll have to take your word for that since I have no experience at all with AVR.
 
I'm not sure what the best way is to "average" an instruction set, since the majority of code uses only a small subset of the instruction set.
 
(I admit I am partial to the ColdFire processors!  Even from the 68k days!  So I am glad you're at least trying them!!!)
 
-- Rich
 
0 Kudos

775 Views
gbaars
Contributor I
Maybe not in that case, if the figure is
0.76 Dhrystone 2.1 MIPS per MHz.
The fastest instructions take 1 cycle, and as (some sort of) average
1/0.76 = 1.3 cycles/instruction. A lot of instructions, though, take 4 or more.
 
The goal was to evaluate the DEMOJM, which came as a gift,
and the ColdFire CPU. I am beginning to think an XMEGA AVR
running a true 32 MIPS, programmed in ASM, might
perform at a much faster average than a V1 ColdFire
programmed in C. MIPS are important because my applications
involve DSP on audio.
0 Kudos

775 Views
RichTestardi
Senior Contributor II
Hi,
 
I have no reason to believe you are not seeing a true 38 (real) MIPS...  I suspect your code may be running from flash, and written in C as well.  So that seems reasonable.
 
I don't understand why you think a 32 MIPS processor (which is only 8/16 bit) might be faster?  I'd probably be surprised if that was the case.
 
I think the real trick will be coding your performance loop in assembly (whether using ColdFire *or* AVR) -- I'd be surprised if you could not get a factor of 2 speedup there...
 
Good luck!!!
 
-- Rich
 
0 Kudos

775 Views
gbaars
Contributor I
If the 0.76 from the previous overview (0.76 DMIPS/MHz)
is multiplied by 50 MHz, the result
is 38 (DMIPS?). This may or may not be a coincidence.
I measured 38 MIPS with the 1000x loop consisting of
23 mainly 1-cycle asm instructions.
If 2.1 MIPS/MHz is possible, this could be due
to executing on both positive and negative
edges of the mcuclk.
Each time I run the debugger the command
window finally shows:
 
RESET
done .\cmd\CFV1_BDM_P&E_Multilink_CyclonePro_postload.cmd
 
Postload command file correctly executed.
main 0x410 T
Frequency change to ~24056572hz.
STARTED
RUNNING
Breakpoint
 
in>
 
This overrides the settings made with Processor
Expert. If the setting was 20 MHz, this is also
approximately halved, to 10 MHz (?).
0 Kudos

775 Views
RichTestardi
Senior Contributor II
Oh!  I don't think 2.1 MIPS/MHz is possible on that core...
 
I believe 2.1 is actually the Dhrystone version.
 
The line in the data sheet says:

Provides 0.94 Dhrystone 2.1 MIPS per MHz
performance when running from internal RAM
(0.76 DMIPS/MHz from flash)

Which I interpret as meaning 0.94 DMIPS/MHz (DMIPS are "Dhrystone 2.1 MIPS ").
 
A DMIP is *almost* a real MIP, but usually on the optimistic side.
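
To put numbers on that reading of the data sheet line, here is a minimal sketch in C. The function name dmips_estimate is made up for illustration, and the 50 MHz clock is the setting mentioned elsewhere in this thread. It only multiplies the quoted DMIPS/MHz figure by the clock, so treat the result as an optimistic ceiling, not a measurement.

```c
/* Hypothetical helper (not from any Freescale header): scale a data sheet
 * DMIPS/MHz rating by the core clock in MHz to get a rough DMIPS ceiling. */
double dmips_estimate(double dmips_per_mhz, double core_mhz)
{
    return dmips_per_mhz * core_mhz;
}
```

With the data sheet values, dmips_estimate(0.94, 50.0) is about 47 DMIPS when running from RAM and dmips_estimate(0.76, 50.0) is about 38 DMIPS from flash. Nothing in that line suggests 2.1 x 50.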
 
0 Kudos

775 Views
gbaars
Contributor I
The processor is an MCF51QE128CLK (80 pins).
The internal clock is set with processor expert
to 50 MHz and busclk divider is 2.
quote:
• 32-Bit Version 1 ColdFire® Central Processor Unit (CPU)
– Up to 50.33-MHz ColdFire CPU from 3.6V to 2.1V, and
20-MHz CPU at 2.1V to 1.8V across temperature range
of -40°C to 85°C
– Provides 0.94 Dhrystone 2.1 MIPS per MHz
performance when running from internal RAM
(0.76 DMIPS/MHz from flash)
unquote
I am not sure how DMIPS translate to MIPS, but I would
like to see it run as fast as 2.1 x 50 MIPS from RAM.
Can CodeWarrior do this, or does it have to be done in code?
0 Kudos

775 Views
RichTestardi
Senior Contributor II
Hi,
 
I'll start with the disclaimer that I'm a ColdFire performance novice!
 
If I understand everything you are saying correctly (you execute 23000 instructions in 600 us running at 50 MHz), it looks like you're running at about 1.3 clocks-per-instruction on a V1 core.
 
  (600 us x 50e6 clocks/sec) / 23000 instructions
  = 1.3043478 clocks/instruction
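
The same arithmetic as a small C sketch, in case it is handy for other measurements. The function names are my own, and the inputs are just the numbers reported in this thread (600 us, 50 MHz core clock, 23000 instructions).

```c
/* Clocks per instruction for a timed loop.
 * elapsed_s:    measured run time in seconds
 * clock_hz:     core clock in Hz
 * instructions: number of instructions executed in that time */
double clocks_per_instruction(double elapsed_s, double clock_hz,
                              double instructions)
{
    return (elapsed_s * clock_hz) / instructions;
}

/* Millions of instructions per second for the same measurement. */
double measured_mips(double elapsed_s, double instructions)
{
    return instructions / elapsed_s / 1e6;
}
```

Calling clocks_per_instruction(600e-6, 50e6, 23000.0) gives roughly 1.30, and measured_mips(600e-6, 23000.0) gives roughly 38.3, which matches the 38 MIPS you measured.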
 
From the reference manual (7.3.4 Instruction Execution Timing) clocks-per-instruction tables that does not seem unreasonable.
 
From the datasheet, I'd interpret the .94 DMIPS/MHz (i.e., 1.06 clocks-per-"sort-of instruction" ) as being a "not to exceed" number -- my experience with DMIPS is they are usually optimistic compared to real workloads, as the "instructions" they measure are not real instructions -- see http://en.wikipedia.org/wiki/DMIPS#Criticisms
 
It might be that you have to recode your 23-instruction loop in assembly to go faster (i.e., to use fewer instructions)...  Or maybe you have already done that?
 
-- Rich
0 Kudos

775 Views
gbaars
Contributor I
-err that is busclk = 25 MHz.
0 Kudos

775 Views
RichTestardi
Senior Contributor II
What processor are you using?
 
0 Kudos

775 Views
gbaars
Contributor I
Thanks for the reply. The CW version is 5.9.0.
With register, the code does indeed compile
to use registers, avoiding slower RAM.
With a 1000x for loop containing 23 asm
instructions and an RPGIO_TOG, 600 us is
measured for 23000 asm instructions.
This works out to nearly 38 MIPS
(coreclk = 50 MHz, busclk 100 MHz internal).
Is nearly 100 MIPS somehow possible?
0 Kudos

775 Views
RichTestardi
Senior Contributor II
Hi,
 
You can use the "register" keyword to suggest (not force) that the compiler
put the variable in a register.  But this will depend not only on the optimization
level you choose (under project settings -> Code Generation -> Global
Optimizations), but for CW7.0 at least, also possibly on if you enable "Register
Coloring"  (under project settings -> Code Generation -> ColdFire Processor).
 
To use the "register" keyword is straightforward -- declare a variable like:
 
    register int i;
 
Then build and check your disassembly (right-click on the file and select
"Disassemble" ).
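
For something a bit more concrete to disassemble, here is a minimal sketch in standard C. The function sum_samples and its arguments are invented for illustration and are not from any CodeWarrior example. Whether acc and i really land in registers still depends on the compiler and its settings, so the disassembly is the only real proof.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum a buffer of 16-bit samples.
 * 'register' is only a hint: the compiler is free to ignore it, and with
 * optimization enabled it will often keep these locals in registers anyway.
 * One guaranteed effect: you cannot take the address of a register variable. */
int32_t sum_samples(const int16_t *samples, size_t count)
{
    register int32_t acc = 0;  /* hint: keep the accumulator in a register */
    register size_t i;         /* hint: keep the loop counter in a register */

    for (i = 0; i < count; i++) {
        acc += samples[i];
    }
    return acc;
}
```

If the loop body in the disassembly still loads and stores acc through the stack frame, the hint was not taken and the optimization settings are the next thing to look at.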
 
In general, I would also add that if you turn up the optimization level and enable
Register Coloring (for CW7.0), you probably will find you don't need to tell the
compiler to use a register -- it will often do so on its own.  However, this really
depends on the exact version of CodeWarrior you are using (OK, and I even
*assumed* you were using Code Warrior!).
 
Good luck.
 
-- Rich
 
0 Kudos