Assembly to machine code timing

ThomasG
Contributor I
I have a question about how long an assembly operation actually takes to execute on the S12X core processor.  I am using an MC9S12XDP512 chip with a 20 MHz oscillator.
 
I have a process that is interfacing with an FPGA and speed is an issue.  I placed a testbench signal that raises a pin high before this process starts and is set low once the process is done.  When looking at this pin on the scope, the process took much longer than I expected.
 
After this, I placed the same testbench signal around two assembly statements.  The signal was high for approximately 1 microsecond, which seems very long.
 
Anyway, I know that assembly to machine code is not a direct 1-to-1 translation, but this timing seems more like a 1-to-10 relationship.  Is this correct, or am I missing something?
 
Finally, the clock and all clock monitors in the CRG are good and reporting no errors with the clock (I thought this might be caused by entering the self clock mode but this is not the case).
 
Thanks,
 
Thomas

mke_et
Contributor IV
When you say 'much longer than expected', just what did you expect?

Did you sit down with a listing, look up the execution time (in cycles) for each instruction, and add them together appropriately? (Remember, conditional branches are variable and depend on the path taken.) Once you had the total cycle count, did you convert it to time based on your clock configuration?
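As a rough sketch of that bookkeeping: the snippet below adds up per-instruction cycle counts, tracks the best and worst case for a conditional branch, and converts the totals to time. The listing and its cycle counts are illustrative placeholders, not taken from Thomas's code; look up the real values for each instruction and addressing mode in the CPU reference manual. It also simply sums every line, so it ignores any instructions a taken branch would skip.

```python
# Hypothetical listing: (mnemonic, cycles if branch not taken, cycles if taken).
# The counts are placeholders for illustration; check the reference manual.
LISTING = [
    ("LDAA", 1, 1),
    ("BSET", 4, 4),
    ("BNE", 1, 3),   # conditional branch: cost depends on the path taken
    ("BCLR", 4, 4),
]


def cycle_range(listing):
    """Return (best case, worst case) total bus cycles for the listing."""
    best = sum(min(not_taken, taken) for _, not_taken, taken in listing)
    worst = sum(max(not_taken, taken) for _, not_taken, taken in listing)
    return best, worst


def cycles_to_us(cycles, bus_hz):
    """Convert a bus cycle count to microseconds at the given bus clock."""
    return cycles * 1e6 / bus_hz


if __name__ == "__main__":
    bus_hz = 10_000_000  # assumed bus clock; depends on your clock configuration
    best, worst = cycle_range(LISTING)
    print(f"{best}..{worst} cycles = "
          f"{cycles_to_us(best, bus_hz):.2f}..{cycles_to_us(worst, bus_hz):.2f} us")
```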

I've found the Star12 stuff to be faster than snot. (Can that be a new technical standard? Speeds of snot 1 and snot 2...)

Seriously, assembly IS machine code. When you program in assembler, you're just using the mnemonics for the individual machine code equivalents. It IS 1:1. However, you may not fully understand what that '1' actually is.

Mike

kef
Specialist I

I have a question about how long an assembly operation actually takes to execute on the S12X core processor.  I am using an MC9S12XDP512 chip with a 20 MHz oscillator.
 
Without the PLL, the bus clock frequency is oscillator frequency / 2 = 10 MHz.
 
I have a process that is interfacing with an FPGA and speed is an issue.  I placed a testbench signal that raises a pin high before this process starts and is set low once the process is done.  When looking at this pin on the scope, the process took much longer than I expected.
 
And what did you expect? Also, are you using a PRU port for this? PRU ports are slower than non-PRU ports. The PRU ports are PORTA, B, C, D, E and K. They are listed in the PRR Listing table in the datasheet.
 
After this, I placed the same testbench signal around two assembly statements.  The signal was high for approximately 1 microsecond, which seems very long.
 
What instructions did you use for this, BSET and BCLR? Check the S12XCPUV1 Reference Manual. BSET/BCLR normally take 4 cycles each, either "rPwO" or "rPwP". Each instruction contains two data access cycles, 'r' and 'w'. An access to a non-PRU port takes 1 bus cycle, while an access to a PRU port takes 2 bus cycles. So this code:
 
   BSET port,#mask
   BCLR port,#mask
 
takes 10 cycles if the port is a PRU port and 8 cycles if it is not. At 10 MHz, 10 cycles take 1 us and 8 cycles take 0.8 us.
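To tie this back to the scope reading, the measurement can be turned into a cycle count and compared with the figures above. This is only a sketch: the function and dictionary names are made up for illustration, the bus clock assumes the PLL is not engaged, and the 10 and 8 cycle figures are the ones quoted in this reply.

```python
OSC_HZ = 20_000_000       # oscillator from the original post
BUS_HZ = OSC_HZ // 2      # assumes the PLL is not used

# Cycle counts for the BSET/BCLR pair as quoted in this reply.
EXPECTED_CYCLES = {"PRU port": 10, "non-PRU port": 8}


def cycles_from_pulse(pulse_seconds, bus_hz=BUS_HZ):
    """Convert a pulse width measured on the scope into bus cycles."""
    return round(pulse_seconds * bus_hz)


def matching_cases(pulse_seconds, bus_hz=BUS_HZ):
    """Return the port types whose cycle count matches the measured pulse."""
    measured = cycles_from_pulse(pulse_seconds, bus_hz)
    return [name for name, cycles in EXPECTED_CYCLES.items() if cycles == measured]


if __name__ == "__main__":
    pulse = 1e-6  # roughly 1 microsecond, as measured
    print(cycles_from_pulse(pulse), "cycles, consistent with:", matching_cases(pulse))
```

A reading of about 1 us at a 10 MHz bus clock works out to 10 cycles, which is what the PRU port case predicts, so the measurement looks consistent rather than anomalous.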
 
Anyway, I know that assembly to machine code is not a direct 1-to-1 translation, but this timing seems more like a 1-to-10 relationship.  Is this correct, or am I missing something?
 
1) Use the PLL to raise the bus frequency to 40 MHz; your CPU will then run 4 times faster.
2) Learn the instruction cycle notation. Simple store instructions take half as long as set bit and clear bit instructions.
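For a feel of what the PLL buys you, here is a small calculation. It assumes the usual S12X CRG relationship PLLCLK = 2 x OSCCLK x (SYNR + 1) / (REFDV + 1) with bus clock = PLLCLK / 2; that formula and the register values below are my assumption and not something stated in this thread, so verify them against the CRG chapter of the datasheet before programming anything.

```python
def pll_bus_hz(osc_hz, synr, refdv):
    """Bus clock with the PLL selected.

    Assumed formula: PLLCLK = 2 * OSCCLK * (SYNR + 1) / (REFDV + 1),
    bus clock = PLLCLK / 2. Check the CRG chapter before relying on it.
    """
    pll_hz = 2 * osc_hz * (synr + 1) // (refdv + 1)
    return pll_hz // 2


def sequence_us(cycles, bus_hz):
    """Execution time in microseconds for a given number of bus cycles."""
    return cycles * 1e6 / bus_hz


if __name__ == "__main__":
    osc_hz = 20_000_000
    no_pll_bus = osc_hz // 2                                # 10 MHz without the PLL
    pll_bus = pll_bus_hz(osc_hz, synr=1, refdv=0)           # hypothetical settings
    print("speedup:", pll_bus / no_pll_bus)
    print("10 cycles without PLL:", sequence_us(10, no_pll_bus), "us")
    print("10 cycles with PLL:   ", sequence_us(10, pll_bus), "us")
```

With those assumed settings, the same 10-cycle sequence drops from 1 us to 0.25 us.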
 