Running at a core clock of 120 MHz (bus clock of 60 MHz), I expected 150 MIPS (1.25 MIPS/MHz from the data sheet). Bench testing gives 75 MIPS. What have I done wrong?
How did you arrive at the 75 MIPS figure? Marketing-level MIPS numbers are often derived from the measured run time of a predefined benchmark of 'common tasks', such as Dhrystone MIPS. See section 5 of:
http://infocenter.arm.com/help/topic/com.arm.doc.dai0273a/DAI0273A_dhrystone_benchmarking.pdf
where the 1.25 MIPS/MHz figure is quoted specifically for the M3 core.
Such a figure can also count 'more than one op per instruction', as with the M4 SIMD instructions.
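To make the Dhrystone point concrete, here is a rough sketch of how a DMIPS figure is usually derived: Dhrystone iterations per second divided by 1757, the score of the VAX 11/780 reference machine. The function names and the run figures are my own placeholders, not numbers from your board or from the data sheet.

```python
# Back-of-envelope Dhrystone MIPS conversion (host-side arithmetic only).
# Function names and the sample run below are hypothetical placeholders.

VAX_DHRYSTONES_PER_SECOND = 1757  # VAX 11/780 reference score, defined as 1 DMIPS


def dmips(dhrystone_runs, elapsed_seconds):
    """Convert a timed Dhrystone run into Dhrystone MIPS."""
    return (dhrystone_runs / elapsed_seconds) / VAX_DHRYSTONES_PER_SECOND


def dmips_per_mhz(dhrystone_runs, elapsed_seconds, core_clock_hz):
    """Normalise the DMIPS score by the core clock in MHz."""
    return dmips(dhrystone_runs, elapsed_seconds) / (core_clock_hz / 1e6)


if __name__ == "__main__":
    # Placeholder measurement: number of Dhrystone loops and wall-clock time.
    runs, seconds, clock_hz = 1_000_000, 5.0, 120e6
    print("DMIPS:", dmips(runs, seconds))
    print("DMIPS/MHz:", dmips_per_mhz(runs, seconds, clock_hz))
```

Note that a figure obtained this way is not a count of native instructions executed per second, so it may not be comparable with a number measured some other way.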
Most M4 instructions take one clock cycle, but that only holds while the pipeline flows cleanly. Breaks in code flow (branches, for example) are killers, because the pipeline has to refill afterwards. Slow flash access (I assume your flash clock is 30 MHz) can also stall execution.
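As a rough illustration of how stalls eat into throughput, the sketch below uses a deliberately simple model: instruction rate equals core clock divided by the average cycles per instruction. The function names and the split between base cycles and stall cycles are my own assumptions, and the model only applies if your 75 MIPS is a count of native instructions per second.

```python
# Simple throughput model: MIPS = core clock (MHz) / average cycles per instruction.
# Names and the base/stall split are illustrative assumptions, not measured data.


def average_cycles_per_instruction(core_clock_mhz, measured_mips):
    """Average cycles each instruction costs, implied by a measured rate."""
    return core_clock_mhz / measured_mips


def effective_mips(core_clock_mhz, base_cpi, stall_cycles_per_instruction):
    """Instruction rate once average stall cycles are added to the base cost."""
    return core_clock_mhz / (base_cpi + stall_cycles_per_instruction)


if __name__ == "__main__":
    # Figures from the question: 120 MHz core clock, 75 MIPS measured.
    cpi = average_cycles_per_instruction(120.0, 75.0)
    print("implied average cycles per instruction:", cpi)
    # Hypothetical: one base cycle plus the implied average stall per instruction.
    print("modelled MIPS:", effective_mips(120.0, 1.0, cpi - 1.0))
```

Under this model, 75 MIPS at 120 MHz means each instruction costs about 1.6 cycles on average, which pipeline refills and flash wait states could plausibly account for.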