I ran the following benchmark on a TWR-K70F120M board, using the ARM CMSIS DSP library and IAR Embedded Workbench 7.20.
Use it if you can.
Poul-Erik
Freescale TWR-K70F120M: ARM Cortex-M4, 120 MHz
----------------------------------------------------------------------------------------
Function - Time used per vector ; // input type, vector length
- arm_mult_f32 - 0.775 us ; // real float32 8
- arm_mult_f32 - 4.340 us ; // real float32 64
- arm_mult_f32 - 15.000 us ; // real float32 256
- arm_mult_f32 - 59.400 us ; // real float32 1024
- arm_mult_q31 - 0.950 us ; // real q31 8
- arm_mult_q31 - 5.280 us ; // real q31 64
- arm_mult_q31 - 16.467 us ; // real q31 256
- arm_mult_q31 - 64.800 us ; // real q31 1024
- arm_mult_q15 - 0.680 us ; // real q15 8
- arm_mult_q15 - 3.500 us ; // real q15 64
- arm_mult_q15 - 11.000 us ; // real q15 256
- arm_mult_q15 - 43.200 us ; // real q15 1024
----------------------------------------------------------------------------------------
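As a rough aid for reading the table above, the times can be converted to core clock cycles using the 120 MHz clock stated at the top of the post. The helper names below are hypothetical and not part of CMSIS or the original test project. For example, 59.400 us for 1024 float32 elements is about 7128 cycles, or roughly 7 cycles per element, while 0.775 us for 8 elements is 93 cycles, or about 11.6 cycles per element, so the fixed call overhead is visible on short vectors.

```c
#include <stdint.h>

/* Core clock taken from the post above (120 MHz). Adjust for other boards. */
#define CORE_CLOCK_HZ 120000000.0

/* Convert a measured time in microseconds to core clock cycles. */
double us_to_cycles(double time_us)
{
    return time_us * 1e-6 * CORE_CLOCK_HZ;
}

/* Average cycles spent per vector element for a vector of `length` elements. */
double cycles_per_element(double time_us, uint32_t length)
{
    return us_to_cycles(time_us) / (double)length;
}
```

Calling `cycles_per_element(59.4, 1024)` gives the per-element cost for the longest arm_mult_f32 run; the same call works for any row of the table.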
Function - Time used per sample ; // input type
- sin - 0.880 us ; // real float32
- arm_sin_f32 - 0.480 us ; // real float32
- arm_sin_q31 - 0.393 us ; // real q31_t
- arm_sin_q15 - 0.286 us ; // real q15_t
- arm_sin_cos_f32 - 1.250 us ; // real float32
- arm_sin_cos_q31 - 1.910 us ; // real q31_t
- cos - 0.885 us ; // real float32
- arm_cos_f32 - 0.513 us ; // real float32
- arm_cos_q31 - 0.420 us ; // real q31_t
- arm_cos_q15 - 0.328 us ; // real q15_t
----------------------------------------------------------------------------------------
Filter type - Time used per sample ; // input type, taps, samples
- arm_fir_f32 - 3.080 us ; // real float32 20 1
- arm_fir_f32 - 0.910 us ; // real float32 20 20
- arm_fir_f32 - 0.673 us ; // real float32 20 40
- arm_fir_f32 - 0.650 us ; // real float32 20 80
- arm_fir_f32 - 5.280 us ; // real float32 40 1
- arm_fir_f32 - 1.620 us ; // real float32 40 20
- arm_fir_f32 - 1.155 us ; // real float32 40 40
- arm_fir_f32 - 1.125 us ; // real float32 40 80
- arm_fir_f32 - 9.840 us ; // real float32 80 1
- arm_fir_f32 - 3.210 us ; // real float32 80 20
- arm_fir_f32 - 2.183 us ; // real float32 80 40
- arm_fir_f32 - 2.144 us ; // real float32 80 80
- arm_fir_q31 - 4.680 us ; // real q31_t 20 1
- arm_fir_q31 - 3.190 us ; // real q31_t 20 20
- arm_fir_q31 - 3.150 us ; // real q31_t 20 40
- arm_fir_q31 - 3.112 us ; // real q31_t 20 80
- arm_fir_q31 - 8.333 us ; // real q31_t 40 1
- arm_fir_q31 - 6.033 us ; // real q31_t 40 20
- arm_fir_q31 - 5.933 us ; // real q31_t 40 40
- arm_fir_q31 - 5.911 us ; // real q31_t 40 80
- arm_fir_q31 - 15.800 us ; // real q31_t 80 1
- arm_fir_q31 - 11.700 us ; // real q31_t 80 20
- arm_fir_q31 - 11.550 us ; // real q31_t 80 40
- arm_fir_q31 - 11.500 us ; // real q31_t 80 80
- arm_fir_fast_q31 - 6.240 us ; // real q31_t 20 1
- arm_fir_fast_q31 - 0.885 us ; // real q31_t 20 20
- arm_fir_fast_q31 - 0.785 us ; // real q31_t 20 40
- arm_fir_fast_q31 - 0.729 us ; // real q31_t 20 80
- arm_fir_fast_q31 - 11.440 us ; // real q31_t 40 1
- arm_fir_fast_q31 - 1.520 us ; // real q31_t 40 20
- arm_fir_fast_q31 - 1.330 us ; // real q31_t 40 40
- arm_fir_fast_q31 - 1.237 us ; // real q31_t 40 80
- arm_fir_fast_q31 - 21.800 us ; // real q31_t 80 1
- arm_fir_fast_q31 - 2.770 us ; // real q31_t 80 20
- arm_fir_fast_q31 - 2.417 us ; // real q31_t 80 40
- arm_fir_fast_q31 - 2.244 us ; // real q31_t 80 80
- arm_fir_q15 - 2.640 us ; // real q15_t 20 1
- arm_fir_q15 - 0.945 us ; // real q15_t 20 20
- arm_fir_q15 - 0.905 us ; // real q15_t 20 40
- arm_fir_q15 - 0.885 us ; // real q15_t 20 80
- arm_fir_q15 - 4.178 us ; // real q15_t 40 1
- arm_fir_q15 - 1.650 us ; // real q15_t 40 20
- arm_fir_q15 - 1.585 us ; // real q15_t 40 40
- arm_fir_q15 - 1.560 us ; // real q15_t 40 80
- arm_fir_q15 - 7.333 us ; // real q15_t 80 1
- arm_fir_q15 - 3.058 us ; // real q15_t 80 20
- arm_fir_q15 - 2.958 us ; // real q15_t 80 40
- arm_fir_q15 - 2.817 us ; // real q15_t 80 80
- arm_fir_fast_q15 - 2.700 us ; // real q15_t 20 1
- arm_fir_fast_q15 - 0.423 us ; // real q15_t 20 20
- arm_fir_fast_q15 - 0.385 us ; // real q15_t 20 40
- arm_fir_fast_q15 - 0.367 us ; // real q15_t 20 80
- arm_fir_fast_q15 - 4.560 us ; // real q15_t 40 1
- arm_fir_fast_q15 - 0.690 us ; // real q15_t 40 20
- arm_fir_fast_q15 - 0.635 us ; // real q15_t 40 40
- arm_fir_fast_q15 - 0.606 us ; // real q15_t 40 80
- arm_fir_fast_q15 - 8.333 us ; // real q15_t 80 1
- arm_fir_fast_q15 - 1.230 us ; // real q15_t 80 20
- arm_fir_fast_q15 - 1.133 us ; // real q15_t 80 40
- arm_fir_fast_q15 - 1.075 us ; // real q15_t 80 80
----------------------------------------------------------------------------------------
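The per-sample times in the FIR table translate directly into CPU load at a given sample rate. The sketch below assumes the "samples" column is the block size per call and uses a 48 kHz sample rate purely as an illustration; the function names are hypothetical. With those assumptions, arm_fir_f32 with 80 taps and 80 samples (2.144 us per sample) would use about 10.3 % of the CPU at 48 kHz, and arm_fir_fast_q15 with 20 taps and 80 samples (0.367 us per sample) could in principle sustain roughly 2.7 MHz if nothing else ran.

```c
/* Fraction of CPU time (0.0 to 1.0, above 1.0 means overload) consumed by a
 * filter that needs time_per_sample_us for each sample at sample_rate_hz. */
double filter_cpu_load(double time_per_sample_us, double sample_rate_hz)
{
    return time_per_sample_us * 1e-6 * sample_rate_hz;
}

/* Highest sample rate in Hz the filter alone could sustain (100 % load). */
double max_sample_rate_hz(double time_per_sample_us)
{
    return 1e6 / time_per_sample_us;
}
```

These figures ignore interrupt latency, data movement and any other processing, so they are upper bounds rather than guarantees.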
Filter type - Time used per sample ; // input type, sections, samples
- arm_biquad_cascade_df1_f32 - 0.924 us ; // real float32_t 1 1
- arm_biquad_cascade_df1_f32 - 0.195 us ; // real float32_t 1 20
- arm_biquad_cascade_df1_f32 - 0.178 us ; // real float32_t 1 40
- arm_biquad_cascade_df1_f32 - 0.170 us ; // real float32_t 1 80
- arm_biquad_cascade_df1_f32 - 1.536 us ; // real float32_t 2 1
- arm_biquad_cascade_df1_f32 - 0.377 us ; // real float32_t 2 20
- arm_biquad_cascade_df1_f32 - 0.350 us ; // real float32_t 2 40
- arm_biquad_cascade_df1_f32 - 0.336 us ; // real float32_t 2 80
- arm_biquad_cascade_df1_f32 - 2.067 us ; // real float32_t 3 1
- arm_biquad_cascade_df1_f32 - 0.553 us ; // real float32_t 3 20
- arm_biquad_cascade_df1_f32 - 0.520 us ; // real float32_t 3 40
- arm_biquad_cascade_df1_f32 - 0.502 us ; // real float32_t 3 80
- arm_biquad_cascade_df1_fast_q31 - 1.690 us ; // real q31_t 1 1
- arm_biquad_cascade_df1_fast_q31 - 0.382 us ; // real q31_t 1 20
- arm_biquad_cascade_df1_fast_q31 - 0.347 us ; // real q31_t 1 40
- arm_biquad_cascade_df1_fast_q31 - 0.330 us ; // real q31_t 1 80
- arm_biquad_cascade_df1_fast_q31 - 2.850 us ; // real q31_t 2 1
- arm_biquad_cascade_df1_fast_q31 - 0.740 us ; // real q31_t 2 20
- arm_biquad_cascade_df1_fast_q31 - 0.683 us ; // real q31_t 2 40
- arm_biquad_cascade_df1_fast_q31 - 0.655 us ; // real q31_t 2 80
- arm_biquad_cascade_df1_fast_q31 - 4.000 us ; // real q31_t 3 1
- arm_biquad_cascade_df1_fast_q31 - 1.095 us ; // real q31_t 3 20
- arm_biquad_cascade_df1_fast_q31 - 1.015 us ; // real q31_t 3 40
- arm_biquad_cascade_df1_fast_q31 - 0.975 us ; // real q31_t 3 80
- arm_biquad_cascade_df1_q31 - 1.076 us ; // real q31_t 1 1
- arm_biquad_cascade_df1_q31 - 0.205 us ; // real q31_t 1 20
- arm_biquad_cascade_df1_q31 - 0.180 us ; // real q31_t 1 40
- arm_biquad_cascade_df1_q31 - 0.169 us ; // real q31_t 1 80
- arm_biquad_cascade_df1_q31 - 1.647 us ; // real q31_t 2 1
- arm_biquad_cascade_df1_q31 - 0.383 us ; // real q31_t 2 20
- arm_biquad_cascade_df1_q31 - 0.347 us ; // real q31_t 2 40
- arm_biquad_cascade_df1_q31 - 0.332 us ; // real q31_t 2 80
- arm_biquad_cascade_df1_q31 - 2.220 us ; // real q31_t 3 1
- arm_biquad_cascade_df1_q31 - 0.563 us ; // real q31_t 3 20
- arm_biquad_cascade_df1_q31 - 0.512 us ; // real q31_t 3 40
- arm_biquad_cascade_df1_q31 - 0.493 us ; // real q31_t 3 80
----------------------------------------------------------------------------------------
Function - Time used ; // input type, FFT length
- arm_cfft_radix2_q15 - 40.8 us ; // real q15_t 64
- arm_cfft_radix2_q15 - 199.0 us ; // real q15_t 256
- arm_cfft_radix2_q15 - 940.0 us ; // real q15_t 1024
- arm_cfft_radix2_q31 - 119.6 us ; // real q31_t 64
- arm_cfft_radix2_q31 - 608.3 us ; // real q31_t 256
- arm_cfft_radix2_q31 - 2970.0 us ; // real q31_t 1024
- arm_cfft_radix2_f32 - 62.5 us ; // real float32_t 64
- arm_cfft_radix2_f32 - 327.0 us ; // real float32_t 256
- arm_cfft_radix2_f32 - 1615.0 us ; // real float32_t 1024
- arm_cfft_radix4_q15 - 28.5 us ; // real q15_t 64
- arm_cfft_radix4_q15 - 150.5 us ; // real q15_t 256
- arm_cfft_radix4_q15 - 755.0 us ; // real q15_t 1024
- arm_cfft_radix4_q31 - 74.6 us ; // real q31_t 64
- arm_cfft_radix4_q31 - 411.7 us ; // real q31_t 256
- arm_cfft_radix4_q31 - 2100.0 us ; // real q31_t 1024
- arm_cfft_radix4_f32 - 41.0 us ; // real float32_t 64
- arm_cfft_radix4_f32 - 221.0 us ; // real float32_t 256
- arm_cfft_radix4_f32 - 1110.0 us ; // real float32_t 1024
----------------------------------------------------------------------------------------
Original Attachment has been moved to: DSP_pre_test.zip
Hi Poul,
I am working on an ARM Cortex-M4 and was looking for an FPU benchmark test. I found your test and downloaded it. Thanks for posting it here.
When I tried to compile it, I found that the functions _time_get() and _time_diff() and the structure type TIME_STRUCT are not declared in the headers you included (math.h, arm_math.h). Where are they defined? Are they custom functions that you wrote?
Also, have you compared the figures shown here against any reference values?
Please clarify.
Thanks,
Best regards,
Rama
This is a bit late, but TIME_STRUCT is actually part of the MQX RTOS.
It is used as an object tied to the system 32-bit clock, iirc.
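For anyone without MQX, here is a minimal sketch of how sub-microsecond figures can be obtained from a coarse timestamp: time a large number of calls and divide. This is not confirmed to be how DSP_pre_test.zip does it. The `bench_time_t` type and both functions are hypothetical stand-ins, and the seconds-plus-milliseconds layout is an assumption; check the MQX headers for the real TIME_STRUCT declaration.

```c
#include <stdint.h>

/* Hypothetical stand-in for an RTOS timestamp (assumed layout:
 * whole seconds plus milliseconds within the second). */
typedef struct {
    uint32_t seconds;
    uint32_t milliseconds;
} bench_time_t;

/* Elapsed milliseconds from start to end. Unsigned arithmetic wraps, so the
 * result is correct as long as the true difference fits in 32 bits. */
uint32_t bench_elapsed_ms(const bench_time_t *start, const bench_time_t *end)
{
    uint32_t start_ms = start->seconds * 1000u + start->milliseconds;
    uint32_t end_ms   = end->seconds * 1000u + end->milliseconds;
    return end_ms - start_ms;
}

/* Average time per call in microseconds over `iterations` timed calls. */
double bench_us_per_call(uint32_t elapsed_ms, uint32_t iterations)
{
    return (double)elapsed_ms * 1000.0 / (double)iterations;
}
```

For example, if 100000 calls take 1200 ms in total, `bench_us_per_call(1200, 100000)` reports 12 us per call. The loop overhead is included in that figure, so it is worth timing an empty loop as well and subtracting it.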