Code execution taking a large number of processor cycles on IMX6ULL


kevincronn
Contributor III

I am doing some timing analysis on our system, which uses the IMX6ULL. Using the system tick counter, I have found that some instructions take more cycles to execute than they should. The L1 data cache is enabled in our application (L1C_EnableDataCache();).

For example:

This C statement:

recordLength = p_record->recordLength / 4;

Assembly code of the above C statement:

80006fbe: add.w r3, sp, #24448 ; 0x5f80
80006fc2: add.w r3, r3, #20
80006fc6: ldr r3, [r3, #0]
80006fc8: ldr r3, [r3, #0]
80006fca: lsrs r2, r3, #2
80006fcc: movw r3, #4096 ; 0x1000
80006fd0: movt r3, #34419 ; 0x8673
80006fd4: str r2, [r3, #0]

Execution monitoring code:

clockTicks1 = (__MRC(15, 0, 9, 13, 0));
recordLength = p_record->recordLength / 4;
clockTicks2 = (__MRC(15, 0, 9, 13, 0));
totalTicks = clockTicks2 - clockTicks1;

The totalTicks count is consistently 200, which is roughly 10 times the number of cycles this statement should take. Interrupts are turned off, so this code is not being interrupted.
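One thing that may be worth ruling out is the cost of the measurement itself. The sketch below is only an illustration, not code from the application: it reads the same CP15 register as __MRC(15, 0, 9, 13, 0) (PMCCNTR, the PMU cycle counter), times two back-to-back reads, and subtracts that overhead from a measured interval. The names read_cycle_counter, measurement_overhead, and elapsed_cycles are hypothetical, and the non-ARM branch is a fake counter included only so the sketch compiles off-target.

```c
#include <stdint.h>

/* Hypothetical helper: reads PMCCNTR (CP15 c9, c13, 0), the register that
 * __MRC(15, 0, 9, 13, 0) reads in the post. Assumes the PMU cycle counter
 * has already been enabled, as it evidently is in the original code. */
static inline uint32_t read_cycle_counter(void)
{
#if defined(__arm__)
    uint32_t value;
    /* ISB makes earlier instructions complete before the counter is read. */
    __asm__ volatile("isb\n\tmrc p15, 0, %0, c9, c13, 0"
                     : "=r"(value) : : "memory");
    return value;
#else
    /* Assumption: host-only stand-in so the sketch builds off-target. */
    static uint32_t fake_counter;
    fake_counter += 8u;
    return fake_counter;
#endif
}

/* Cost of two back-to-back counter reads with nothing in between. */
static inline uint32_t measurement_overhead(void)
{
    uint32_t t0 = read_cycle_counter();
    uint32_t t1 = read_cycle_counter();
    return t1 - t0;
}

/* Interval between two readings minus a known overhead. Unsigned
 * subtraction stays correct across a single 32-bit counter wraparound;
 * the result is clamped at zero if the overhead exceeds the interval. */
static inline uint32_t elapsed_cycles(uint32_t start, uint32_t end,
                                      uint32_t overhead)
{
    uint32_t raw = end - start;
    return (raw > overhead) ? (raw - overhead) : 0u;
}
```

In the monitoring code above, this would amount to calling measurement_overhead() once, then reporting elapsed_cycles(clockTicks1, clockTicks2, overhead) instead of the raw difference. If the corrected figure is still close to 200, the read overhead is not the explanation.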

The application is compiled with GCC.

Can anyone shed light on why this operation is taking so long?

Sanket_Parekh
NXP TechSupport

Hi @kevincronn ,

I hope you are doing well.
Please accept my apologies for the delay in response.

This could be because other background processes in Linux are using processor cycles.
 
For performance and cycle measurement, you can refer to the PMU (Performance Monitoring Unit) in the ARMv7 architecture.
 
You can refer to the below-mentioned links for performance measurements using the PMU.
 
Another easy way is to use the perf utility (perf stat) to count cycles.
To measure performance from within C source code, you can refer to the perf_event_open system call.
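As a rough illustration of the perf_event_open approach under Linux, the sketch below opens a hardware cycle counter for the calling thread, runs a function, and reads the count. It is a minimal sketch, not a complete solution: open_cycle_counter, count_cycles, and sample_work are hypothetical names, and the call can fail (for example when the kernel restricts perf access or no hardware counter is exposed), in which case count_cycles returns -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a user-space CPU cycle counter for the calling thread on any CPU.
 * glibc has no wrapper for perf_event_open, so syscall() is used. */
static int open_cycle_counter(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;        /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;  /* count user-space cycles only */
    attr.exclude_hv = 1;
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Run fn() with the counter enabled and store the cycle count.
 * Returns 0 on success, -1 if the counter could not be opened or read. */
static int count_cycles(void (*fn)(void), uint64_t *cycles)
{
    int fd = open_cycle_counter();
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    fn();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    ssize_t n = read(fd, cycles, sizeof(*cycles));
    close(fd);
    return (n == (ssize_t)sizeof(*cycles)) ? 0 : -1;
}

/* Hypothetical workload standing in for the code under test. */
static void sample_work(void)
{
    volatile uint32_t sink = 0;
    for (uint32_t i = 0; i < 1000u; i++)
        sink += i;
}
```

A caller would pass the function to be measured, for example count_cycles(sample_work, &cycles), and check the return value before trusting the count. Counting a single short statement this way is noisy, so it is usually better to wrap the statement in a loop and divide by the iteration count.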
 

Thanks & Regards,
Sanket Parekh
