Code execution taking a large number of processor cycles on IMX6ULL

1,042 views
kevincronn
Contributor III

I am doing some timing analysis on our system, which uses the IMX6ULL. Using the system tick counter, I have found that some instructions take more cycles to execute than they should. The L1 data cache is enabled in our application (L1C_EnableDataCache();).

For example:

This C statement:

recordLength = p_record->recordLength / 4;

Assembly code of the above C statement:

80006fbe: add.w r3, sp, #24448 ; 0x5f80
80006fc2: add.w r3, r3, #20
80006fc6: ldr r3, [r3, #0]
80006fc8: ldr r3, [r3, #0]
80006fca: lsrs r2, r3, #2
80006fcc: movw r3, #4096 ; 0x1000
80006fd0: movt r3, #34419 ; 0x8673
80006fd4: str r2, [r3, #0]

Execution monitoring code:

clockTicks1 = (__MRC(15, 0, 9, 13, 0));
recordLength = p_record->recordLength / 4;
clockTicks2 = (__MRC(15, 0, 9, 13, 0));
totalTicks = clockTicks2 - clockTicks1;

 

The totalTicks count is consistently 200, which is about 10 times the number of cycles the statement should take. Interrupts are turned off, so this code is not being interrupted.
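One way to narrow this down is to subtract the cost of the two counter reads and repeat the measurement, comparing the first run against the minimum over several runs. The following is only a sketch under assumptions: TARGET_HAS_PMCCNTR is a hypothetical build flag, the host-side stub counter, the struct record layout and the helper names are illustrative, and the inline assembly is meant to read the same register as __MRC(15, 0, 9, 13, 0) above.

```c
#include <stdint.h>

/* Read the cycle counter. TARGET_HAS_PMCCNTR is a hypothetical build flag;
 * without it, a host-side stub advances 10 ticks per read. */
static uint32_t read_cycle_counter(void)
{
#ifdef TARGET_HAS_PMCCNTR
    uint32_t value;
    /* Same register as __MRC(15, 0, 9, 13, 0) in the post. */
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(value));
    return value;
#else
    static uint32_t fake_ticks;
    fake_ticks += 10u;
    return fake_ticks;
#endif
}

typedef struct {
    uint32_t first; /* first run: caches and TLB may still be cold */
    uint32_t best;  /* minimum over all runs: closest to the warm cost */
} timing_result;

/* Time work() `runs` times, subtracting the cost of the two counter reads. */
static timing_result time_repeated(void (*work)(void), unsigned runs)
{
    timing_result result = { 0u, UINT32_MAX };

    /* Two back-to-back reads give the fixed measurement overhead. */
    uint32_t start = read_cycle_counter();
    uint32_t overhead = read_cycle_counter() - start;

    for (unsigned i = 0; i < runs; i++) {
        uint32_t t0 = read_cycle_counter();
        work();
        uint32_t t1 = read_cycle_counter();
        uint32_t ticks = t1 - t0; /* unsigned subtraction survives wrap-around */
        ticks = (ticks > overhead) ? ticks - overhead : 0u;
        if (i == 0)
            result.first = ticks;
        if (ticks < result.best)
            result.best = ticks;
    }
    return result;
}

/* Stand-in for the statement being timed; the struct is illustrative. */
struct record { uint32_t recordLength; };
static struct record sample = { 100u };
static struct record *volatile p_record = &sample;
static volatile uint32_t recordLength;

static void timed_statement(void)
{
    recordLength = p_record->recordLength / 4;
}
```

Calling time_repeated(timed_statement, 16) on the target and comparing first with best may help: if the first run is far slower than the best one, the extra ticks are more likely memory or cache effects than instruction cost. If both stay near 200, the loads and the store are probably slow on every pass.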

The application is compiled with GCC.

Can anyone shed light on why this operation is taking so long?

 

 

 

1 Reply

999 views
Sanket_Parekh
NXP TechSupport

Hi @kevincronn ,

I hope you are doing well.
Please accept my apologies for the delay in response.

It could be because other background processes in Linux are using processor cycles.
 
For performance and cycle measurement, refer to the PMU (Performance Monitoring Unit) in the ARMv7 architecture.

Refer to the links mentioned below for performance measurements using the PMU.
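As a rough illustration of the PMU settings that affect a cycle-counter reading, the sketch below encodes the ARMv7 PMCR control bits and shows how the clock-divider bit changes what one tick means. The helper names are hypothetical and not part of any NXP SDK; the bit positions follow the ARMv7-A PMU register description and should be checked against the reference manual for the core in use.

```c
#include <stdint.h>

/* ARMv7 PMCR (Performance Monitor Control Register) bits. */
#define PMCR_E (1u << 0) /* enable all counters */
#define PMCR_P (1u << 1) /* reset event counters */
#define PMCR_C (1u << 2) /* reset cycle counter */
#define PMCR_D (1u << 3) /* cycle counter ticks once every 64 cycles */

/* PMCNTENSET bit that enables the cycle counter itself. */
#define PMCNTENSET_C (1u << 31)

/* How many processor cycles one counter tick represents. */
static uint32_t cycles_per_tick(uint32_t pmcr)
{
    return (pmcr & PMCR_D) ? 64u : 1u;
}

/* PMCR value that enables counting, resets the cycle counter and
 * clears the divide-by-64 bit so that one tick equals one cycle. */
static uint32_t pmcr_enable_value(uint32_t pmcr)
{
    return (pmcr | PMCR_E | PMCR_C) & ~PMCR_D;
}

/* Convert a tick delta to processor cycles for a given PMCR setting. */
static uint64_t ticks_to_cycles(uint32_t ticks, uint32_t pmcr)
{
    return (uint64_t)ticks * cycles_per_tick(pmcr);
}
```

Before interpreting a tick delta, it is worth reading PMCR back to confirm that the divider bit is in the expected state and that PMCNTENSET_C has been set in PMCNTENSET.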
 
Another easy way is to use the perf utility (perf stat) to count cycles.
To measure performance from within C source code, refer to the perf_event_open system call.
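A minimal sketch of the perf_event_open approach, assuming a Linux system with perf events enabled in the kernel and accessible to the calling user. The count_cycles and sample_work names are made up for illustration; the attribute fields and ioctl requests are those described in the perf_event_open man page.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc provides no wrapper for perf_event_open, so call it via syscall(). */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                            int cpu, int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Count user-space CPU cycles spent in work(). Returns 0 on success and
 * -1 if the counter cannot be opened or read (for example no PMU support,
 * or access restricted by perf_event_paranoid). */
static int count_cycles(void (*work)(void), uint64_t *cycles)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;       /* start stopped, enable explicitly below */
    attr.exclude_kernel = 1; /* count user-space cycles only */
    attr.exclude_hv = 1;

    /* pid 0, cpu -1: this process, on whichever CPU it runs. */
    int fd = (int)perf_event_open(&attr, 0, -1, -1, 0);
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    ssize_t n = read(fd, cycles, sizeof(*cycles));
    close(fd);
    return (n == (ssize_t)sizeof(*cycles)) ? 0 : -1;
}

/* Illustrative workload to measure. */
static volatile uint32_t sink;

static void sample_work(void)
{
    for (uint32_t i = 0; i < 1000u; i++)
        sink += i;
}
```

A caller would pass its own function, for example count_cycles(sample_work, &cycles), and check the return value before using the count. For a whole program, running it under perf stat -e cycles gives a similar figure without code changes.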
 

Thanks & Regards,
Sanket Parekh
