I am doing some timing analysis on our system, which uses the IMX6ULL. Using the system tick counter, I have found that some instructions take many more cycles to execute than they should. The L1 data cache is enabled in our application (L1C_EnableDataCache();).
For example:
This C statement:
recordLength = p_record->recordLength / 4;
Assembly code of the above C statement:
80006fbe: add.w r3, sp, #24448 ; 0x5f80
80006fc2: add.w r3, r3, #20
80006fc6: ldr r3, [r3, #0]
80006fc8: ldr r3, [r3, #0]
80006fca: lsrs r2, r3, #2
80006fcc: movw r3, #4096 ; 0x1000
80006fd0: movt r3, #34419 ; 0x8673
80006fd4: str r2, [r3, #0]
Execution monitoring code:
clockTicks1 = (__MRC(15, 0, 9, 13, 0));
recordLength = p_record->recordLength / 4;
clockTicks2 = (__MRC(15, 0, 9, 13, 0));
totalTicks = clockTicks2 - clockTicks1;
The totalTicks count is consistently 200, which is about 10 times the number of cycles this statement should take. Interrupts are disabled, so this code is not being interrupted.
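One way to narrow a result like this down is to separate the cost of reading the counter from the cost of the statement itself, and to compare the first run against repeated runs. The following is only a minimal sketch in C, not the code from our application. The names measure_ticks, tick_stats, read_counter_fn and work_fn are hypothetical, and the fake counter at the bottom is a host-side stand-in with made-up costs (8, 200 and 20 ticks) so the logic can be exercised off-target. On the board, the counter hook would wrap __MRC(15, 0, 9, 13, 0) and the work hook would hold the statement under test.

```c
#include <stdint.h>

/* Hypothetical hooks: on the target, the counter hook would wrap
   __MRC(15, 0, 9, 13, 0) and the work hook would hold the statement under test. */
typedef uint32_t (*read_counter_fn)(void);
typedef void (*work_fn)(void);

typedef struct {
    uint32_t overhead; /* smallest gap between two back-to-back counter reads */
    uint32_t first;    /* first run, overhead removed (caches possibly cold) */
    uint32_t best;     /* fastest run, overhead removed */
} tick_stats;

/* Subtract without wrapping below zero. */
static uint32_t sub_floor(uint32_t a, uint32_t b)
{
    return (a > b) ? (a - b) : 0u;
}

static tick_stats measure_ticks(read_counter_fn read_counter, work_fn work, unsigned runs)
{
    tick_stats s = { UINT32_MAX, 0u, UINT32_MAX };
    unsigned i;

    /* Estimate the cost of the measurement itself. */
    for (i = 0; i < 8u; i++) {
        uint32_t t0 = read_counter();
        uint32_t t1 = read_counter();
        uint32_t gap = t1 - t0; /* unsigned subtraction tolerates one counter wrap */
        if (gap < s.overhead) {
            s.overhead = gap;
        }
    }

    /* Time the work several times; keep the first and the fastest result. */
    for (i = 0; i < runs; i++) {
        uint32_t t0 = read_counter();
        work();
        uint32_t t1 = read_counter();
        uint32_t ticks = sub_floor(t1 - t0, s.overhead);
        if (i == 0u) {
            s.first = ticks;
        }
        if (ticks < s.best) {
            s.best = ticks;
        }
    }
    if (runs == 0u) {
        s.best = 0u; /* nothing was measured */
    }
    return s;
}

/* Host-side stand-ins (assumed, illustrative values; not measurements):
   each counter read costs 8 ticks, the first run of the work costs 200,
   later runs cost 20. */
static uint32_t fake_now;
static unsigned fake_calls;

static uint32_t fake_read_counter(void)
{
    uint32_t t = fake_now;
    fake_now += 8u;
    return t;
}

static void fake_work(void)
{
    fake_now += (fake_calls++ == 0u) ? 200u : 20u;
}
```

Calling measure_ticks(fake_read_counter, fake_work, 5) with the stand-ins reports the read overhead, the first-run cost and the best-run cost separately. On the target, a large gap between first and best would suggest a cold-cache effect, while a best value that stays high would mean the cost is paid on every run.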
The application is compiled with GCC.
Can anyone shed light on why this operation is taking so long?
Hi @kevincronn,
I hope you are doing well.
Please accept my apologies for the delay in response.
Thanks & Regards,
Sanket Parekh