I'm measuring CPU load from time t1 to t2 by inspecting how many times the counter in the RTOS idle task has been incremented and comparing this value with a reference value calculated at system startup. This appears to be a common way to measure CPU load and it has worked fine in the past for simpler ARM cores.
Q: Given the complexity of the Cortex-A8 core, is this way of measuring load reliable?
I figured it should be reliable, except when the time span between the two measurements is extremely small and cache misses could skew the result. However, when I measure the load with no tasks running, I get a CPU load of up to 34%. The cause is that the idle loop runs slower than it did when the reference value was calculated, which I find quite odd.
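For reference, the idle-counter calculation described above boils down to roughly the following. This is only a minimal sketch: `cpu_load_percent`, `idle_count` and `reference_count` are hypothetical names and not taken from any particular RTOS. It assumes `reference_count` is the number of idle-loop increments seen over the same time span on a completely idle system.

```c
#include <stdint.h>

/*
 * Hypothetical helper (names are illustrative, not from a specific RTOS).
 * idle_count:      idle-task increments observed between t1 and t2
 * reference_count: increments expected over the same span on an idle system
 * Returns the estimated CPU load in percent (0..100).
 */
uint32_t cpu_load_percent(uint32_t idle_count, uint32_t reference_count)
{
    if (reference_count == 0u) {
        return 0u;                      /* no calibration data available */
    }
    if (idle_count >= reference_count) {
        return 0u;                      /* idle loop ran at least as fast as calibrated */
    }
    /* Widen to 64 bits so the multiplication by 100 cannot overflow. */
    uint64_t busy = (uint64_t)(reference_count - idle_count);
    return (uint32_t)((busy * 100u) / reference_count);
}
```

With this formula, an idle loop that only reaches 660 increments against a reference of 1000 is reported as 34% load, even if no task ran at all. Anything that slows the loop down relative to the calibration run shows up as load.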
A more accurate way is to use the performance counter inside the Cortex-A8. http://infocenter.arm.com/help/index.jsp
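As a rough sketch of what that could look like: the ARMv7-A performance monitor has a 32-bit cycle counter (PMCCNTR) that can be read through CP15 from privileged code. The helper names below (`cycle_counter_enable`, `cycle_counter_read`, `cycles_between`, `load_percent_from_cycles`) are made up for illustration. The idea of accumulating idle cycles and comparing them with the total elapsed cycles is my assumption about how to apply the counter to load measurement, not something spelled out in the ARM documentation.

```c
#include <stdint.h>

#if defined(__arm__)
/* Enable the PMU and its cycle counter. Must run in a privileged mode. */
static inline void cycle_counter_enable(void)
{
    uint32_t pmcr;
    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(pmcr));   /* read PMCR */
    pmcr |= (1u << 0) | (1u << 2);      /* E: enable counters, C: reset cycle counter */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(pmcr));  /* write PMCR */
    /* PMCNTENSET bit 31 switches on the cycle counter itself. */
    __asm__ volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(1u << 31));
}

/* Read the 32-bit cycle counter (PMCCNTR). */
static inline uint32_t cycle_counter_read(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(value));
    return value;
}
#endif

/* Elapsed cycles; unsigned subtraction stays correct across one wrap-around. */
uint32_t cycles_between(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Load in percent, given cycles spent idle and total cycles in the window. */
uint32_t load_percent_from_cycles(uint32_t idle_cycles, uint32_t total_cycles)
{
    if (total_cycles == 0u) {
        return 0u;
    }
    if (idle_cycles > total_cycles) {
        idle_cycles = total_cycles;     /* clamp against measurement jitter */
    }
    return (uint32_t)(((uint64_t)(total_cycles - idle_cycles) * 100u) / total_cycles);
}
```

The 32-bit counter wraps after a few seconds at typical Cortex-A8 clock rates, so the measurement window has to be shorter than one wrap period, or the wraps have to be counted separately.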