Hi Yuri,
Thanks for your reply. If I understand the description of the erratum correctly, the work-around prevents the Cortex-A9 core from entering "read allocate" (streaming) mode. The core will then perform write allocation in the L1 data cache for all write accesses, even where it is unnecessary, e.g. for memset()/memcpy(). That causes a lot of unnecessary bus traffic and cache thrashing for code that writes many full cache lines.
This will affect not only memset()/memcpy() but also quite a few drivers that work with internal buffers, software audio and video codecs, renderers that write large amounts of data to memory, probably the garbage collectors of some JVMs, and similar code in OSes, libraries and applications.
For the Cortex-A5 and A7 cores, ARM documents exactly when the core switches to "read allocate mode" (which is what the work-around would prevent):
<<<
To prevent this, the Bus Interface Unit (BIU) includes logic to detect when a full cache line has been written by the processor before the linefill has completed. If this situation is detected on three consecutive linefills, it switches into read allocate mode.
>>>
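To make sure I am reading that description correctly, here is a small C model of the rule as I understand it. It is only a sketch of the quoted A5/A7 text, not ARM's implementation: the names (biu_model, biu_on_linefill, STREAK_THRESHOLD) are made up by me, and I am assuming that a linefill which is not overtaken by a full-line write resets the count, since the text says "consecutive". The quote does not say how the core leaves read allocate mode, so the model does not cover that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the quoted A5/A7 description; not ARM's logic. */
#define STREAK_THRESHOLD 3 /* "three consecutive linefills" from the quote */

typedef struct {
    int  streak;        /* consecutive linefills overtaken by a full-line write */
    bool read_allocate; /* true once the model has switched modes */
} biu_model;

static void biu_init(biu_model *m)
{
    assert(m != NULL);
    m->streak = 0;
    m->read_allocate = false;
}

/*
 * Call once per linefill. full_line_written is true when the processor
 * wrote the whole cache line before the linefill completed.
 * Returns true once the model is in read allocate mode.
 */
static bool biu_on_linefill(biu_model *m, bool full_line_written)
{
    assert(m != NULL);
    if (full_line_written) {
        if (++m->streak >= STREAK_THRESHOLD)
            m->read_allocate = true;
    } else {
        m->streak = 0; /* assumption: a normal linefill breaks the run */
    }
    return m->read_allocate;
}
```

Under this reading, a memset() over a large buffer would trip the detector after its third cache line, and every later line would be written without a linefill.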
Unfortunately, neither ARM nor Freescale seems to document when the Cortex-A9 core would normally do this. Do you have any information on whether the A9 uses the same detection mechanism as the A5 and A7?
Kind regards,
Marc