Hello,
Accessing shared memory from multiple cores in an SMP system can be tricky. Disabling caching for shared memory regions would be possible but very slow. Using atomic operations to access shared memory regions appears more reasonable in an SMP system.
a)
We were looking for a 64-bit atomic integer load/store in the e5500 Core Reference Manual. However, this does not appear to be possible on PPC32, as the load-reserve/store-conditional instructions ldarx/stdcx. are only available on PPC64.
Is there some alternative we are missing?
b)
Are 64-bit floating point load/store operations (lfd*/stfd*) always atomic on PPC32 (e5500 core)?
There appear to be no conditional instructions for floating point load/store. How is that safely handled between multiple cores in an SMP system?
Thanks
Disabling cache wouldn't solve anything (QorIQ chips support cache coherent SMP). You'd still need atomic operations.
64-bit atomic instructions are only available on 64-bit cores. If you need to synchronize access to a 64-bit variable on a 32-bit core (or running in 32-bit mode), you need to use a lock of some sort (or eliminate the need for the atomic variable to be 64-bit).
I don't understand what you mean by "PPC32 (e5500 core)". e5500 is a 64-bit core. In any case, aligned 64-bit floating point loads/stores are atomic even on 32-bit e500mc according to the e500mc manual section 4.9.1.1 ("Restarting Instructions After Partial Execution"). The same applies on e5500. However, this is just atomic store or atomic load; there is no atomic conditional floating-point store. If you need to synchronize access to floating point data, use a lock.
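As a rough illustration of the "use a lock" advice for a 64-bit value on a 32-bit core, here is a minimal sketch in portable C11. The names `shared_u64_t`, `shared_u64_init`, `shared_u64_store` and `shared_u64_load` are made up for this example, and it assumes a toolchain with `<stdatomic.h>`; on Power, compilers typically build `atomic_flag` from the 32-bit lwarx/stwcx. pair, but that is worth checking in the generated code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical wrapper: a 64-bit value guarded by a small spinlock. */
typedef struct {
    atomic_flag lock;   /* lock word; only needs a 32-bit (or smaller) atomic */
    uint64_t    value;  /* plain 64-bit data, touched only while lock is held */
} shared_u64_t;

static void shared_u64_init(shared_u64_t *s, uint64_t v)
{
    atomic_flag_clear(&s->lock);   /* start unlocked */
    s->value = v;
}

static void shared_u64_store(shared_u64_t *s, uint64_t v)
{
    /* Spin until the flag was previously clear; acquire ordering keeps
       the protected access from moving above the lock acquisition. */
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;
    s->value = v;                  /* may be two 32-bit stores; the lock hides that */
    /* Release ordering makes the store visible before the lock is seen free. */
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

static uint64_t shared_u64_load(shared_u64_t *s)
{
    uint64_t v;
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;
    v = s->value;                  /* both halves read under the lock */
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
    return v;
}
```

Readers and writers both have to go through the lock; a reader that skips it could still observe a half-updated value.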
Well, then we have to use locks to protect read/write access to larger data types and structures in shared memory.
I take it that acquiring, spinning on, and releasing those locks would be a good fit for atomic operations.
e.g.
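typedef struct {
    bool_t   lock;      /* lock word, taken and released with atomic operations */
    char     str[256];  /* data protected by the lock */
    uint64_t u64;
    double   f64;
} shared_memory_t;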
I am not 100% clear on how cache coherency and SMP work on the QorIQ chips. I read that the cache line size of an e5500 core is 64 bytes.
Is wrapping shared data access with an atomic lock enough to ensure cache coherency of the data too?
e.g.
Once we have acquired the lock (probably by an atomicCas), do we also have to use atomic operations to manipulate the data elements in order to ensure cache coherency of the data elements?
Or can the data get manipulated by whatever means needed and the final atomic operation to release the lock (probably an atomicSet) would ensure cache coherency of the lock and data?
thanks
Yes, atomic operations (plus barriers) are used to implement locks. See appendix B of Book II in the Power ISA v2.06 for examples of how to implement locks.
Yes, the cache line size on e5500 is 64 bytes. Again, cache coherency and atomicity are completely unrelated topics. To ensure cache coherency, make sure the M bit is set on all TLB entries that map DDR.
You do not need to use atomic operations to manipulate data that is protected by a lock (otherwise what good would the lock be?), provided that you always hold the lock when you manipulate the data.
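To make that concrete, here is a hedged sketch of the struct from the question with the lock taken by a compare-and-swap and released by a plain atomic store, using C11 `<stdatomic.h>` in place of the `atomicCas`/`atomicSet` helpers mentioned above. The function names `shm_lock`, `shm_unlock` and `shm_update` are invented for this example, and the lock field is changed to `atomic_bool` as an assumption. The acquire/release orderings stand in for the barriers; the data itself is written with ordinary stores.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    atomic_bool lock;      /* false = free, true = held */
    char        str[256];  /* everything below is protected by lock */
    uint64_t    u64;
    double      f64;
} shared_memory_t;

static void shm_lock(shared_memory_t *m)
{
    bool expected = false;
    /* CAS false -> true. Acquire ordering keeps the protected accesses
       from being reordered before the lock is taken. */
    while (!atomic_compare_exchange_weak_explicit(&m->lock, &expected, true,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
        expected = false;  /* a failed CAS overwrites expected; reset it */
    }
}

static void shm_unlock(shared_memory_t *m)
{
    /* Release ordering publishes all earlier plain stores before the
       lock is observed as free by another core. */
    atomic_store_explicit(&m->lock, false, memory_order_release);
}

static void shm_update(shared_memory_t *m, const char *s, uint64_t u, double f)
{
    shm_lock(m);
    /* Ordinary, non-atomic accesses: safe because the lock is held. */
    strncpy(m->str, s, sizeof m->str - 1);
    m->str[sizeof m->str - 1] = '\0';
    m->u64 = u;
    m->f64 = f;
    shm_unlock(m);
}
```

The same rule applies to readers: they take the lock, copy out what they need with normal loads, and release it.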