I'm working on a rather software-intensive application (a video encoder) at the moment, and am trying everything possible to optimise my code. In the process I have discovered that the on-board 32 kB SRAM does not seem to be as fast as it should be.
After writing some simple data-copying benchmarks, I have the following results:
DDR to DDR copy = 40.8 MB/s (copyback mode)
DDR to DDR copy = 79.2 MB/s (write-through mode)
SRAM to SRAM copy = 129.6 MB/s
Cache to cache copy = 399.2 MB/s
(Platform is MCF54450 at 240MHz with mobile-DDR external memory. Copies run with simple MOVEM loops - not DMA)
Now, the datasheet says the SRAM is single-cycle and sits on the processor's local high-speed bus. Should this not therefore give a bandwidth of 960 MB/s (240 MHz x 4 bytes per access), or a copy rate of 480 MB/s for 32-bit-wide memory, since a copy has to both read and write every word? I'm getting close to this sort of speed from the cache (399.2 MB/s), but the SRAM manages only just over a quarter of it (129.6 MB/s). If the SRAM is this slow, what's the point of it? As far as I can see, code or data structures will always be faster in cached external memory.
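For anyone who wants to check my arithmetic, here it is as a small C sketch. It assumes one 32-bit access per core clock, that reads and writes cannot overlap, and that 1 MB means 10^6 bytes. The function names are just ones I made up for this post, not anything from the Freescale headers.

```c
#include <assert.h>

/* Peak bus bandwidth in MB/s (1 MB = 10^6 bytes), ASSUMING one access of
   bus_bytes completes on every core clock cycle. */
unsigned peak_bandwidth_mb_s(unsigned clock_mhz, unsigned bus_bytes)
{
    return clock_mhz * bus_bytes;
}

/* A copy reads and then writes every byte, so ASSUMING reads and writes
   cannot overlap, the best possible copy rate is half the peak. */
unsigned peak_copy_mb_s(unsigned clock_mhz, unsigned bus_bytes)
{
    return peak_bandwidth_mb_s(clock_mhz, bus_bytes) / 2;
}

/* Average core cycles spent per bus-width word copied, derived from a
   measured copy rate in MB/s. The ideal under the assumptions above is 2
   (one read cycle plus one write cycle). */
double cycles_per_word(double measured_mb_s, unsigned clock_mhz,
                       unsigned bus_bytes)
{
    assert(measured_mb_s > 0.0);
    return (double)clock_mhz * bus_bytes / measured_mb_s;
}
```

With my numbers (240 MHz, 4-byte accesses) this gives the 960 MB/s and 480 MB/s figures above. Put another way, the 129.6 MB/s SRAM result works out to roughly 7.4 core cycles per longword copied, against roughly 2.4 for the cache-to-cache copy and an ideal of 2.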
I have played about with the RAMBAR settings, and am accessing the SRAM properly - not through the backdoor!
Thanks for any advice!
PS. If anyone's interested in my benchmarking routines, let me know and I'll post them.