Hello TomE,
The figure of 79 MB/s is a copy speed. If I benchmark read, write, and copy separately I get these figures:
Read = 136.7 MB/s
Write = 254.5 MB/s
Copy = 79.3 MB/s
This is with mobile-DDR clocked at 120MHz (16-bit wide).
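A rough cross-check of those three figures, as a sketch in Python. It assumes (my simplification, not something from the datasheet) that a copy is nothing more than a read followed by a write with no overlap, so the times per byte simply add. The function name is made up for illustration.

```python
def serialized_copy_rate(read_mb_s, write_mb_s):
    """Copy rate if every byte is read, then written, with no overlap."""
    return 1.0 / (1.0 / read_mb_s + 1.0 / write_mb_s)

# MB/s, the three measured figures quoted above
read_rate, write_rate, copy_rate = 136.7, 254.5, 79.3

estimate = serialized_copy_rate(read_rate, write_rate)
print(f"serialized estimate: {estimate:.1f} MB/s")
print(f"measured copy is {copy_rate / estimate:.0%} of the estimate")
```

Under that assumption the estimate comes out a little under 89 MB/s, so the measured 79.3 MB/s is in the same range rather than wildly off.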
Based on your figure of 10 clocks minimum for a 16-byte transfer, I make the maximum to be 192 MB/s. I can't find any timing diagrams in the datasheet, but a burst has to take fewer than 10 clocks to get a write speed of 254 MB/s; in fact, no more than about 7.5 clocks.
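The arithmetic behind those two numbers can be sketched like this. The function names are hypothetical, and "MB" is taken to mean 10^6 bytes here, which is an assumption.

```python
CLOCK_HZ = 120_000_000   # mobile-DDR clock from the post
BURST_BYTES = 16         # one 16-byte burst

def peak_rate_mb_s(clock_hz, burst_bytes, clocks_per_burst):
    """Bandwidth in MB/s if every burst takes clocks_per_burst clocks."""
    return clock_hz * burst_bytes / clocks_per_burst / 1e6

def clocks_needed(clock_hz, burst_bytes, rate_mb_s):
    """Clocks available per burst to sustain rate_mb_s."""
    return clock_hz * burst_bytes / (rate_mb_s * 1e6)

print(peak_rate_mb_s(CLOCK_HZ, BURST_BYTES, 10))               # 192.0
print(round(clocks_needed(CLOCK_HZ, BURST_BYTES, 254.5), 2))   # about 7.54
```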
I'm starting to doubt my benchmark is correct now, although I can't see anything wrong. The code is thus:
        move.l  #65536,d1           ;number of 16-byte bursts to do (= 1MB)
        moveq.l #64,d0              ;constant to add to address
        movea.l #SDRAM_SCRATCH,a0   ;somewhere to write to
        move.w  #$2700,SR           ;no interruptions please
        move.l  DTCN0,d7            ;start the clock
loop
        subq.l  #4,d1               ;dec loop counter (4 bursts) & update CCR
        movem.l d2-d5,(a0)          ;burst write
        movem.l d2-d5,16(a0)
        movem.l d2-d5,32(a0)
        movem.l d2-d5,48(a0)
        adda.l  d0,a0               ;advance address
        bgt.s   loop                ;CCR from subq earlier
        move.l  DTCN0,d0            ;stop the clock
        move.w  #$2000,SR
        sub.l   d7,d0               ;subtract start clock, d0 = microseconds to write 1MB
        move.l  #1000000000,d1      ;(one billion)
        divu.l  d0,d1               ;d1 is kB/s (or MB/s to 3 d.p.)
DTCN0 is free-running at 1MHz. I know this is right as it's used by the OS for generating delays. As you can see, I am just writing whatever garbage is in d2-d5, but that's not the point - it's a memory speed test.
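To sanity-check the bookkeeping in that loop, here is a small sketch that mirrors the counts and the final divide. The elapsed time of 3929 microseconds is a made-up value chosen to land near the 254.5 figure, not a measured tick count, and the function names are hypothetical.

```python
BURSTS = 65536           # initial value of d1
BURST_BYTES = 16         # one movem.l d2-d5 stores 4 longwords
BURSTS_PER_PASS = 4      # subq.l #4,d1 once per loop pass

passes = BURSTS // BURSTS_PER_PASS
total_bytes = BURSTS * BURST_BYTES

def reported_figure(elapsed_us):
    """Mirror of divu.l d0,d1 with d1 = 1,000,000,000 (integer divide)."""
    return 1_000_000_000 // elapsed_us

def decimal_mb_per_s(elapsed_us):
    """Bytes written per microsecond, i.e. MB/s with 1 MB = 10**6 bytes."""
    return total_bytes / elapsed_us

elapsed_us = 3929        # hypothetical timer delta, for illustration only
print(passes, total_bytes)
print(reported_figure(elapsed_us) / 1000)
print(round(decimal_mb_per_s(elapsed_us), 1))
```

The loop counts check out: 16384 passes of 64 bytes is 1,048,576 bytes. One thing the sketch makes visible is that dividing one billion by the elapsed microseconds treats those 1,048,576 bytes as exactly one "MB", so the result is in units of 2^20 bytes per second. If the 192 MB/s bound is read as 10^6 bytes per second, the same run would come out roughly 5% higher in those units, which widens the gap rather than closing it.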
Maybe I should write a DMA speed test - presumably that would give me the absolute memory speed as there'll be no instructions to slow it down.
As for copyback mode, half the speed sounds right to me for a straight copy routine, as the CPU stalls while the cache writes the old line out before it can accept a new one. In write-through mode, the write to DDR and the write to cache are done at the same time. I couldn't say whether it should be 1/2 or 1/3 of the speed, but I'm not surprised by the figures I'm getting.
Steve.