DMA with 5472 - performance

I'm using DMA on a 5472 to transfer data from memory to memory, using single-buffer DMA.

I want to know what performance I can expect doing this.  I seem to get around 590 ns per long transfer.

The CPU clock input is 50 MHz, the FlexBus runs at 50 MHz, the internal bus at 100 MHz, and the core at 200 MHz.

I've tried SRAM to SRAM, both on the FlexBus and 16 bits wide; each 16-bit transfer took around 580 ns.

I've tried SDRAM to graphics memory on the FlexBus, 32 bits wide; each transfer again took around 590 ns.
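
To put those figures in perspective, here is a rough back-of-the-envelope sketch in C that converts a measured per-transfer time into FlexBus clocks and throughput. The function names are just ones I made up for illustration, and it assumes every transfer takes exactly the measured time with the FlexBus at 50 MHz.

```c
/* Rough arithmetic only: assumes each transfer takes exactly the
 * measured time. Function names are hypothetical, for illustration. */

/* Number of bus clocks consumed by one transfer.
 * transfer_ns: measured time per transfer in nanoseconds
 * bus_mhz:     bus clock in MHz (50 for the FlexBus here) */
double bus_clocks_per_transfer(double transfer_ns, double bus_mhz)
{
    /* period in ns = 1000 / MHz, so clocks = ns * MHz / 1000 */
    return transfer_ns * bus_mhz / 1000.0;
}

/* Throughput in MB/s (1 MB = 1,000,000 bytes).
 * bytes_per_transfer: 2 for a 16-bit transfer, 4 for a 32-bit one */
double throughput_mb_per_s(unsigned bytes_per_transfer, double transfer_ns)
{
    /* bytes per ns equals GB/s; multiply by 1000 for MB/s */
    return (double)bytes_per_transfer * 1000.0 / transfer_ns;
}
```

With my numbers, bus_clocks_per_transfer(590.0, 50.0) works out to 29.5 FlexBus clocks per transfer, and throughput_mb_per_s(4, 590.0) to roughly 6.8 MB/s for the 32-bit case, or about 3.4 MB/s for the 16-bit case at 580 ns.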

Is this the best I can expect?  I was hoping for much faster.

Is there anything I can do to improve performance?  Is it possible to modify the bus arbitration so that the DMA controller gets more bandwidth?