Running bare metal on the i.MX28, we found that the typical access penalties for certain peripherals are higher than expected:
(these values are somewhat expected)
APBX-BUS: write: 2-3 X cycles -> 85-125ns
read: 4-6 X cycles -> 170-250ns
This would equal roughly 40-115 wait states if clocked with 454MHz (85-250ns at about 2.2ns per CPU cycle).
Debug UART:
(this one was a surprise)
APBX-BUS: read, write: 6 X cycles -> 250ns
With the 32-byte TX FIFO, this means that filling the FIFO may take around 8us (32 writes x 250ns), or around 16us if you do it in a while loop that checks the FIFO-full flag before each write (one read plus one write per byte).
Basically, using the Debug UART is like having a 4MHz peripheral with corresponding wait states.
This would equal 114 wait states if clocked with 454MHz.
SPI / SSP:
(a short lecture on the H bus)
APBH-BUS: read, write: 8 H cycles -> 50ns
Because data and address are multiplexed on the H bus, plus some clocks to cross between the different clock domains, the actual access time
is around 50ns, which corresponds to accessing a 20MHz peripheral.
This would equal 23 wait states if clocked with 454MHz.
This is why using DMA on the X and H buses is highly recommended, but DMA is not available for the Debug UART.
Another option is to separate consecutive writes (do other work in between), but that's no option for reads, since the CPU has to wait for the data.
Does anybody have experience with i.MX6 peripherals and buses?
Is the performance similar to what is described here?