I'm having a hard time controlling the speed of DMA on the RT1062. In general, it seems to cap out at much lower transfer rates than I expect.
I've attached my minimal example. The only other change to the project was to enable output on LPUART2 instead of LPUART1.
The behavior I'm seeing and don't quite understand is that transferring a full buffer in one major loop iteration is substantially faster than doing many iterations of just one uint32_t each. My expectation is that the two would take roughly the same amount of time.
This example demonstrates the behavior with the 'always enabled' setting on, but I've seen it with XBAR sources too.
Another question: is there any documentation on clock speeds, specifically for IPG_CLK_ROOT, which drives DMA / DMAMUX / XBAR? The clock tool in MCUXpresso doesn't let me set that clock above 150 MHz. However, if I set it to 300 MHz in code, it seems to work and does give me a speed increase on DMA, albeit with the same slowdown on multiple iterations that I see with the clock at 150 MHz.
As far as I can tell, I can transfer one uint32_t per 8 IPG_CLK_ROOT clock cycles if I do it all in one iteration, and it takes 15-16 clock cycles per uint32_t with multiple iterations.
My end goal is to transfer either 2 or 4 bytes at 30 MHz ± 1 MHz. Is this achievable with this chip? With IPG_CLK at 300 MHz, I can get _faster_ than that, but to get a steady rate I need the PIT to start a small 2- or 4-byte copy at regular intervals while still maintaining fast DMA operation.