I am working on a bare-metal application on the TWRK65, developing a low-level driver for DSPI. My main goal is to extract maximum throughput from SPI, so I am using eDMA. I am currently facing a problem that causes a large loss in throughput.

The SPI_PUSHR register is designed so that the lower 16 bits hold data while the upper 16 bits hold command flags. The problem is that I cannot pass the data alone to the DMA source channel for TX. I always have to iterate through the data byte by byte, set the required flags, and copy each result into another local buffer that holds both data and flags. This copy operation hits DSPI throughput hard. Without it I get 13 Mbps for a 20 MHz clock, which is very good, but a proper SPI transaction needs the flags copied along with the data, and that limits my throughput to around 3 Mbps for the same clock.

Is there any alternative way to avoid the copy operation and pass the data directly to the DMA buffers?
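For reference, the copy step described above can be sketched roughly as follows. This is only an illustration of the data-plus-flags packing, not the actual driver code: the function name `pack_pushr_words` and the macros `CMD_CONT` and `CMD_PCS0` are hypothetical, and the bit positions are assumptions that should be checked against the K65 reference manual.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical command bits in the upper 16 bits of a PUSHR word.
 * The positions below are assumptions for illustration only. */
#define CMD_CONT  (1u << 31)  /* assumed: keep chip select asserted after this frame */
#define CMD_PCS0  (1u << 16)  /* assumed: select peripheral chip select 0 */

/* Build one 32-bit PUSHR word per data byte: command flags in the upper
 * 16 bits, the data byte in the lower 16 bits. The last word has the
 * continuous-select flag cleared so chip select is released at the end. */
static void pack_pushr_words(const uint8_t *data, size_t len,
                             uint32_t cmd, uint32_t *out)
{
    for (size_t i = 0; i < len; i++) {
        uint32_t word = (cmd & 0xFFFF0000u) | data[i];
        if (i + 1 == len) {
            word &= ~CMD_CONT;  /* final frame: drop the continuous flag */
        }
        out[i] = word;
    }
}
```

The `out` buffer is what would then be handed to the eDMA channel as the TX source. Every byte costs a read, a merge, and a 32-bit write on the CPU before the transfer can start, which is the overhead behind the drop from 13 Mbps to about 3 Mbps.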