Hi,
We are currently developing an application in which we have to copy 16384 bytes of data from shared memory (protected by a Messaging Unit) to "normal" memory. The copy is executed by the M7 co-processor at 800 MHz and takes:
- ~150us on a non-cacheable memory area (110MB/s)
- ~90us on a cacheable memory area (190MB/s)
- ~50-90us when using stack memory (200-400MB/s)
Since we are working on shared memory, the area is non-cacheable, so the copy takes 150 us. To speed it up, we tried the memory-to-memory function of the Smart Direct Memory Access Controller (SDMA). But even with that, the copy takes:
- 133 us on a non-cacheable memory area (124 MB/s)
This is slightly faster, but not substantially faster than the standard memcpy on the M7 processor. Assuming the SDMA also runs at 800 MHz, 133 us corresponds to about 6.5 clock cycles per byte, or, if we assume that the SDMA uses 32-bit transfers, about 26 clock cycles per transfer.
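As a sanity check on these figures, the arithmetic can be sketched in a few lines of C. The helper names `throughput_mb_per_s` and `cycles_per_unit` are made up for illustration and are not part of the SDK. The sketch assumes 1 MB = 10^6 bytes and takes the 800 MHz SDMA clock as an assumption rather than a measured value.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in MB/s (1 MB = 10^6 bytes) for `bytes` copied in `elapsed_us`.
 * Bytes per microsecond is numerically equal to MB/s. */
static double throughput_mb_per_s(uint32_t bytes, double elapsed_us)
{
    assert(elapsed_us > 0.0);
    return (double)bytes / elapsed_us;
}

/* Clock cycles spent per transferred unit of `unit_bytes` bytes,
 * assuming the copy engine is clocked at `clock_mhz` (an assumption here). */
static double cycles_per_unit(uint32_t bytes, double elapsed_us,
                              double clock_mhz, uint32_t unit_bytes)
{
    assert(unit_bytes > 0 && bytes >= unit_bytes);
    double total_cycles = elapsed_us * clock_mhz;      /* us * MHz = cycles */
    double units = (double)bytes / (double)unit_bytes; /* number of transfers */
    return total_cycles / units;
}
```

Plugging in 16384 bytes, 133 us and 800 MHz gives roughly 123 MB/s, about 6.5 cycles per byte and about 26 cycles per 32-bit word. With the 150 us memcpy time the same formulas give about 7.3 and 29.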
All memory areas are aligned to 32 bytes. Is there anything else that can be done to increase the SDMA's performance?
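For reference, the alignment claim can be checked at run time with a small helper like the one below. This is only a sketch: `is_aligned` is a hypothetical name, not an SDK function, and it would be called on the buffer addresses before starting the transfer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if `ptr` is a multiple of `alignment` bytes. */
static bool is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)ptr % alignment) == 0;
}
```

For example, `is_aligned(srcAddr, 32) && is_aligned(destAddr, 32)` should hold for the two buffers declared with `AT_NONCACHEABLE_SECTION_ALIGN(..., 32)`.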
Used code:
sdma_handle_t g_SDMA_Handle = {0};
volatile bool g_Transfer_Done = false;
AT_NONCACHEABLE_SECTION_ALIGN(sdma_context_data_t context, 4);
AT_NONCACHEABLE_SECTION_ALIGN(uint32_t srcAddr[BUFF_LENGTH], 32);
AT_NONCACHEABLE_SECTION_ALIGN(uint32_t destAddr[BUFF_LENGTH], 32);
int main(void) {
    ...
    /* Initialise the SDMA with the kSDMA_ARMClockFreq clock ratio. */
    SDMA_GetDefaultConfig(&userConfig);
    userConfig.ratio = kSDMA_ARMClockFreq;
    SDMA_Init(EXAMPLE_SDMAARM, &userConfig);
    /* Use channel 1 and register the completion callback. */
    SDMA_CreateHandle(&g_SDMA_Handle, EXAMPLE_SDMAARM, 1, &context);
    SDMA_SetCallback(&g_SDMA_Handle, SDMA_Callback, NULL);
    /* Memory-to-memory transfer of the whole buffer, 4 bytes wide on both sides. */
    SDMA_PrepareTransfer(&transferConfig, (uint32_t)srcAddr, (uint32_t)destAddr, sizeof(srcAddr[0]), sizeof(destAddr[0]),
                         sizeof(srcAddr[0]), sizeof(srcAddr), 0, kSDMA_PeripheralTypeMemory, kSDMA_MemoryToMemory);
    SDMA_SubmitTransfer(&g_SDMA_Handle, &transferConfig);
    SDMA_SetChannelPriority(EXAMPLE_SDMAARM, 1, 2U);
    /* Time the transfer from start until g_Transfer_Done is set
       (expected to happen in SDMA_Callback, which is not shown here). */
    g_Transfer_Done = false;
    uint32_t tStart = timerGetMicroSeconds();
    SDMA_StartTransfer(&g_SDMA_Handle);
    while (g_Transfer_Done != true) { }
    int32_t timerDiff = timerDiffMicroSeconds(timerGetMicroSeconds(), tStart);
    ...
}