Hi,
I am trying to improve memory copy performance to and from a device attached to the PCIe interface. The processor is a T2080 with e6500 cores.
I can achieve the desired performance when I configure the PCIe BAR address space as cacheable, but due to some restrictions I must not make it
cacheable and need to transmit and receive data without caching.
I am thinking of writing a C routine that uses the Load Multiple Word instruction (LMW) for the memory copy.
Can I get a reference on how to use this instruction in a C routine for this task?
Or are there any other ideas that would make this easier?
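To make the question concrete, here is a minimal sketch of the kind of routine I have in mind. It is untested on the T2080 and rests on several assumptions: a 32-bit big-endian build with GCC-style inline assembly, a non-PIC build where the compiler allows r30 and r31 to be clobbered, and 4-byte-aligned buffers whose length is a multiple of 32 bytes. The function name copy_blocks32 is just a placeholder. I do not know whether the core actually turns lmw/stmw into larger PCIe transactions on cache-inhibited space, so this only shows the mechanics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `blocks` 32-byte blocks from src to dst.
 * Assumes both pointers are 4-byte aligned. */
void copy_blocks32(volatile uint32_t *dst, const volatile uint32_t *src,
                   size_t blocks)
{
    assert(((uintptr_t)dst & 3u) == 0 && ((uintptr_t)src & 3u) == 0);

    while (blocks--) {
#if defined(__powerpc__) && !defined(__powerpc64__) && defined(__BIG_ENDIAN__)
        /* lmw 24,0(src) loads r24..r31 (8 words) from src;
         * stmw 24,0(dst) stores r24..r31 to dst.
         * "b" keeps the pointers out of r0, and the clobber list
         * keeps them out of r24..r31 as the instructions require. */
        __asm__ volatile(
            "lmw  24, 0(%1)\n\t"
            "stmw 24, 0(%0)"
            :
            : "b"(dst), "b"(src)
            : "r24", "r25", "r26", "r27", "r28", "r29", "r30", "r31",
              "memory");
#else
        /* Portable fallback for other targets: eight word copies. */
        for (int i = 0; i < 8; i++)
            dst[i] = src[i];
#endif
        src += 8;
        dst += 8;
    }
}
```

For example, copy_blocks32(bar_ptr, local_buf, len / 32) would write len bytes to the device, where bar_ptr and local_buf are placeholders for the mapped BAR address and a local buffer.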
Thanks,
Vijay