We want to measure the time taken to read and compare two large buffers in uncached memory. The buffers are allocated with the dma_alloc_coherent() kernel function. To estimate this, we need to know how long a single 64-bit register load instruction takes. Can you give us some pointers on how to measure the time a load from uncached memory takes? We are using the NXP i.MX 8MQuad Evaluation Kit.
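
To make the question concrete, here is a minimal sketch of the kind of measurement we have in mind. It is an illustration under assumptions, not a tested implementation: the helper names (compare_words, now_ns, avg_ns_per_load) are hypothetical, it runs in user space with clock_gettime(), and it times whatever buffers it is given. On the board the same loop would have to run over the dma_alloc_coherent() buffers, for example inside the driver with a kernel clock such as ktime_get_ns().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read both buffers one 64-bit word at a time and count mismatches.
 * The volatile qualifier makes the compiler emit one load per word
 * instead of merging or eliding the accesses. */
static size_t compare_words(const volatile uint64_t *a,
                            const volatile uint64_t *b,
                            size_t n_words)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < n_words; i++) {
        if (a[i] != b[i])
            mismatches++;
    }
    return mismatches;
}

/* Monotonic timestamp in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Time one full compare pass and return the average time per 64-bit
 * load in nanoseconds. Each word index costs two loads (one from each
 * buffer), hence the division by 2 * n_words. The figure also includes
 * loop and compare overhead, so it is an upper bound on the load cost. */
static double avg_ns_per_load(const volatile uint64_t *a,
                              const volatile uint64_t *b,
                              size_t n_words,
                              size_t *mismatches_out)
{
    assert(n_words > 0);

    uint64_t start = now_ns();
    size_t mismatches = compare_words(a, b, n_words);
    uint64_t end = now_ns();

    if (mismatches_out)
        *mismatches_out = mismatches;
    return (double)(end - start) / (2.0 * (double)n_words);
}
```

Calling avg_ns_per_load(buf_a, buf_b, n_words, &mismatches) over a large buffer gives an average per-load figure rather than the latency of one isolated load. Timing a single load directly is probably dominated by timer overhead and resolution, which is why the sketch averages over many loads.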