P2020 PCIE Device DMA data corruption


2,018 Views
ramasubramanian
Contributor II

I am using the streaming DMA API in my PCIe device driver for DMA transfers.

We use pci_dma_sync_single_for_device and pci_dma_sync_single_for_cpu to hand over the buffer to the device and the CPU, respectively.  We assume this ensures the necessary cache invalidation and flushing according to the direction of transfer.

But we still see data corruption.  When I reduced the DMA buffer size, the first occurrence of the problem was delayed by about a day, which suggests a cache invalidation/flushing issue.
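For reference, a minimal sketch of the streaming-DMA pattern described above, targeting the 3.x-era pci_* API; the function name, RX direction, and error handling are illustrative, and the device-programming step is elided:

```c
#include <linux/pci.h>
#include <linux/dma-mapping.h>

/* Hypothetical RX path: map, hand to device, DMA, hand back to CPU. */
static int example_dma_rx(struct pci_dev *pdev, void *buf, size_t len)
{
	dma_addr_t handle;

	/* Map the buffer for device-to-CPU (RX) streaming DMA. */
	handle = pci_map_single(pdev, buf, len, PCI_DMA_FROMDEVICE);
	if (pci_dma_mapping_error(pdev, handle))
		return -ENOMEM;

	/* Hand the buffer to the device before starting the transfer. */
	pci_dma_sync_single_for_device(pdev, handle, len, PCI_DMA_FROMDEVICE);

	/* ... program the device with 'handle' and wait for completion ... */

	/* Hand the buffer back to the CPU before touching the data. */
	pci_dma_sync_single_for_cpu(pdev, handle, len, PCI_DMA_FROMDEVICE);

	pci_unmap_single(pdev, handle, len, PCI_DMA_FROMDEVICE);
	return 0;
}
```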

Could you please let us know the correct API usage for the architecture?

Regards

Rams

19 Replies

scottwood
NXP Employee

p2020 has coherent DMA and thus the DMA sync functions will not do any flushing (you should still call them, though).  Check that the PCIe controller has been configured with snooping enabled.

Is there any chance that it's an ordering problem rather than coherency?  Is the driver using proper I/O accessors to ensure that barriers are used?  What happens if you insert manual flushing?

ramasubramanian
Contributor II

Hi

You had mentioned in your earlier reply: "p2020 has coherent DMA and thus the DMA sync functions will not do any flushing (you should still call them, though).  Check that the PCIe controller has been configured with snooping enabled."

Do you mean that it is required to have snooping enabled in both directions?  Right now, snooping is enabled from the PCIe device to the CPU but not from the CPU to the PCIe device.  Will that be an issue?

Regards

Rams

scottwood
NXP Employee

In theory only the PCIe end should need to snoop.  But, there have been some other chips where this was not the case, so it wouldn't hurt to try enabling snooping on the CPU side -- i.e. the M bit in the TLB entry, which in Linux can be accomplished by using an SMP kernel.  Since this is a two-core chip, why aren't you already using an SMP kernel?  Or if you are, what are you seeing that makes you say that snooping is not currently enabled from the CPU side?

ramasubramanian
Contributor II

The PCIe controller on the device has snooping enabled in one direction only, i.e. device to CPU; "no snoop" is the default for CPU-to-device transactions.

Thanks

Rams

scottwood
NXP Employee

Could you please explain precisely what register setting or equivalent you're referring to by "no snoop is default for CPU to the device"?

ramasubramanian
Contributor II

Hi,

Sorry for the confusion.  I meant that the PCIe controller core of the device (not the CPU) has snooping enabled for transactions from the device to the CPU, and no-snoop for transactions from the CPU to the device.

Thanks

Rams

scottwood
NXP Employee

That doesn't answer my question.  I only see one "snoop" setting in the PCIe controller.  The PCIe device should not be caching memory (except perhaps under explicit software control), so there's nothing to snoop.

ramasubramanian
Contributor II

We are using an SMP kernel, which means snooping is enabled, right?

Thanks

Rams

scottwood
NXP Employee

Yes, on an SMP kernel the M bit will be set in TLB entries that map normal RAM.

ramasubramanian
Contributor II

Hi,

Thanks a lot for your response.

Snooping is enabled in the PCIe controller.  Regarding manual cache flushing, clean_dcache_range() and flush_dcache_range() do not appear to be available for e500.

1) Could you please provide the correct API for manual flushing and invalidation of the cache?

2) I am not ruling out an ordering problem.  The driver does not use any barriers.  Could you please let me know the recommended barriers for the P2020?  I tried using wmb()/eieio()/mmiowb() before and after the DMA transfer; it did not change the behaviour.
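For what it's worth, the classic place a barrier is needed is between descriptor writes in DMA-able memory and the MMIO write that starts the transfer.  A hedged sketch, with the descriptor layout and doorbell offset entirely hypothetical (on powerpc, writel() itself also issues a sync, so the explicit wmb() here is belt-and-braces):

```c
#include <linux/io.h>
#include <linux/types.h>

/* Hypothetical descriptor; only the barrier placement is the point here. */
struct dma_desc {
	__le32 addr;
	__le32 len;
};

static void start_dma(struct dma_desc *desc, void __iomem *regs,
		      dma_addr_t handle, u32 len)
{
	desc->addr = cpu_to_le32(handle);  /* plain stores to coherent memory */
	desc->len  = cpu_to_le32(len);

	wmb();  /* order the descriptor writes before the doorbell MMIO write */

	writel(1, regs + 0x10);  /* hypothetical "start" doorbell register */
}
```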

Thanks

Rams

scottwood
NXP Employee

Why do you say that flush_dcache_range() and clean_dcache_range() are not available?

For a PCI driver you want to use writel()/readl(), or equivalent for other sizes.  These will contain the needed barriers (a sync before all accesses, and an isync after loads).
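To illustrate: on powerpc, writel() and readl() expand to a sync before the access (and an isync after loads), so a register sequence like the following sketch needs no extra eieio()/mmiowb() around the accessors themselves.  The register offsets are made up for illustration; regs would come from pci_iomap():

```c
/* Hypothetical DMA engine programming; offsets are illustrative only. */
writel(lower_32_bits(handle), regs + 0x00);  /* DMA address register */
writel(len,                   regs + 0x04);  /* DMA length register  */
writel(1,                     regs + 0x08);  /* start bit            */

while (!(readl(regs + 0x0c) & 0x1))          /* poll a "done" bit    */
	cpu_relax();
```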

ramasubramanian
Contributor II

Hi,

Thanks for the response.

Sorry, I am able to use flush_dcache_range() after all.  The problems are only with clean_dcache_range() and invalidate_dcache_range().

When I compile the driver  with clean_dcache_range(), I get a WARNING: "clean_dcache_range" undefined!

I have replaced all the iowrite32() and ioread32() calls with writel() and readl().

I also call flush_dcache_range() before and after DMA operations when the direction is DMA_TO_DEVICE: prior to the DMA transfer, a flush_dcache_range() after pci_map_single(), and after the transfer, a flush_dcache_range() before pci_unmap_single().
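The diagnostic flushing described above would look roughly like this sketch; buffer and length names are illustrative, and on powerpc flush_dcache_range() takes start and stop addresses:

```c
#include <linux/pci.h>
#include <asm/cacheflush.h>

dma_addr_t handle = pci_map_single(pdev, buf, len, PCI_DMA_TODEVICE);

/* Diagnostic only: force the buffer out of the data cache before DMA. */
flush_dcache_range((unsigned long)buf, (unsigned long)buf + len);

/* ... device performs the DMA read of 'buf' ... */

/* Diagnostic only: flush again before unmapping/reusing the buffer. */
flush_dcache_range((unsigned long)buf, (unsigned long)buf + len);

pci_unmap_single(pdev, handle, len, PCI_DMA_TODEVICE);
```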

I see a minor improvement; the frequency of occurrence seems slightly reduced, but I still see the problem.

Please let me know your thoughts.

thanks

Rams

scottwood
NXP Employee

I don't see anything in the code that would explain why you can't call clean_dcache_range() or invalidate_dcache_range() -- what kernel are you using?  Though, remember that this is just for trying to figure out whether it's a cache issue.  It's not a permanent solution.  flush_dcache_range() should be sufficient for that purpose (and invalidate_dcache_range() can be dangerous).  Since you say you still see the issue with flush_dcache_range(), it does not seem to be a cache issue.

writel() versus iowrite32() shouldn't matter.

ramasubramanian
Contributor II

We use kernel version 3.14.39.  Do you have any other ideas for ruling out a cache issue?

scottwood
NXP Employee

If flushing didn't make it go away, then it is probably not a cache issue.

ramasubramanian
Contributor II

Does the order of flushing matter?  Is it enough to flush before and after the DMA transfer?

scottwood
NXP Employee

You don't need to flush at all.  Investigate other potential sources of the problem.

ramasubramanian
Contributor II

OK.  I will try to set up dedicated buffers for TX and RX; currently we use the same buffer for both.  I will come back if I have any questions.

Thanks

Rams

ramasubramanian
Contributor II

Forgot to mention that the driver works fine on an embedded ARM platform with the same PCIe device, and also on a PC.  The corruption is seen only with the P2020, which is why we were looking into caching issues.
