Hello, community!
In the meantime, I was contacted by an NXP FAE, a colleague of @AldoG, but unfortunately we have not received much support so far.
The same hardware shows errors with kernel 6.x and _not_ with kernel 5.10, so we are confident that the issue can be solved in software.
We started reverse-engineering the driver, hoping to find a workaround.
So far we have not succeeded, but we found that the errors disappear when we disable the PCIe lane that is normally active alongside the SATA lane (we have `fsl,hsio-cfg = "pciea-pcieb-sata"` and use the pcieb lane along with SATA).
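For anyone who wants to reproduce this test, a minimal devicetree sketch of such a configuration could look like the fragment below. The node labels `&hsio_phy`, `&pcieb` and `&sata` are assumptions and may be named differently in your SoC `.dtsi`. Disabling the controller node is only one possible way of taking the lane out of use.

```dts
/*
 * Board-level override, sketch only.
 * The labels hsio_phy, pcieb and sata are assumed; check your SoC .dtsi.
 */
&hsio_phy {
	/* Lane configuration left unchanged */
	fsl,hsio-cfg = "pciea-pcieb-sata";
	status = "okay";
};

&pcieb {
	/* PCIe controller on the second lane is not probed */
	status = "disabled";
};

&sata {
	status = "okay";
};
```

With this override the PCIe controller driver never binds to the pcieb node, so neither its PHY accesses nor any PCIe traffic on that lane can occur. The test therefore cannot tell the two hypotheses apart on its own.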
Our conclusion is that there seems to be an interference:
- either from the PCIe driver `pci-imx6.c` accessing one of the three PHYs and thus corrupting the SATA line calibration
- or from the PCIe traffic somehow influencing the SATA signal integrity
We initially suspected a race condition between the two drivers accessing the PHY registers (`phy-fsl-imx8qm-hsio.c`) simultaneously. This was not confirmed: the errors persisted even after we registered the PCIe driver via `late_initcall()`, so that it starts only after SATA calibration has completed.
It would be great to hear an opinion from NXP, who authored the drivers and should have a better idea of what causes the errors.
Best regards
Olivaw