i.mx8 PCIe Wi-Fi Suspend/Resume Fails

5,476 Views
davidkilp
Contributor IV

We are having a problem with suspend/resume of the PCIe interface on the i.MX8.

We are using the i.MX8M Quad EVK board. I have tested this with the on-board Wi-Fi/BT module based on the QCA6174A chipset, which uses a PCIe interface, and also with several Wi-Fi/BT M.2 modules from Intel and Sparklan, which also use the PCIe interface. Regardless of the module, it always fails to resume from sleep correctly.

Before suspend, PCIe Wi-Fi works fine on the on-board module and on all the different cards. I can scan, set up a Wi-Fi connection, etc. as expected.

If you suspend the board:

# echo mem > /sys/power/state

the device suspends correctly. It's the resume of the PCIe interface that seems to be a problem.

When you press the power button to resume from sleep I see the following type of errors. This is for the on-board Wi-Fi (QCA6174A):

[ 3423.382628] Enabling non-boot CPUs ...
[ 3423.383231] Detected VIPT I-cache on CPU1
[ 3423.383253] GICv3: CPU1: found redistributor 1 region 0:0x00000000388a0000
[ 3423.383287] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[ 3423.383742] CPU1 is up
[ 3423.384274] Detected VIPT I-cache on CPU2
[ 3423.384285] GICv3: CPU2: found redistributor 2 region 0:0x00000000388c0000
[ 3423.384302] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[ 3423.384573] CPU2 is up
[ 3423.385129] Detected VIPT I-cache on CPU3
[ 3423.385140] GICv3: CPU3: found redistributor 3 region 0:0x00000000388e0000
[ 3423.385156] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
[ 3423.385442] CPU3 is up
[ 3423.416677] imx6q-pcie 33800000.pcie: PCIe PLL locked after 0 us.
[ 3423.516720] imx6q-pcie 33800000.pcie: Link up
[ 3423.516726] imx6q-pcie 33800000.pcie: Link up
[ 3423.516730] imx6q-pcie 33800000.pcie: Link up, Gen1
[ 3423.539686] [drm] Mode: 1920x1080p148500
[ 3423.564657] [drm] Pixel clock: 148500 KHz, character clock: 148500, bpc is 8-bit.
[ 3423.564662] [drm] VCO frequency is 5940000 KHz
[ 3423.643311] [drm] Sink Not Support SCDC
[ 3423.645504] [drm] No vendor infoframe
[ 3423.681344] caam 30900000.crypto: registering rng-caam
[ 3423.925352] usb 2-1: reset SuperSpeed Gen 1 USB device number 2 using xhci-hcd
[ 3424.100267] usb 1-1: reset high-speed USB device number 2 using xhci-hcd
[ 3424.804248] usb 1-1.2: reset low-speed USB device number 3 using xhci-hcd
[ 3424.893130] ath10k_pci 0000:01:00.0: failed to receive control response completion, polling..
[ 3425.916609] ath10k_pci 0000:01:00.0: Service connect timeout
[ 3425.916614] ath10k_pci 0000:01:00.0: failed to connect htt (-110)
[ 3425.996193] ath10k_pci 0000:01:00.0: Could not init core: -110
[ 3425.996197] ------------[ cut here ]------------
[ 3425.996200] Hardware became unavailable upon resume. This could be a software issue prior to suspend or a hardware issue.
[ 3425.996270] WARNING: CPU: 0 PID: 694 at net/mac80211/util.c:2203 ieee80211_reconfig+0x88/0xc50
[ 3425.996273] Modules linked in: crct10dif_ce ath10k_pci ath10k_core ath gpio_ir_recv rc_core
[ 3425.996286] CPU: 0 PID: 694 Comm: kworker/u8:27 Not tainted 5.4.70-00114-g4f2631b022d8 #3
[ 3425.996288] Hardware name: NXP i.MX8MQ EVK (DT)
[ 3425.996297] Workqueue: events_unbound async_run_entry_fn
[ 3425.996302] pstate: 40000005 (nZcv daif -PAN -UAO)
[ 3425.996305] pc : ieee80211_reconfig+0x88/0xc50
[ 3425.996309] lr : ieee80211_reconfig+0x88/0xc50
[ 3425.996311] sp : ffff8000125f3c30
[ 3425.996312] x29: ffff8000125f3c30 x28: ffff800011a06000
[ 3425.996316] x27: 00000000ffffff92 x26: ffff0000b800c020
[ 3425.996319] x25: 0000000000000000 x24: 0000000000000001
[ 3425.996322] x23: 0000000000000000 x22: ffff0000b80a5400
[ 3425.996326] x21: ffff0000b50a0440 x20: 0000000000000000
[ 3425.996329] x19: ffff0000b50a07c0 x18: 0000000000000001
[ 3425.996332] x17: 0000000000000000 x16: 0000000000000000
[ 3425.996335] x15: ffff800011a21000 x14: 2065757373692065
[ 3425.996339] x13: 72617774666f7320 x12: 6120656220646c75
[ 3425.996342] x11: 6f63207369685420 x10: 2e656d7573657220
[ 3425.996345] x9 : 6e6f707520656c62 x8 : 0000000000000000
[ 3425.996349] x7 : 0000000000000000 x6 : 00000006a70ff658
[ 3425.996352] x5 : ffff0000bd97a180 x4 : 0000000000000001
[ 3425.996355] x3 : ffff0000bd97a180 x2 : 0000000000000007
[ 3425.996358] x1 : 2c1c577728deed00 x0 : 0000000000000000
[ 3425.996362] Call trace:
[ 3425.996366] ieee80211_reconfig+0x88/0xc50
[ 3425.996371] ieee80211_resume+0x30/0x68
[ 3425.996376] wiphy_resume+0x74/0x90
[ 3425.996382] dpm_run_callback.isra.0+0x38/0xd8
[ 3425.996385] device_resume+0x80/0x170
[ 3425.996388] async_resume+0x24/0x58
[ 3425.996391] async_run_entry_fn+0x40/0x140
[ 3425.996396] process_one_work+0x198/0x320
[ 3425.996399] worker_thread+0x48/0x420
[ 3425.996402] kthread+0x138/0x158
[ 3425.996407] ret_from_fork+0x10/0x1c
[ 3425.996410] ---[ end trace 8cd2446a6fad611f ]---

.

.

[ 3425.997242] PM: dpm_run_callback(): wiphy_resume+0x0/0x90 returns -110
[ 3425.997246] PM: Device phy0 failed to resume async: error -110
[ 3425.998553] PM: resume devices took 2.460 seconds
[ 3427.112519] OOM killer enabled.
[ 3427.115662] Restarting tasks ... done.
[ 3427.121859] hantro receive hot notification event: 0
[ 3427.126961] PM: suspend exit
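
For anyone comparing logs across cards or kernel versions, here is a minimal sketch, assuming a POSIX shell with grep, that pulls only the resume-failure lines out of a log like the one above. The function name find_resume_failures is made up for illustration, and the patterns are taken from the two "PM:" messages in this log, so other kernels may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel log on stdin and print only the lines
# that report a device failing to resume. Exit status is 0 when at least
# one such line is found and non-zero otherwise.
find_resume_failures() {
    grep -E 'PM: Device .* failed to resume|dpm_run_callback\(\): .* returns -[0-9]+'
}
```

It could be used as `dmesg | find_resume_failures` right after a resume; no output means neither message was logged.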

 

Other Wi-Fi M.2 cards produce similar errors. It seems that either the PCIe interface isn't coming out of sleep correctly or I have something configured wrong in the kernel.

I was testing this using the latest Linux L5.4.70_2.3.0_MX8MQ release but I've tried a number of earlier Linux and Android released images with the same result.

Is there a problem with PCIe suspend/resume operation in general on the i.MX8, or is there some kernel configuration that I'm missing?

Any help would be appreciated.

--David

 

11 Replies

3,738 Views
Kunyang_Fan
Contributor II

Hi,

It seems the problem is fixed in BSP kernel 5.15 by the change https://source.codeaurora.org/external/imx/linux-imx/commit/drivers/pci?h=lf-5.15.32-2.0.0&id=fdb997... . Can someone tell us why the MSI setting in "drivers/pci/controller/dwc/pcie-designware-host.c" needs to use PCIE_MSI_INTR0_MASK instead of PCIE_MSI_INTR0_ENABLE?

Thanks, 

Kunyang

3,804 Views
Kunyang_Fan
Contributor II

This issue does not seem to be a special case. We see the same problem, "PCIe Wi-Fi does not work after suspend/resume", on our boards using i.MX8MQ and i.MX8MP. The code bases we use are rel_imx_5.4.70_2.3.7 for i.MX8MQ and lf-5.10.72-2.2.0 for i.MX8MP. In kernel 5.10.72 we can see a similar commit, shown below. Because we use two PCIe ports, applying the same change for our Wi-Fi modules causes the Intel I210/I211 Ethernet on the other PCIe port to stop working. The problem does not happen with the old kernel version 4.14.98-y. Can anyone help with this?

commit e0026d0655f8540adb31569d459d260b2d10f26f
Author: Fugang Duan <fugang.duan@nxp.com>
Date: Mon Nov 4 13:48:52 2019 +0800

PCI: Disable MSI on marvel 88w9098 and 88w8997 chips

i.MX8x with MSI enable suspend/resume doesn't work for
marvell 88w9098 and 88w8997 wlan chips, disable the feature
before the issue fixed.

Signed-off-by: Fugang Duan <fugang.duan@nxp.com>

 

Thanks,

 


5,123 Views
jorge2
Contributor I

Hi David,

Did you get to the bottom of this? I'm also looking at getting the Intel AX200 PCIe M.2 to work properly on the IMX8QM EVK, because the NXP Q9098-based Wi-Fi 6 boards are not easily available.

Thanks!

Jorge

 

 


5,115 Views
davidkilp
Contributor IV

Jorge,

So I would say yes and no. I did find a work around which worked for me.

I'll list details here below for your information.

The i.MX8MQ supports 2 PCIe interfaces, and I was able to get various M.2 Wi-Fi/BT modules to work. However, there does seem to be an issue with PCIe suspend/resume not working correctly. In particular, for all Wi-Fi cards tested, resume would fail and the only recourse was to unload and reload the Wi-Fi kernel driver modules. That is not really acceptable for an Android-based product that goes to sleep a lot.

I have discovered that NXP worked around this for their own supported PCIe Wi-Fi modules by adding a PCI quirk (see quirk_disable_all_msi() in <kernel_imx>/drivers/pci/quirks.c) that disables MSI interrupts, so the driver falls back to a different method of signalling interrupts. This works OK for certain Wi-Fi cards I tested, such as the Intel 9260ac, although it reduces performance as measured with iperf. It does not work with all Wi-Fi cards, however: certain cards fail to even power up properly when MSI interrupts are disabled. This seems to be a particular problem with the Wi-Fi 6 cards I was testing, which included a Sparklan WNFB-265AXI(BT) module (Broadcom chipset) and the Intel AX200. These apparently require MSI interrupts to function at all.
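
To tell whether a given card actually ended up without MSI after the quirk, a minimal sketch along these lines may help. It assumes the standard sysfs layout in which each PCI device has an msi_irqs directory holding one entry per allocated MSI/MSI-X vector; the function name msi_in_use and the optional sysfs-root argument are hypothetical and only there for illustration and testing.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the PCI device has MSI/MSI-X vectors
# allocated, fail otherwise.
# $1: PCI address, e.g. 0000:01:00.0
# $2: sysfs root (optional, defaults to /sys; assumption, for testing)
msi_in_use() {
    dir="${2:-/sys}/bus/pci/devices/$1/msi_irqs"
    # A missing or empty directory means no MSI vectors are allocated,
    # i.e. the driver is using legacy interrupts or none at all.
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

For example, `msi_in_use 0000:01:00.0 && echo "MSI active"` checks the address that the ath10k_pci driver reports in the log earlier in this thread.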

The solution, at least for now, is simply to disable the PCIe suspend/resume operations in drivers/pci/controller/dwc/pci-imx6.c. This worked on earlier kernel versions (4.19.42, used in Android Q10.0.0_1.0.0-ga) and appears to work on version 5.4.24 (used in Android Q10.0.0_2.3.0-ga) if the pm_async setting is disabled:

# echo 0 > /sys/power/pm_async

This option, which is normally enabled, causes Linux to suspend and resume all the devices in parallel to save time. Disabling async suspend/resume can be used to check if the suspend/hibernate failure is caused by some unknown device dependency issue. This can also be used to tune driver suspend/resume latency.
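
If you only want to try a single suspend cycle with async disabled and then put the setting back, something like the following could work. It is a minimal sketch, assuming it runs as root and that /sys/power/pm_async and /sys/power/state behave as described above; the function name suspend_sync_once and the optional sysfs-root argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: suspend once with asynchronous device
# suspend/resume disabled, then restore the previous pm_async value.
# $1: sysfs root (optional, defaults to /sys; assumption, for testing)
suspend_sync_once() {
    power="${1:-/sys}/power"
    old=$(cat "$power/pm_async") || return 1
    echo 0 > "$power/pm_async" || return 1
    # On real hardware this write blocks until the system has resumed.
    echo mem > "$power/state"
    rc=$?
    # Put the original setting back whether or not the suspend succeeded.
    echo "$old" > "$power/pm_async"
    return "$rc"
}
```

Calling `suspend_sync_once` with no argument acts on the real /sys/power files.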

It's not clear how much extra power is used in suspend if the PCIe interface is left powered, as Android turns off the Wi-Fi interface before going to sleep anyway. It does allow the unit to at least function "normally".

This probably needs further testing.

I'm not actively working on this right now, but that's what I documented at the time. It's possible that there is a later kernel version that fixes the MSI interrupt and PCIe sleep so it works correctly.

Your mileage may vary as they say but you can certainly try my work around and see if that helps.

 

--David


5,096 Views
jorge2
Contributor I

Hi David,

I really appreciate this.

I'm going to work on it further, and with this I at least know where to start looking.

For my goals, I can even disable the PCIe suspend function for the M.2 board so Wi-Fi is always on, but I'll try to learn more as I dig into the code and try out a few module boards.

By any chance, do you know of any Wi-Fi 6 sample M.2 board that uses the NXP Q9098 or similar, which seems to be readily supported on the imx8mq, and that is actually available for purchase?

Thanks a ton!

 

Jorge

 


5,092 Views
davidkilp
Contributor IV

I'm afraid I don't know of any boards that use that particular chipset. I think that is fairly new? I know that we were investigating a Murata 1ZM module for a different product that used the 88W8987 but it's Wi-Fi 5 not 6 and uses SDIO interface. Looks like Murata is developing a module based on the 88W9098:

https://www.murata.com/en-us/products/connectivitymodule/wi-fi-bluetooth/overview/lineup/type1xl

but it's listed as "under development" so probably have to ask them directly as they might have samples. 

If you need the M.2 board format, the Intel AX200 is readily available, the drivers are already in the later 5.x kernels, and its performance on the i.MX8 platform was very good in my testing. Other than the suspend/resume issue it worked quite well.

 


5,095 Views
davidkilp
Contributor IV

Reading back through, in case it wasn't clear: you need to disable the PM suspend/resume operations in the PCI driver code so it does NOT suspend the PCIe interface. What I did was a complete hack like this:

$ git diff
diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index c655d9de144e..72e47fd1d529 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -2308,6 +2308,8 @@ static void imx6_pcie_pm_turnoff(struct imx6_pcie *imx6_pcie)

static int imx6_pcie_suspend_noirq(struct device *dev)
{
+printk("====> Skipping imx6_pcie suspend ops\n");
+return 0;
struct imx6_pcie *imx6_pcie = dev_get_drvdata(dev);

if (!(imx6_pcie->drvdata->flags & IMX6_PCIE_FLAG_SUPPORTS_SUSPEND))
@@ -2333,6 +2335,8 @@ static int imx6_pcie_suspend_noirq(struct device *dev)

static int imx6_pcie_resume_noirq(struct device *dev)
{
+printk("====> Skipping imx6_pcie resume ops\n");
+return 0;
int ret;
struct imx6_pcie *imx6_pcie = dev_get_drvdata(dev);
struct pcie_port *pp = &imx6_pcie->pci->pp;

 

You could also just comment out the .pm operations setup in the platform_driver as well which would accomplish the same thing.

static struct platform_driver imx6_pcie_driver = {
	.driver = {
		.name = "imx6q-pcie",
		.of_match_table = imx6_pcie_of_match,
		.suppress_bind_attrs = true,
		/* .pm = &imx6_pcie_pm_ops, */
		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
	},
	.probe    = imx6_pcie_probe,
	.shutdown = imx6_pcie_shutdown,
};

 

I put the printk messages in so I could see them on the console during suspend/resume operations.

 

5,046 Views
jorge2
Contributor I

Hi David,

I missed this message. I think this will help but I'm probably a few steps behind, still.

I got a few PCIe Type E AX200 modules, but building Android is another beast, and I'm struggling with the documentation, or lack of it, on how to turn on kernel support for the Intel M.2 Wi-Fi.

I'm on the 5.4 kernel with Android Q10.0.0_2.3.0-ga.

Could you help me out on how to make this happen? Or do you know of any doc that details the required steps to achieve this and load the driver?

Big thanks.

Jorge

 

 

 

 


5,417 Views
jimmychan
NXP TechSupport

According to the release notes for Android 10.0.0_2.5.0, the Wi-Fi module that was tested and is supported is the Azurewave CM276 (for i.MX8MQ). It seems the QCA6174A was not tested in the new release.


5,395 Views
davidkilp
Contributor IV

Yes, I understand. I just find it a bit strange that a later revision of the EVK is not supported while an earlier revision is. I can work around that, but I still have an issue with PCIe suspend/resume failure.

It seems that in kernel changes AFTER the Android 9 (Pie) release, the "fix" for getting suspend/resume to work for Wi-Fi on the PCIe interface (either on-board or via M.2 socket) is to disable the MSI interrupt capability on the PCI interface.

In the kernels for Android-Q10.0.0_1.0.0 (4.19.42) and Android-Q10.0.0_2.6.0 (5.4.70), code was added to drivers/pci/quirks.c to disable MSI for various Wi-Fi PCI IDs.

In particular in 5.4.70 release there are some changes with comments like this for drivers/pci/quirks.c:

MLK-24939 PCI: only disable MSI for i.MX8 dwc root port

Disable MSI only for i.MX8 DWC PCIe RC by connecting
with some PCIe devices. The patch just avoid to impact
other platforms that connect with the same EP devices.
Will remove the msi disable quirk once the issue is fixed.

Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>

I added the PCI IDs for the Intel 9260 M.2 module I was testing with to the quirks file and sure enough, while the driver complained about not having MSI capability, it still worked and suspend/resume worked correctly.
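
In case it helps anyone trying the same thing with a different card, the vendor and device IDs needed for a quirks entry can be read from sysfs. This is a minimal sketch, assuming the usual vendor and device attribute files under /sys/bus/pci/devices/<address>; the function name pci_ids and the optional sysfs-root argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<vendor> <device>" (hex, as exposed by sysfs)
# for one PCI device.
# $1: PCI address, e.g. 0000:01:00.0
# $2: sysfs root (optional, defaults to /sys; assumption, for testing)
pci_ids() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || return 1
    printf '%s %s\n' "$(cat "$dev/vendor")" "$(cat "$dev/device")"
}
```

Running `pci_ids 0000:01:00.0` prints the two values for the device at that address, which can then be compared against the IDs already listed in drivers/pci/quirks.c.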

I should note that doing this causes a noticeable decrease in Wi-Fi performance as measured with iperf (400 Mbits/sec vs 600 Mbits/sec).

While the nomsi "fix" works for the 9260 card, what I really want to get working is an Intel AX200 Wi-Fi 6 card. It works fine without the nomsi quirk in place, but it won't suspend/resume correctly. If I add this card's PCI ID to the quirks file as I did for the 9260 card, the AX200 card won't even start correctly. As the kernel driver for both of these cards is the same, the only difference is the hardware/firmware of the card itself. It's possible that the AX200 card won't work without MSI enabled.

Given that the earlier Android Pie kernel, 4.14.78, did not need any MSI quirks to function correctly, it seems that there is some regression in the PCI driver?

The comments in the kernel where the quirks were added to disable MSI seem to indicate that this might be removed once the issue is fixed. Is this being actively worked on, so I should just wait for a fix, or am I stuck? Is there a workaround for this issue? I don't mind being a tester...

5,434 Views
davidkilp
Contributor IV

I have some further information on this. I have the MCIMX8M-EVK, which I think is the Rev B4 version according to the label on the bottom. As mentioned, it has the on-board Wi-Fi/BT module using the QCA6174A.

When I install the Android 9 (P9.0.0-1.0.0-ga, 01/2019) release for this board, which I think it originally came with, suspend/resume works correctly. This is using the 4.14.78 kernel.

I also tried the Linux/Yocto release L4.14.78-1.0.0_ga, 01/2019, and it also supports Wi-Fi suspend/resume correctly. I also re-built the kernel for this release and turned on support for Intel M.2 Wi-Fi. With a 9260NGW M.2 module it works correctly and suspends/resumes correctly as well. In fact, I can have both the on-board QCA6174A and the Intel 9260 M.2 module connected to 2 different access points simultaneously, and suspend/resume works correctly for both.

From what I can tell from the official Android releases for the i.MX 8M Quad EVK boards, only Android P9.0.0-1.0.0-ga explicitly says it supports the Rev B3/B4 version. The Wi-Fi module changed in Rev B1 to the QCA6174A; the earlier boards (Rev A series) used a muRata module based on the Broadcom BCM4356/Cypress CYW4356 chipset.

It seems a bit strange that Rev B boards are not explicitly supported in later Android Q versions and only Rev A is supported. I guess I can re-build the later Android images for my Rev B4 board and change the setup for Wi-Fi, but it seems strange to go from supporting Rev B3/B4 in Android P back to the earlier Rev A boards for Android Q.

If I install a later Android Q version, such as android-10.0.0_2.5.0, the board boots and Android comes up, but Wi-Fi does not work.

--david