88W8997 module command timeout issue (interface PCIE+UART, Host: iMX8MQ) - reopen

12,663 Views
yao_feng
Contributor III

The FW crash issue still exists in Generic_PCIE-WLAN-UART-BT-8997-LNX_6_6_3-IMX8-16.92.21.p119.2-16.92.21.p119.2-MM6X16437.P3-GPL.

This follows up the previous case:

https://community.nxp.com/t5/Wireless-Connectivity/88W8997-module-command-timeout-issue-interface-PC...

which is reopened by this case.

@ArthurC
@Christine_Li
88W8997 

929 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Please see our log analysis below.

[SAE logs Observations]

We tried to replicate the customer scenario and observed the disconnection, but in our case the DUT reconnects automatically. The time-synced wpa_supplicant and dmesg logs below show the reconnection procedure.

From wpa_supplicant logs

DUT disconnected from the external_AP

 

1651289273.940694: mlan0: Deauthentication notification

1651289273.940705: mlan0: * reason 7 (CLASS3_FRAME_FROM_NONASSOC_STA)

1651289273.940716: mlan0: * address bc:0f:9a:70:1f:69

1651289273.940722: Deauthentication frame IE(s) - hexdump(len=0): [NULL]

1651289273.940734: mlan0: CTRL-EVENT-DISCONNECTED bssid=bc:0f:9a:70:1f:69 reason=7

 

Then it automatically tries to authenticate with the external_AP again

1651289275.083236: nl80211: Authentication request send successfully

1651289275.297708: nl80211: Authenticate event

1651289275.297720: mlan0: Event AUTH (10) received

 

Association request sent successfully

1651289275.476566: nl80211: Association request send successfully

1651289275.600792: nl80211: Associated with bc:0f:9a:70:1f:69

1651289275.600821: nl80211: Set drv->ssid based on scan res info to 'Dlink_2G'

 

Connected successfully with the external_AP

1651289276.563676: mlan0: CTRL-EVENT-CONNECTED - Connection to bc:0f:9a:70:1f:69 completed [id=0 id_str=]
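
As an illustration of how the reconnection time can be read off such a log, here is a minimal Python sketch. It assumes epoch-timestamped wpa_supplicant lines in the format shown above; the function name reconnect_gaps and the embedded sample are for illustration only and are not part of any NXP tool.

```python
import re
from decimal import Decimal

# Sample lines copied from the wpa_supplicant excerpt above.
LOG = """\
1651289273.940694: mlan0: Deauthentication notification
1651289273.940734: mlan0: CTRL-EVENT-DISCONNECTED bssid=bc:0f:9a:70:1f:69 reason=7
1651289275.083236: nl80211: Authentication request send successfully
1651289275.600792: nl80211: Associated with bc:0f:9a:70:1f:69
1651289276.563676: mlan0: CTRL-EVENT-CONNECTED - Connection to bc:0f:9a:70:1f:69 completed [id=0 id_str=]
"""

LINE_RE = re.compile(r"^(\d+\.\d+): (.*)$")


def reconnect_gaps(text):
    """Return the seconds between each DISCONNECTED event and the next CONNECTED event."""
    gaps = []
    down_since = None
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank lines and hexdump continuation lines
        stamp, msg = Decimal(m.group(1)), m.group(2)
        if "CTRL-EVENT-DISCONNECTED" in msg:
            down_since = stamp
        elif "CTRL-EVENT-CONNECTED" in msg and down_since is not None:
            gaps.append(stamp - down_since)
            down_since = None
    return gaps


for gap in reconnect_gaps(LOG):
    print(f"link was down for {gap} s")
```

Decimal is used instead of float so that the microsecond digits of the epoch timestamps are not rounded.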

 

 

From Dmesg logs

DUT disconnects from the external_AP

 

[20260.951205] wlan: HostMlme Disconnected: sub_type=12 bc:XX:XX:XX:1f:69

[20260.959918] HostMlme mlan0: Receive deauth/disassociate

 

It automatically sends authentication frames to reconnect to the external_AP

 

[20274.347553] wlan: HostMlme mlan0 send auth to bssid bc:XX:XX:XX:1f:69

[20274.358857] mlan0: 

[20274.358865] wlan: HostMlme Auth received from bc:XX:XX:XX:1f:69

 

Then it sends the association request to the external_AP

 

[20274.436525] wlan: HostMlme mlan0 send assoicate to bssid bc:XX:XX:XX:1f:69

 

DUT successfully connects to the external_AP

[20274.592698] wlan: HostMlme mlan0 Connected to bssid bc:XX:XX:XX:1f:69 successfully

 

[Customer logs Observations]

We have also analyzed your logs, and we can see the deauth:

 

[ 170.745041] wlan: Received disassociation request on wlan0, reason: 3

[ 170.751537] wlan: REASON: (Deauth) Sending STA is leaving (or has left) IBSS or ESS

 

But afterwards the DUT also reconnects successfully:

 

[ 171.686526] wlan: HostMlme wlan0 Connected to bssid bc:XX:XX:XX:03:1a successfully

 

In the sniffer logs I am not able to see the deauth/disassoc frames needed to map against the dmesg logs.

We require time-synced sniffer, dmesg and supplicant logs to map the disconnection sequence. This is necessary to expedite the debugging process.

Conclusion based on the analysis of our local logs and your logs:

  • No suspect identified on the DUT STA side.
  • The disconnection originates from the peer AP.
  • It might be an open-air issue.

Both I and our SAE experts also tried to reproduce the issue, but unfortunately it is becoming harder and harder to reproduce. I have reproduced it only once since yesterday, and unfortunately the sniffer capture from that run is invalid.

So if you can still reproduce it easily, please test in a shielded room so that you can capture a more complete sniffer log to match against continuously timestamped dmesg logs:

dmesg -T -w > dmesg.txt &
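
When only the raw seconds-since-boot timestamps are available (as in the dmesg excerpts in this thread), they can be shifted to wall-clock time so that they line up with sniffer timestamps. The Python sketch below shows the arithmetic only; the boot time used is a made-up value for illustration, and on the target it could be taken from the "btime" field of /proc/stat. Note that this conversion, like dmesg -T, can drift if the system has been suspended.

```python
import re
from datetime import datetime, timezone

DMESG_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def dmesg_to_epoch(line, boot_epoch):
    """Convert a '[seconds.micros] message' dmesg line to (epoch_seconds, message)."""
    m = DMESG_RE.match(line.strip())
    if m is None:
        return None
    return boot_epoch + float(m.group(1)), m.group(2)


# Hypothetical boot time in epoch seconds, for illustration only.
BOOT_EPOCH = 1_716_000_000

line = "[ 170.745041] wlan: Received disassociation request on wlan0, reason: 3"
epoch, msg = dmesg_to_epoch(line, BOOT_EPOCH)
print(datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat(), msg)
```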

 

Regards,

Christine.

 

952 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Thanks for the new logs.

I checked them and found that there is still no useful information; even the dmesg log is not complete.

You can use: 

dmesg -w > dmesg.txt &

This command follows the kernel log and keeps writing new messages to the dmesg.txt file.

Since I found no useful information, I am trying to reproduce the issue myself. After 1.5 hours I still have not reproduced it, which is strange. I am now retesting after rebooting my 8MQ-EVK.

Do you have a shielded room?

Would it be possible to test and capture sniffer logs there? I think that in an open office there is too much interference, so Wireshark will miss some packets.

If you can retest and provide full dmesg and sniffer logs, it would be very much appreciated.

Meanwhile, I have asked our internal team to double-check your newly shared logs in case I missed anything.

My test is continuing.

 

Best regards,

Christine.


927 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

The sniffer and dmesg logs covering the test from 88W8997 driver initialization until the iperf3 disconnection occurred are attached.

We don't know why we couldn't capture the deauth frame. Maybe it is caused by an issue in the 88W8997 firmware related to the 2.4 GHz Wi-Fi connection.

The sniffer and dmesg logs (over 25 MB) are available at the following link:

https://drive.google.com/drive/folders/1uh8yCNiy9h74U1jOUAJxSxZqvTkvan61?usp=sharing

 

941 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Thank you for sharing the logs.

In the dmesg logs we can observe the deauth after the connection was established:

=============

[ 139.340997] wlan: HostMlme wlan0 Connected to bssid bc:XX:XX:XX:03:1a successfully

[ 165.306659] wlan: Received disassociation request on wlan0, reason: 3

[ 165.313144] wlan: REASON: (Deauth) Sending STA is leaving (or has left) IBSS or ESS

[ 165.320869] Deauth: bc:XX:XX:XX:03:1a

 ============

After the deauth, the DUT tries to reconnect to the external_AP and connects successfully:

 =================

[ 167.199926] #

[ 167.229106] Find bssid bc:XX:XX:XX:03:1a

[ 167.233116] wlan0:

[ 167.328622] wlan: HostMlme wlan0 send auth to bssid bc:XX:XX:XX:03:1a

[ 167.337045] wlan0:

[ 167.337062] wlan: HostMlme Auth received from bc:XX:XX:XX:03:1a

[ 167.345376] HostMlme wlan0: Received auth frame type = 0x0

[ 167.414845] wlan: HostMlme wlan0 send assoicate to bssid bc:XX:XX:XX:03:1a

[ 167.556802] wlan: HostMlme wlan0 Connected to bssid bc:XX:XX:XX:03:1a successfully

====================
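
The intervals implied by these excerpts (how long the link stayed up, and how long the reconnection took) can be computed mechanically. This is a minimal Python sketch that assumes the message texts shown above; the function names are illustrative and the matching strings would need adjusting for other driver versions.

```python
import re

# Lines copied from the dmesg excerpts above.
DMESG = """\
[ 139.340997] wlan: HostMlme wlan0 Connected to bssid bc:XX:XX:XX:03:1a successfully
[ 165.306659] wlan: Received disassociation request on wlan0, reason: 3
[ 165.313144] wlan: REASON: (Deauth) Sending STA is leaving (or has left) IBSS or ESS
[ 167.328622] wlan: HostMlme wlan0 send auth to bssid bc:XX:XX:XX:03:1a
[ 167.556802] wlan: HostMlme wlan0 Connected to bssid bc:XX:XX:XX:03:1a successfully
"""

LINE_RE = re.compile(r"^\[\s*(\d+)\.(\d{6})\]\s*(.*)$")


def link_events(text):
    """Yield (microseconds_since_boot, 'up' or 'down') for connect and disassociation lines."""
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        micros = int(m.group(1)) * 1_000_000 + int(m.group(2))
        msg = m.group(3)
        if "Connected to bssid" in msg:
            yield micros, "up"
        elif "Received disassociation request" in msg:
            yield micros, "down"


def summarize(text):
    """Return (link_up_durations, reconnect_durations), both in seconds."""
    up_for, down_for = [], []
    last_time, last_state = None, None
    for micros, state in link_events(text):
        if last_state == "up" and state == "down":
            up_for.append((micros - last_time) / 1_000_000)
        elif last_state == "down" and state == "up":
            down_for.append((micros - last_time) / 1_000_000)
        last_time, last_state = micros, state
    return up_for, down_for


up_for, down_for = summarize(DMESG)
print("link up for (s):", up_for)
print("reconnect took (s):", down_for)
```

For the excerpt above this gives roughly 26 s of link uptime before the disassociation and about 2.25 s until the link is reported connected again.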

But in the sniffer capture I am not able to see the assoc or deauth frames.

We suggest that you share time-synced dmesg and sniffer logs again so that we can identify the exact root cause of the issue.

We also suggest starting the sniffer at the beginning of the connection establishment procedure.

On my side, I also checked the dmesg and sniffer logs from my own reproduction; unfortunately they did not include any useful information either.

So I will also retest and try to capture more logs for our internal team. I hope we can test in parallel so that we can compare our logs and gather more information to find the root cause.

 

Best regards,

Christine.


989 Views
ArthurC
Contributor III

Hello @Christine_Li ,

 

The sniffer capture logs are attached.

The capture runs from the start of Bluetooth audio playback and iperf3 traffic to the target device (connected through the Netgear router AP) until iperf3 stopped because the network connection was broken.

It is the same issue as in the previous test.

The dmesg logs were captured with drvdbg=0x80037.

 

 

 

950 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

You can plug the Netgear A6210 into another PC and capture the sniffer logs there.

That PC does not need to be connected to any Wi-Fi access point.

The Netgear A6210 should be able to capture all frames over the air; it does not filter on any WLAN address unless you set a filter.

Also, for your information, I reproduced the broken connection issue, but I did not reproduce the system hang after reconnection. I have provided my logs to our internal team. If you can also provide yours, it would be appreciated, so that we can compare them and see whether it is the same issue.

Best regards,

Christine.

954 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Update status on my side:

I tested with the i.MX8MQ-EVK and 88W8997 (PCIE-UART CM276MA module) on Linux kernel 5.15.71 with our latest Q1-2024 Wi-Fi driver and FW.

I think I reproduced the broken Wi-Fi connection issue.

But I cannot reproduce the issue where reconnecting causes the i.MX8MQ system to hang.

I will discuss with our internal team and try to capture sniffer logs on my side.

Best regards,

Christine.


940 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

We set up Wireshark with the NETGEAR A6210 as shown below:

[Screenshots: ArthurC_0-1716269922205.png, ArthurC_1-1716270152044.png]

But we cannot capture the traffic of the target 88W8997 device (IP: 133.33.33.15), which is running iperf3 with another device on the same AP's local network.

Could you explain how to monitor the traffic of all devices on the AP's local network?

 

 

 


980 Views
ArthurC
Contributor III

Hello @Christine_Li ,

 

Yes, sure.

I am available at 15:00 on 05/22 and 05/24 this week.

I am also preparing the Wi-Fi dongle for sniffer log capture.

962 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Could we have a live debug call together with our internal team?

That way we can better understand each other's test environments, and we may also be able to help with capturing sniffer logs.

From our side, we sincerely hope to track and resolve the issue ASAP.

If that is possible, please let me know when you are available. Our expert team is in India, so an afternoon slot (UTC+08:00) is suggested.

 

Best regards,

Christine.

1,084 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Please check whether your Windows PC supports monitor mode. If it does not, you will need a separate Wi-Fi sniffer card such as the Netgear A6210 Wi-Fi adapter or a similar device.

If you have another Linux PC, you can also check whether its Wi-Fi card supports monitor mode.

I have attached my written guide on capturing Wi-Fi sniffer logs on both Windows and Linux PCs for your reference.

Please see the attachment.

 

Best regards,

Christine.


1,076 Views
ArthurC
Contributor III

Hello @Christine_Li ,

 

Could you provide the setup procedure or a guide for Wireshark or other software to capture the required sniffer logs?

Is another device needed for sniffing?

 

Thank you

1,068 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Thanks for your quick reply.

Disabling MSI is only a test; we suggested it because the original logs show:

mwifiex_pcie 0000:01:00.0: no quirks enabled

If this test is not feasible, you can ignore it. We will continue tracking the issue with our internal team.

Could you please restart the test after rebooting the board and provide the logs requested below?

Please provide dmesg logs, dumps and sniffer logs captured with the following parameters:

  • drvdbg=0x80037
  • auto_fw_reload=0
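
For reference, the drvdbg value is a bit mask. The Python sketch below only shows which bit positions are set in the two masks quoted in this thread (0x80037 and 0xa0037) and formats the parameters as name=value pairs. The helper names are hypothetical, and what each bit enables is defined by the driver release and is not stated here.

```python
def set_bits(mask):
    """Return the positions of the bits that are set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def module_options(**params):
    """Format keyword arguments as space-separated 'name=value' pairs."""
    return " ".join(f"{name}={value}" for name, value in params.items())


requested = set_bits(0x80037)   # mask requested for this test
other = set_bits(0xA0037)       # mask mentioned elsewhere in the thread

print("0x80037 sets bits:", requested)
print("0xa0037 sets bits:", other)
print("only in 0xa0037:", sorted(set(other) - set(requested)))
print(module_options(drvdbg=hex(0x80037), auto_fw_reload=0))
```

How the resulting string is passed to the driver (on the load command line or through a parameter file) depends on the release package, so treat the last line as a formatting aid only.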

The dmesg logs alone do not tell us why the connection is established successfully but communication stops after some time, so we ask you to provide sniffer logs showing what happened between the STA (88W8997) and the AP.

Best regards,

Christine.


1,017 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

In the test for the last shared logs (drvdbg 80037), our system did not hang and is still alive now. Bluetooth is also alive and playing audio via A2DP. But the Wi-Fi connection is broken, and if we try to reconnect Wi-Fi, the 88W8997 will hang the system....

MSI is the PCIe interrupt mechanism. If we disable it, PCIe functionality will be broken. Why is the 88W8997 incompatible with MSI? Are there any error logs related to MSI?

970 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

In the last shared logs with drvdbg 80037, the connection was established successfully but after some time no communication is seen; maybe your device hung and was not able to capture the logs.

In the last shared logs [wificrashed_info.7z] without drvdbg we can see the disconnection. Could you share a sniffer capture for it? That would help us map it against the dmesg logs.

Also, with drvdbg=0x80037, please disable auto_fw_reload through the driver load parameter “auto_fw_reload=0” so that a proper dump can be captured.

So we have two requests:

1. Please load the driver with drvdbg=0x80037 and auto_fw_reload=0, run the test, and provide dmesg and dump logs as you did before.

2. At the same time, please capture sniffer logs together with time-synced dmesg logs.

Below is another test; please run it separately from the test above.

Please disable MSI in U-Boot, reset your 8MQ board, and then start your test to see whether the issue can still be reproduced.

To disable MSI:

u-boot // Enter U-Boot mode by pressing any key during the boot countdown.

You can check the default mmcargs with the command below:

print mmcargs

Christine_Li_0-1715926263341.png

 

Then append the parameter that disables MSI with the commands below:

setenv mmcargs setenv bootargs console=${console} root=${mmcroot} pci=nomsi

saveenv

print mmcargs // Check that your change took effect.

reset

Christine_Li_1-1715926299787.png
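
After the board has rebooted into Linux, one way to confirm that the parameter reached the kernel is to look for pci=nomsi on the kernel command line, which Linux exposes in /proc/cmdline. A minimal Python sketch follows; the helper names are illustrative and the sample string uses placeholder console and root values that are not taken from this thread.

```python
from pathlib import Path


def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def msi_disabled(cmdline):
    """True if 'nomsi' is among the comma-separated options of the pci= parameter."""
    value = kernel_params(cmdline).get("pci")
    return value is not None and "nomsi" in value.split(",")


# Placeholder command line for illustration only.
sample = "console=ttymxc0,115200 root=/dev/mmcblk1p2 rootwait rw pci=nomsi"
print(msi_disabled(sample))

proc_cmdline = Path("/proc/cmdline")
if proc_cmdline.exists():  # only present on Linux systems
    print(msi_disabled(proc_cmdline.read_text()))
```

This simple splitter ignores quoted values containing spaces and keeps only the last pci= entry if the parameter is repeated, which is enough for a quick check.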

 

Thanks,

Christine.

995 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

I am very sorry about the bad experience, and I sincerely apologize.

We are making every effort to analyze each set of logs you provide. From the previously provided logs we concluded that this is a different issue from the earlier one. But even if it is not the same issue, the result is a bad experience for you, and we are sorry for that. We never give up on analyzing each set of logs to look for hints.

I know that asking you to create a new case for an issue with similar symptoms is also not a good experience, but tracking one issue per case is better for future reference. Sometimes we need to search for similar issues when other customers report the same problem. If creating a new case takes too much of your time, I can certainly create it for you. Please believe that we sincerely listen to our customers' voices in order to improve.

Again, I apologize for the experience.

We will analyze current logs and reply to you ASAP.

Best regards,

Christine.


992 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

What we need is a working, stable Wi-Fi and Bluetooth module over PCIe and UART.

After all this hard work, it looks like we are only getting wordplay...

Obviously, the 88W8997 does not meet our requirements.

If NXP sincerely wants to fix the issues and make the product better, please keep working on a fix.

We have told you before that we are using Network Management V1.36.2 for Wi-Fi, not wpa_supplicant.

The dmesg logs captured with drvdbg=0x80037 while the Wi-Fi connection broke during the test are attached.

 

 

858 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

I am attaching our internal team's detailed analysis of the logs shared so far with the latest release.

Conclusion:

  • The reported issue (the 0x107 timeout) is not seen.
  • Scanning works properly. This looks like a different issue.
  • Once the exact issue is identified, we will close this case and open a new one for better tracking.

Please capture logs with drvdbg=0x80037, and also share a sniffer capture and the details below:

  • wpa_supplicant version and configurations

Best regards,

Christine.


880 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

We used the same test method.

When the Wi-Fi connection broke, we ran ' echo "debug_dump" >/proc/mwlan/adapter0/config ', and the module caused the system to hang.

So we could not record the fw_dump logs.

It looks like "Timeout cmd id (250.844023) = 0x107" has not been solved.

See the attached photo.

 

ArthurC_1-1715681248200.png

 

 

873 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Thanks for your reply.

I checked the logs you provided, and this seems to be a different issue from before. There is neither a FW dump nor "Timeout cmd id (250.844023) = 0x107", but there seems to be a PCIe error: in drv_dump I saw some PCIe register dumps.

We will discuss this issue internally.

Meanwhile, please double-check that the test method is the same as your previous one. We may also ask you to test again; I just want to let you know in advance. Please wait until I have confirmed with our internal team, in case other logs are needed.

Below is for your information:

If "Timeout cmd id (250.844023) = 0x107" or any firmware crash occurs while running without drvdbg=0xa0037, you can capture a manual dump with the commands below before rebooting the system.

echo "debug_dump" >/proc/mwlan/adapter0/config

cat /proc/mwlan/adapter0/drv_dump > file_drv_dump

cat /proc/mwlan/adapter0/fw_dump > file_fw_dump
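
To decide whether a saved dmesg log contains the command timeout at all, the log can be scanned for the message quoted above. This is a minimal Python sketch assuming the message format "Timeout cmd id (<time>) = <hex id>" as it appears in this thread; the function name and the surrounding sample lines are illustrative.

```python
import re

TIMEOUT_RE = re.compile(r"Timeout cmd id \((\d+\.\d+)\) = (0x[0-9a-fA-F]+)")


def find_cmd_timeouts(text):
    """Return a list of (timestamp_string, command_id) for each command-timeout message."""
    return [(m.group(1), int(m.group(2), 16)) for m in TIMEOUT_RE.finditer(text)]


# The timeout message is quoted from the thread; the bracketed prefixes and the
# second line are placeholders for illustration.
sample = """\
[  250.850000] Timeout cmd id (250.844023) = 0x107
[  251.000000] some unrelated driver message
"""

hits = find_cmd_timeouts(sample)
for stamp, cmd_id in hits:
    print(f"command {cmd_id:#x} timed out at {stamp}")
if not hits:
    print("no command timeout found")
```

An empty result suggests that the log shows a different failure from the 0x107 timeout, which is the distinction being discussed in this thread.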

 

Best regards,

Christine.

 


868 Views
ArthurC
Contributor III

Hello @Christine_Li ,

 

When we run iperf3 over 2.4G Wi-Fi and Bluetooth A2DP audio playback at the same time, the 2.4G Wi-Fi stays alive for only about 400 ms before the connection breaks.

We captured fw_dump, drv_dump and dmesg; they are attached.

Then we tried to disconnect from the AP and reconnect.

The module caused the system to hang.....

Please help fix the critical issues of the whole-system hang and the connection crash.

 
