88W8997 module command timeout issue (interface PCIE+UART, Host: iMX8MQ) - reopen

11,184 Views
yao_feng
Contributor III

The FW crash issue still exists in Generic_PCIE-WLAN-UART-BT-8997-LNX_6_6_3-IMX8-16.92.21.p119.2-16.92.21.p119.2-MM6X16437.P3-GPL.

This follows up on the previous case:

https://community.nxp.com/t5/Wireless-Connectivity/88W8997-module-command-timeout-issue-interface-PC...

which this case reopens.

@ArthurC
@Christine_Li
88W8997 

154 Replies

854 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

I know this issue has troubled you a great deal. The scenario is similar to the earlier case, but the logs give different hints. We are still actively trying to reproduce it and capture logs; after one week of stress testing during working hours, we have unfortunately not been able to reproduce it. But we will not give up.

Could you please cooperate and provide the requested verification results and logs?

By the way, I have a suggestion: would it be possible to send us one of your boards (configured and set up, ready for test)? Once we have reproduced the issue and found a solution, we will of course send it back to you.

Our only goal is to find the root cause and fix the issue.

Sincerely looking forward to your reply.

 

Best regards,

Christine.

933 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Another request: to debug further, we need you to try the combination below once at your end, to identify whether the issue is in the driver or in the FW:

Driver version: MM6X16437.P3-GPL

FW version: 16.92.21.p255 (Debug Test FW of DEC/2023) from previous post (https://community.nxp.com/t5/Wireless-Connectivity/88W8997-module-command-timeout-issue-interface-PC...)

Comparing the older and newer versions will help isolate the issue, and we can debug further based on your feedback on this combination.

Please capture the dmesg log, the dump, and a sniffer trace.

Thank you so much.

 

Best regards,

Christine.

872 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

We sincerely regret the inconvenience, but it is very difficult to debug without in-sync sniffer logs. We are actively working with our internal team to identify the root cause of the issue.

Could we have the in-sync nmcli/dmesg/sniffer logs, and also your Yocto image for the i.MX8MQ with your ported changes, so that we can reproduce locally?

You can send them to my private email rather than sharing them on the public community, to avoid leaking information.

Much appreciated!

Best regards,

Christine.

887 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

1. If possible, could you please share the detailed steps used on your side, so that we can run exactly the same test and make sure we can reproduce it locally? Ideally, the test steps would be based on our 8MQ-EVK.

-> No, unfortunately. We have already spent too many resources on this and cannot work on it these days.

2. Could you please check with another AP to see whether the issue reproduces? From both my side and our SAE side, it might be related to the test environment.

-> No. The previous version of our product used Broadcom's BCM4356 solution, and no issue occurred with the same environment and commands.

We test in the office only. Is it too weak for use in an ordinary environment?

3. Are you also testing with a dynamic IP address assigned by the AP with the udhcpc command, or are you using a static IP address? Could you please share your command?

-> We use only a dynamic IP address assigned by the AP.

We ported NetworkManager into our image to replace wpa-supplicant.

Our command:

nmcli dev wifi connect <AP SSID> password <AP password> ifname wlan0

For reference, please see the following link:

NetworkManager - ArchWiki (archlinux.org)

 

Once the NetworkManager command succeeds, the AP assigns the dynamic IP address automatically.
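For anyone scripting this connect step in a stress test, a minimal sketch of building the same nmcli invocation as an argument list is shown here. The SSID and password are hypothetical placeholders, and the helper name `nmcli_connect_argv` is an illustration only, not part of NetworkManager.

```python
import shlex


def nmcli_connect_argv(ssid, password, ifname="wlan0"):
    """Build the argv list for the nmcli connect command quoted in this post."""
    return ["nmcli", "dev", "wifi", "connect", ssid,
            "password", password, "ifname", ifname]


# Hypothetical credentials, for illustration only.
argv = nmcli_connect_argv("MyAP", "my secret")

# shlex.join quotes arguments that contain spaces, so the printed line
# can be pasted into a shell as-is.
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv)` avoids shell quoting problems when the SSID or password contains spaces.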

 

When the issue occurs, we run the udhcpc command and it does not work.

Our IP address is still present, but the connection is broken.

878 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Firstly, I am very sorry about the bad experience. I know you have invested a lot of time and patience in this issue. Thank you so much for your efforts.

Secondly, we have not given up on finding the root cause and resolving this issue. Our SAE team has also tried to reproduce it locally, but unfortunately could not, even after testing for 12 hours continuously. On my side, with the same test steps and test environment, I do not know why I cannot reproduce it now. Last Friday I tried for a whole day without success, and I am still trying. Finding the root cause and resolving this issue needs more effort. I hope we can work together and move forward.

1. If possible, could you please share the detailed steps used on your side, so that we can run exactly the same test and make sure we can reproduce it locally? Ideally, the test steps would be based on our 8MQ-EVK.

2. Could you please check with another AP to see whether the issue reproduces? From both my side and our SAE side, it might be related to the test environment.

3. Are you also testing with a dynamic IP address assigned by the AP with the udhcpc command, or are you using a static IP address? Could you please share your command?

Best regards,

Christine.

888 Views
ArthurC
Contributor III

Hello @Christine_Li ,

 

We think the goal should be to find out why the 88W8997 firmware causes sudden, unexpected disconnects and reconnects.

Why does this not occur with other devices?

We have run many tests for this, but we have had no constructive response from the SAE team...

If the 88W8997 is not compatible with the i.MX8MQ, we will never use it.

 

850 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Please see our log analysis below.

[SAE logs Observations]

We tried to replicate your scenario and observed the disconnection, but in our case the DUT reconnects automatically. Below are the in-sync supplicant and dmesg logs, which show the reconnection procedure.

From wpa_supplicant logs

DUT disconnected from the external_AP

 

1651289273.940694: mlan0: Deauthentication notification

1651289273.940705: mlan0: * reason 7 (CLASS3_FRAME_FROM_NONASSOC_STA)

1651289273.940716: mlan0: * address bc:0f:9a:70:1f:69

1651289273.940722: Deauthentication frame IE(s) - hexdump(len=0): [NULL]

1651289273.940734: mlan0: CTRL-EVENT-DISCONNECTED bssid=bc:0f:9a:70:1f:69 reason=7

 

The DUT then automatically tries to authenticate with the external_AP again

1651289275.083236: nl80211: Authentication request send successfully

1651289275.297708: nl80211: Authenticate event

1651289275.297720: mlan0: Event AUTH (10) received

 

Association request sent successfully

1651289275.476566: nl80211: Association request send successfully

1651289275.600792: nl80211: Associated with bc:0f:9a:70:1f:69

1651289275.600821: nl80211: Set drv->ssid based on scan res info to 'Dlink_2G'

 

Connected successfully with the external_AP

1651289276.563676: mlan0: CTRL-EVENT-CONNECTED - Connection to bc:0f:9a:70:1f:69 completed [id=0 id_str=]
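To put a number on the reconnection shown in these supplicant lines, a minimal sketch is given here. It assumes the log format is exactly as quoted above (epoch timestamp, colon, interface, event); the function name `reconnect_gaps` is hypothetical. `Decimal` is used so the microsecond digits survive the subtraction.

```python
import re
from decimal import Decimal

# Lines copied from the wpa_supplicant excerpt above.
LOG = """\
1651289273.940734: mlan0: CTRL-EVENT-DISCONNECTED bssid=bc:0f:9a:70:1f:69 reason=7
1651289275.083236: nl80211: Authentication request send successfully
1651289276.563676: mlan0: CTRL-EVENT-CONNECTED - Connection to bc:0f:9a:70:1f:69 completed [id=0 id_str=]
"""

EVENT_RE = re.compile(r"^(\d+\.\d+): \S+: (CTRL-EVENT-(?:DIS)?CONNECTED)\b")


def reconnect_gaps(text):
    """Return seconds between each DISCONNECTED event and the next CONNECTED."""
    gaps, down_at = [], None
    for line in text.splitlines():
        m = EVENT_RE.match(line.strip())
        if not m:
            continue  # not a connect/disconnect event
        ts, event = Decimal(m.group(1)), m.group(2)
        if event == "CTRL-EVENT-DISCONNECTED":
            down_at = ts
        elif down_at is not None:
            gaps.append(ts - down_at)
            down_at = None
    return gaps


print(reconnect_gaps(LOG))
```

For the excerpt above this yields a single gap of 2.622942 seconds between the disconnect and the completed reconnection.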

 

 

From dmesg logs

DUT disconnects from the external_AP

 

[20260.951205] wlan: HostMlme Disconnected: sub_type=12 bc:XX:XX:XX:1f:69

[20260.959918] HostMlme mlan0: Receive deauth/disassociate

 

It automatically sends authentication frames to reconnect to the external_AP

 

[20274.347553] wlan: HostMlme mlan0 send auth to bssid bc:XX:XX:XX:1f:69

[20274.358857] mlan0: 

[20274.358865] wlan: HostMlme Auth received from bc:XX:XX:XX:1f:69

 

Then it sends the association request to the external_AP

 

[20274.436525] wlan: HostMlme mlan0 send assoicate to bssid bc:XX:XX:XX:1f:69

 

DUT successfully connects to the external_AP

[20274.592698] wlan: HostMlme mlan0 Connected to bssid bc:XX:XX:XX:1f:69 successfully
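The dmesg sequence above can be turned into a small timeline, which makes it easier to compare one capture against another. This is only a sketch: the substrings are taken from the driver messages quoted in this post (including the driver's own spelling "assoicate"), and the phase labels and the function name `phase_timeline` are hypothetical.

```python
import re
from decimal import Decimal

# Lines copied from the dmesg excerpt above.
DMESG = """\
[20260.951205] wlan: HostMlme Disconnected: sub_type=12 bc:XX:XX:XX:1f:69
[20260.959918] HostMlme mlan0: Receive deauth/disassociate
[20274.347553] wlan: HostMlme mlan0 send auth to bssid bc:XX:XX:XX:1f:69
[20274.436525] wlan: HostMlme mlan0 send assoicate to bssid bc:XX:XX:XX:1f:69
[20274.592698] wlan: HostMlme mlan0 Connected to bssid bc:XX:XX:XX:1f:69 successfully
"""

# "[ 170.745041] msg" and "[20260.951205] msg" both match.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

# (substring in the driver message, label used in this sketch)
PHASES = [
    ("Disconnected", "disconnect"),
    ("send auth", "auth sent"),
    ("send assoicate", "assoc sent"),
    ("Connected to bssid", "connected"),
]


def phase_timeline(text):
    """Return (phase, seconds since the disconnect) for each recognised line."""
    out, t0 = [], None
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts, msg = Decimal(m.group(1)), m.group(2)
        for needle, phase in PHASES:
            if needle in msg:
                if phase == "disconnect":
                    t0 = ts  # restart the clock at every disconnect
                if t0 is not None:
                    out.append((phase, ts - t0))
                break
    return out


for phase, offset in phase_timeline(DMESG):
    print(f"{offset:>10} s  {phase}")
```

The same function can be run on a customer dmesg file to see whether the gap between the disconnect and the first auth attempt differs between setups.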

 

[Customer logs Observations]

We have also analyzed your logs, and we can see the deauth:

 

[ 170.745041] wlan: Received disassociation request on wlan0, reason: 3

[ 170.751537] wlan: REASON: (Deauth) Sending STA is leaving (or has left) IBSS or ESS

 

But afterwards the DUT also reconnects successfully:

 

[ 171.686526] wlan: HostMlme wlan0 Connected to bssid bc:XX:XX:XX:03:1a successfully
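The two captures report different reason codes (7 in the SAE supplicant log, 3 in the customer dmesg log). A minimal sketch for pulling the code out of either log format is shown here. The table deliberately contains only the two codes seen in this thread, with the wording the logs themselves print; the function name `describe_reason` is hypothetical.

```python
import re

# Only the reason codes that appear in this thread, described with the
# text printed by the logs above.
REASONS = {
    3: "Sending STA is leaving (or has left) IBSS or ESS",
    7: "CLASS3_FRAME_FROM_NONASSOC_STA",
}

# Matches "reason: 3" (dmesg), "reason=7" and "reason 7" (wpa_supplicant).
REASON_RE = re.compile(r"\breason(?:=|:?\s+)(\d+)")


def describe_reason(line):
    """Return (code, description) for a log line, or None if it has no reason code."""
    m = REASON_RE.search(line)
    if not m:
        return None
    code = int(m.group(1))
    return code, REASONS.get(code, "not listed in this sketch")


print(describe_reason("[ 170.745041] wlan: Received disassociation request on wlan0, reason: 3"))
print(describe_reason("1651289273.940734: mlan0: CTRL-EVENT-DISCONNECTED bssid=bc:0f:9a:70:1f:69 reason=7"))
```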

 

In the sniffer logs, I cannot find the deauth/disassoc frames that would map to the dmesg logs.

 

We need in-sync sniffer, dmesg and supplicant logs to map the disconnection sequence. This is necessary to expedite the debugging process.

 

Conclusion based on local and your logs analysis:

  • No suspect has been identified on the DUT STA side.
  • The disconnection originates from the peer AP.
  • It might be an open-air issue.

Both I and our SAE experts have also tried to reproduce it, but unfortunately it is becoming harder and harder to reproduce. I have reproduced it only once since yesterday, and unfortunately the sniffer capture from that run was invalid.

So if you can still reproduce it easily, please test in a shielded room, so that a fuller sniffer capture can be matched against continuous, timestamped dmesg logs:

dmesg -T -w > dmesg.txt &
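When only the default seconds-since-boot timestamps are available, they can be mapped to wall-clock time by adding the boot time, which is roughly what `dmesg -T` does. A minimal sketch follows. The boot epoch is a hypothetical value (on Linux it can be read from the `btime` line of `/proc/stat`), and both function names are illustrations. Note that this mapping can drift if the board has been suspended and resumed.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical boot time in seconds since the Unix epoch.
BOOT_EPOCH = 1716271200


def dmesg_to_wallclock(boot_epoch, dmesg_seconds):
    """Convert a dmesg seconds-since-boot stamp to an aware UTC datetime."""
    boot = datetime.fromtimestamp(boot_epoch, tz=timezone.utc)
    return boot + timedelta(seconds=dmesg_seconds)


def frames_near(event_time, frame_times, window_s=1.0):
    """Return the sniffer frame times within +/- window_s of a dmesg event."""
    window = timedelta(seconds=window_s)
    return [t for t in frame_times if abs(t - event_time) <= window]


# The deauth stamp from the customer dmesg log quoted in this thread.
deauth = dmesg_to_wallclock(BOOT_EPOCH, 165.306659)
print(deauth.isoformat())
```

With the sniffer PC and the board synchronized to the same clock, `frames_near` can then be used to list the captured frames around each disconnect.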

 

Regards,

Christine.

 

873 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Thanks for the new logs.

I checked them and still found no useful information. Even the dmesg log is incomplete.

You can use: 

dmesg -w > dmesg.txt &

This command follows the kernel log and keeps writing new messages to the dmesg.txt file.

Since the logs contained nothing useful, I am trying to reproduce the issue myself. After 1.5 hours I still have not reproduced it, which is strange. I am now retesting after rebooting my 8MQ-EVK.

Do you have a shielded room?

Would it be possible to test there and capture sniffer logs? I think an open office has too much interference, so Wireshark will miss some packets.

If you can retest and provide us with full dmesg and sniffer logs, it will be much appreciated.

At the same time, I have asked our internal team to double-check your newly shared logs, in case I missed anything.

My test is continuing.

 

Best regards,

Christine.

848 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

The sniffer and dmesg logs are attached. They cover the test from 88W8997 driver initialization until the iperf3 disconnection occurred.

 

We don't know why we could not capture the deauth frame. Maybe it is caused by an issue in the 88W8997 firmware related to 2.4 GHz Wi-Fi connections.

 

The sniffer and dmesg logs (over 25 MB) are at the following link:

https://drive.google.com/drive/folders/1uh8yCNiy9h74U1jOUAJxSxZqvTkvan61?usp=sharing

 

860 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Thank you for sharing the logs.

 

In the dmesg logs, the deauth can be seen after the connection is established:

=============

[ 139.340997] wlan: HostMlme wlan0 Connected to bssid bc:XX:XX:XX:03:1a successfully

[ 165.306659] wlan: Received disassociation request on wlan0, reason: 3

[ 165.313144] wlan: REASON: (Deauth) Sending STA is leaving (or has left) IBSS or ESS

[ 165.320869] Deauth: bc:XX:XX:XX:03:1a

 ============

After the deauth, the DUT reconnects to the external_AP successfully:

 =================

[ 167.199926] #

[ 167.229106] Find bssid bc:XX:XX:XX:03:1a

[ 167.233116] wlan0:

[ 167.328622] wlan: HostMlme wlan0 send auth to bssid bc:XX:XX:XX:03:1a

[ 167.337045] wlan0:

[ 167.337062] wlan: HostMlme Auth received from bc:XX:XX:XX:03:1a

[ 167.345376] HostMlme wlan0: Received auth frame type = 0x0

[ 167.414845] wlan: HostMlme wlan0 send assoicate to bssid bc:XX:XX:XX:03:1a

[ 167.556802] wlan: HostMlme wlan0 Connected to bssid bc:XX:XX:XX:03:1a successfully

====================

But in the sniffer capture I cannot see the assoc or deauth frames.

We suggest sharing in-sync dmesg and sniffer logs again, so that we can identify the exact root cause of the issue.

We also suggest starting the sniffer at the beginning of the connection establishment procedure.
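To check a capture for the frames mentioned here, a Wireshark display filter on the 802.11 management subtypes can help. The sketch below only builds the filter string. It assumes the capture was taken in monitor mode so that the `wlan.*` fields are present; the MAC address is the AP address from the SAE supplicant log above, and the function name `mgmt_filter` is hypothetical.

```python
# 802.11 management frame subtypes as used by Wireshark's
# wlan.fc.type_subtype field.
MGMT_SUBTYPES = {
    "assoc_req": 0x00,
    "assoc_resp": 0x01,
    "disassoc": 0x0A,
    "auth": 0x0B,
    "deauth": 0x0C,
}


def mgmt_filter(mac, kinds=("auth", "assoc_req", "assoc_resp", "disassoc", "deauth")):
    """Build a display filter for the given management frames to or from mac."""
    subtypes = " || ".join(
        f"wlan.fc.type_subtype == 0x{MGMT_SUBTYPES[k]:04x}" for k in kinds
    )
    return f"wlan.addr == {mac} && ({subtypes})"


print(mgmt_filter("bc:0f:9a:70:1f:69"))
```

If this filter returns nothing for the period around a dmesg deauth line, the sniffer most likely missed the frames or was on a different channel, rather than the frames never being sent.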

On my side, I also checked the dmesg and sniffer logs from my own reproduction; unfortunately, they did not contain useful information either.

So I will also retest and try to capture more logs for our internal team. I hope we can test in parallel, so that we can compare our logs and gather more information to find the root cause.

 

Best regards,

Christine.

908 Views
ArthurC
Contributor III

Hello @Christine_Li ,

 

The sniffer capture logs are attached.

The capture runs from the start of Bluetooth audio playback and iperf3 traffic to the target device, connected through a Netgear router AP, until iperf3 stopped because the network connection broke.

 

It is the same issue as in the previous test.

The dmesg logs were captured with drvdbg=0x80037.

867 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

You can use another PC: plug the Netgear A6210 into it and capture the sniffer logs there.

The PC with the Netgear A6210 does not need to be connected to any Wi-Fi access point.

The Netgear A6210 should be able to capture all frames over the air; it does not filter out any WLAN address unless you set a filter.

Also, for your information: I reproduced the broken-connection issue, but not the system hang after reconnection. I have provided my logs to our internal team. If you can provide yours as well, it would be appreciated, so that we can compare them and see whether it is the same issue.

Best regards,

Christine.

865 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Update status on my side:

I tested on my side with the i.MX8MQ-EVK and 88W8997 (PCIE-UART CM276MA module) on Linux kernel 5.15.71 with our latest Q1-2024 Wi-Fi driver + FW.

I think I reproduced the broken Wi-Fi connection issue.

But I cannot reproduce the issue where reconnection causes the i.MX8MQ system to hang.

I will discuss with our internal team and try to capture sniffer logs on my side.

Best regards,

Christine.

851 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

We set up Wireshark with the NETGEAR A6210 as shown below:

 

ArthurC_0-1716269922205.png
ArthurC_1-1716270152044.png

But we cannot sniff the traffic of the target 88W8997 device (IP: 133.33.33.15), which is running iperf3 against another device on the same AP local network.

 

Could you explain how to monitor the traffic of all devices on the AP local network?

890 Views
ArthurC
Contributor III

Hello @Christine_Li ,

 

Yes, sure.

I am available at 15:00 on 05/22 and 05/24 this week.

I am also preparing the Wi-Fi dongle for sniffer log capture.

874 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Could we have a live debug call together with our internal team?

That way we can better understand each other's test environment, and we may also be able to help with capturing sniffer logs.

From our side, we sincerely hope to track and resolve the issue ASAP.

If so, please let me know when you are available. Our expert team is in India, so an afternoon slot (UTC+08:00) is suggested.

 

Best regards,

Christine.

996 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Please check whether your Windows PC supports monitor mode. If it does not, you will need a separate Wi-Fi sniffer card such as the Netgear A6210 Wi-Fi adapter or similar.

If you have another Linux PC, you can also check whether its Wi-Fi card supports monitor mode.

For your reference, I have attached my written guide on capturing Wi-Fi sniffer logs on both Windows and Linux PCs.

Please see attachment.

 

Best regards,

Christine.

988 Views
ArthurC
Contributor III

Hello @Christine_Li ,

 

Could you provide a setup flow or guide for using Wireshark or other software to capture the needed sniffer logs?

 

Is another device needed for sniffing?

 

Thank you

980 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

Thanks for your quick reply.

Disabling MSI is only a test; we suggested it because the original logs show:

mwifiex_pcie 0000:01:00.0: no quirks enabled

If this test is not feasible, you can ignore it. We will continue to track the issue with our internal team.

Could you please restart the test after rebooting the board and provide the logs requested below?

Please provide dmesg logs + dump + sniffer logs captured with the following parameters:

  • drvdbg=0x80037
  • auto_fw_reload=0
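drvdbg is a bitmask, so it can be useful to see which bit positions the requested value enables before comparing it against the driver's debug definitions. The sketch below only lists the set bits; what each bit means is defined by the driver sources and is not assumed here. The function name `set_bits` is hypothetical.

```python
def set_bits(mask):
    """Return the positions (0 = least significant) of the bits set in mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# The value requested in this thread.
print(set_bits(0x80037))
```

For 0x80037 this gives bit positions 0, 1, 2, 4, 5 and 19.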

The dmesg logs do not tell us why the connection is established successfully but communication stops after some time. That is why we ask for sniffer logs, to see what happens between the STA (88W8997) and the AP.

Best regards,

Christine.

930 Views
ArthurC
Contributor III

Hello @Christine_Li,

 

In the last shared logs with drvdbg 80037, our system did not hang and is still alive now. Bluetooth is also alive and playing audio via A2DP, but the Wi-Fi connection is broken. If we try to reconnect Wi-Fi, the 88W8997 hangs the system.

 

MSI is the PCIe interrupt mechanism. If we disable it, PCIe functionality will break. Why is the 88W8997 incompatible with MSI? Are there any error logs related to MSI?

895 Views
Christine_Li
NXP TechSupport

Hi, @ArthurC 

In the last shared logs with drvdbg 80037, the connection is established successfully, but after some time no communication is seen. Maybe your device hung and the logs could not be captured.

In the last shared logs [wificrashed_info.7z] without drvdbg, we can see the disconnection. Could you share a sniffer capture for that run? It would help us map it to the dmesg logs.

Also, with drvdbg=0x80037, please disable auto_fw_reload through the driver load parameter “auto_fw_reload=0” so that a proper dump is captured.

So we have two requests:

1. Please test with drvdbg=0x80037 and auto_fw_reload=0 set when you load the driver, and provide the dmesg and dump logs as you did before.

2. At the same time, please capture sniffer logs together with time-synced dmesg logs.

Below is another test; please run it separately from the test above.

Please disable MSI in u-boot, reset your 8MQ board, and then start your test to see whether the issue still reproduces.

To disable MSI:

u-boot //Enter u-boot mode by pressing any key during the boot countdown.

You can check the default mmcargs with the command below:

print mmcargs

Christine_Li_0-1715926263341.png

 

Then append the parameter that disables MSI with the commands below:

setenv mmcargs setenv bootargs console=${console} root=${mmcroot} pci=nomsi

saveenv

print mmcargs //Check that your change took effect.

reset
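After the board has rebooted into Linux, it is worth confirming that the parameter actually reached the kernel, since a boot script may rebuild bootargs. A minimal sketch follows. The sample command line is hypothetical; on the board, the real one can be read from `/proc/cmdline`. Both function names are illustrations.

```python
def has_kernel_arg(cmdline, arg):
    """Return True if arg appears as a whole token in the kernel command line."""
    return arg in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical command line, for illustration only.
sample = "console=ttymxc0,115200 root=/dev/mmcblk1p2 rootwait rw pci=nomsi"
print(has_kernel_arg(sample, "pci=nomsi"))
```

On the target, `has_kernel_arg(read_cmdline(), "pci=nomsi")` should return True before the MSI test is started; otherwise the test result says nothing about MSI.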

Christine_Li_1-1715926299787.png

 

Thanks,

Christine.
