FEC ethernet packetloss


4,173 Views
mikegilorma
Contributor II

I have been tracking down what I believe to be the same problem.  I am working on a Variscite development kit with an iMX6 board and have a test setup in which I can reproduce the failure fairly regularly.  The interesting thing is that I thought I was dropping packets, but in fact the packets reach the receive side as malformed packets.  Even more interesting, the malformed packet is always the same, and its payload contains the following ASCII text:  "Copyright (C) 2007-2013 Freescale Semiconductor, Inc. All Rights Reserved".

I am using iperf3 to send data from the iMX6 board to my desktop PC.  See the link to my forum post on Variscite's wiki below.  Note: I've also verified this issue exists on an iMX7 based board.

Eth0 dropping packets - Variscite Forums 

15 Replies

2,350 Views
rans
Senior Contributor I

Hello Mike,

I face the same issue with imx ethernet.

Has it been resolved?

Thank you,

Ran


2,350 Views
jamesbone
NXP TechSupport

Hello Mike,

You need to use the latest version of the NXP Linux BSP.  Our latest Linux version is v4.1.15.  Please reproduce the issue on the SABRE board, not the SABRE Lite, so that we can escalate it to the Linux developers.  I apologize for the inconvenience.


2,350 Views
felixradensky
Contributor IV

Hello James,

I can reproduce the packet corruption problem on MX6 SabreSD running the latest NXP kernel, v4.1.15. I've used the following commands:

On SabreSD board:

cpufreq-set -g performance

iperf3 -u -c 192.168.1.101 -b 80M -l 1470 -t 360 -i 10  -P 5 -w 32M -A0

 

On Ubuntu 14.04 running kernel 4.4 and iperf 3.1.6

iperf3 -s -i 10 -A0

During the test the SabreSD and the PC are the only 2 hosts connected to an unmanaged gigabit switch. No firewall rules are defined on the PC.

To avoid packet loss due to insufficient buffer space I've run the following commands on both PC and SabreSD:

SIZE=33554432

echo $SIZE > /proc/sys/net/core/wmem_default
echo $SIZE > /proc/sys/net/core/wmem_max

echo $SIZE > /proc/sys/net/core/rmem_default
echo $SIZE > /proc/sys/net/core/rmem_max

echo 2000 > /proc/sys/net/core/netdev_max_backlog
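
One way to confirm that the raised limits took effect is to request a 32 MiB UDP receive buffer and read back what the kernel actually granted. The following is a minimal Python sketch, not part of the original test procedure; the name `granted_rcvbuf` is made up for illustration. On Linux the value read back is normally double the requested size, with the request first capped at rmem_max, so a small number here suggests the echo commands above were not applied.

```python
import socket

# 32 MiB, the same value as SIZE in the commands above.
REQUESTED = 32 * 1024 * 1024

def granted_rcvbuf(requested=REQUESTED):
    """Ask for a UDP receive buffer of `requested` bytes and return the
    size the kernel reports back for the socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

if __name__ == "__main__":
    print("requested", REQUESTED, "bytes, kernel granted", granted_rcvbuf())
```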

The packet corruption is reproducible within 2-3 minutes. Please escalate this to your Linux developers.

Felix.

2,350 Views
jamesbone
NXP TechSupport

Hello Felix,

Thanks for the update. I have already submitted your steps so that the developers can validate them.

saludos,

Jaime

2,350 Views
felixradensky
Contributor IV

Hi Jaime,

Any input from your developers? Did they reproduce the problem?

Thanks.

Felix.


2,350 Views
jamesbone
NXP TechSupport

Hello Felix,

Malformed packets are usually a product of the source, in this case the desktop running Ubuntu 14.04 LTS.  Hopefully Wireshark is running on an independent system on the same switch.  What I cannot understand is that the malformed packet is corrupt from its beginning through the end of the MAC portion of the packet, while the data seems to be untouched.

 

I have a SABRE-SDB with a 6Q and 4.1.15, and it runs iperf3 with only a few dropped packets; we are talking about insignificant numbers.

 

In one test 5 packets out of 11.6 million and in another 4  packets out of 11.8 million.

 

Here are my results.  Can you post theirs?

 

Client Run #1 

root@imx6qdlsolo:~# iperf3 -u -c 192.168.0.1 -b 80M -l 1470 -t 360 -i 10  -P 5 -w 32M -A0
Connecting to host 192.168.0.1, port 5201

[deleted the traffic reports in between]

Final report

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-360.00 sec  3.20 GBytes  76.3 Mbits/sec  0.019 ms  1/2336053 (4.3e-05%)
[  4] Sent 2336053 datagrams
[  6]   0.00-360.00 sec  3.20 GBytes  76.3 Mbits/sec  0.021 ms  0/2336053 (0%)
[  6] Sent 2336053 datagrams
[  8]   0.00-360.00 sec  3.20 GBytes  76.3 Mbits/sec  0.010 ms  0/2336053 (0%)
[  8] Sent 2336053 datagrams
[ 10]   0.00-360.00 sec  3.20 GBytes  76.3 Mbits/sec  0.019 ms  3/2336053 (0.00013%)
[ 10] Sent 2336053 datagrams
[ 12]   0.00-360.00 sec  3.20 GBytes  76.3 Mbits/sec  0.018 ms  1/2336053 (4.3e-05%)
[ 12] Sent 2336053 datagrams
[SUM]   0.00-360.00 sec  16.0 GBytes   382 Mbits/sec  0.017 ms  5/11680265 (4.3e-05%)

- - - - - - - - - - - - - - - - - - - - - - - - -
So out of 11.68 million datagrams (16.0 GBytes transferred) we only dropped 5 packets.

 

Client Run #2 

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-360.00 sec  3.24 GBytes  77.3 Mbits/sec  0.022 ms  0/2365207 (0%)
[  4] Sent 2365207 datagrams
[  6]   0.00-360.00 sec  3.24 GBytes  77.3 Mbits/sec  0.018 ms  0/2365207 (0%)
[  6] Sent 2365207 datagrams
[  8]   0.00-360.00 sec  3.24 GBytes  77.3 Mbits/sec  0.015 ms  0/2365207 (0%)
[  8] Sent 2365207 datagrams
[ 10]   0.00-360.00 sec  3.24 GBytes  77.3 Mbits/sec  0.020 ms  2/2365207 (8.5e-05%)
[ 10] Sent 2365207 datagrams
[ 12]   0.00-360.00 sec  3.24 GBytes  77.3 Mbits/sec  0.023 ms  2/2365207 (8.5e-05%)
[ 12] Sent 2365207 datagrams
[SUM]   0.00-360.00 sec  16.2 GBytes   386 Mbits/sec  0.020 ms  4/11826035 (3.4e-05%)

 

So it dropped 4 packets out of 11.8 million  packets

 

 

Both of these are acceptable for UDP packets. UDP does not guarantee delivery, so a momentary collision can produce these 'dropped' packets.
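
To compare runs without reading the percentages by eye, the Lost/Total field can be pulled out of an iperf3 summary line and converted to a loss rate. This is a minimal Python sketch; the helper names `parse_lost_total` and `loss_percent` are made up for illustration, and the regular expression assumes the report layout shown in the two runs above.

```python
import re

# Matches the "lost/total (" part of an iperf3 UDP report line.
LOSS_FIELD = re.compile(r"(\d+)/(\d+)\s+\(")

def parse_lost_total(line):
    """Return (lost, total) datagram counts from one iperf3 report line."""
    match = LOSS_FIELD.search(line)
    if match is None:
        raise ValueError("no Lost/Total field in line")
    return int(match.group(1)), int(match.group(2))

def loss_percent(lost, total):
    """Loss rate as a percentage of all datagrams sent."""
    return 100.0 * lost / total

if __name__ == "__main__":
    run1 = "[SUM]   0.00-360.00 sec  16.0 GBytes   382 Mbits/sec  0.017 ms  5/11680265 (4.3e-05%)"
    run2 = "[SUM]   0.00-360.00 sec  16.2 GBytes   386 Mbits/sec  0.020 ms  4/11826035 (3.4e-05%)"
    for line in (run1, run2):
        lost, total = parse_lost_total(line)
        print(f"{lost} of {total} datagrams lost = {loss_percent(lost, total):.2e} %")
```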

 

The Wireshark capture of their corrupt packet shows a single dropped bit (0x1C versus 0x18, bit 2) in the MAC address, which is weird.
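
The "bit 2" reading can be checked by XOR-ing the two byte values and listing which bit positions differ. A minimal Python sketch, with a made-up helper name `differing_bits` and bits counted from 0 at the least significant end:

```python
def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]

if __name__ == "__main__":
    # 0x1C is 0001 1100 and 0x18 is 0001 1000, so only one bit differs.
    print(differing_bits(0x1C, 0x18))
```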

 

My iperf3 -v on the source says :

iperf 3.0.11

Linux lucid-Sun-Ultra-20-Workstation 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

 

On my i.MX6 Sabre-SDB board it says:

iperf 3.1
Linux imx6qdlsolo 4.1.15-2.0.0+gb63f3f5 #1 SMP PREEMPT Fri Sep 16 15:02:15 CDT 2016 armv7l
Optional features available: CPU affinity setting, IPv6 flow label, TCP congestion algorithm setting, sendfile / zerocopy

 


Have a great day,
TIC



2,350 Views
jamesbone
NXP TechSupport

Also might be useful to get the full packet capture for both the regular packet and the malformed packet. 


2,350 Views
mikegilorma
Contributor II

I believe we are dealing with a driver/kernel issue on the Variscite boards.  I have verified that a development kit from Congatec (Linux cgtqmx6 3.0.35-4.1.0+qmx6+gcc48cee #1 SMP PREEMPT Mon Oct 27 13:19:21 CET 2014 armv7l GNU/Linux) runs error free for 40+ hours.   When the Variscite guys told me to roll back to an older kernel I still saw drops but the failure mode was different.  To be clear, I am only testing the TX side of things at this point (imx6 board sending udp packets to a desktop pc).

When testing with Variscite's latest build I see dropped packets on the desktop pc that show up in iperf3 but not in ifconfig.  I was able to capture the malformed packets in wireshark and have attached a pcap showing this.

imx6 wireshark

imx7 wireshark
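
A capture like the ones attached can be scanned for the telltale Freescale copyright text without opening Wireshark. The following is a minimal Python sketch under stated assumptions: it reads only the classic pcap format (not pcapng), the function names are made up for illustration, and the marker leaves out the copyright years so that it is not tied to one exact string. Frame numbers start at 1, as in Wireshark's packet list.

```python
import struct

# Assumed marker: the copyright text without the year range.
MARKER = b"Freescale Semiconductor, Inc. All Rights Reserved"

def iter_pcap_frames(stream):
    """Yield the raw bytes of each frame in a classic pcap stream."""
    header = stream.read(24)  # fixed-size global header
    if len(header) < 24:
        raise ValueError("not a pcap file: global header too short")
    magic = header[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # little-endian capture (micro- or nanosecond stamps)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # big-endian capture
    else:
        raise ValueError("unsupported capture format (pcapng is not handled)")
    while True:
        record = stream.read(16)  # per-packet record header
        if len(record) < 16:
            return
        _sec, _frac, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        yield stream.read(incl_len)

def frames_with_marker(stream, marker=MARKER):
    """Return the 1-based numbers of frames whose bytes contain the marker."""
    return [number
            for number, frame in enumerate(iter_pcap_frames(stream), start=1)
            if marker in frame]
```

Usage would be along the lines of `frames_with_marker(open("capture.pcap", "rb"))`, where `capture.pcap` is a placeholder file name.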

When testing with Variscite's 3.0.35 based build I still see dropped packets on the desktop pc that show up in iperf3 but not in ifconfig.  It appears that the dropped packets are not being sent out the wire.


2,350 Views
felixradensky
Contributor IV

As I mentioned earlier, we are able to reproduce the corrupted packet problem on a Freescale MX6 evaluation board (Sabre SD) running the latest community kernel (4.1.38), so this is definitely not a Variscite hardware problem, but rather a FEC driver problem. The corrupted packets are identical to those reported by Mike. It seems that the FEC driver is copying the data from the zero page, which has the Freescale exception vectors at its beginning. The problem is also reproducible on a Variscite VAR-SOM-MX6 board running the latest Freescale kernel, 4.1.15_2.0.0. On the other hand, the problem is not reproducible on the same hardware running the older Freescale kernel (3.0.35). This is further evidence that the problem is not a hardware one. I think git bisecting can identify the exact commit that introduced the bug.


2,350 Views
mikegilorma
Contributor II

I agree that the problem of corrupt packets is not reproducible on the same hardware running the older Freescale kernel (3.0.35).  However, I am still seeing dropped packets on the rx side of an iperf3 test with the older kernel, the failure mode is just different.  The packets do not show up as drops in ifconfig, but are definitely not getting to the other side of the link.  This is very repeatable in my setup. 

The same setup running the same exact test using the Congatec board as the source of the iperf3 traffic does not drop packets.


2,350 Views
jamesbone
NXP TechSupport

We have already tested on the i.MX7D SABRE board, and it does not show any errors. That test used Ubuntu 16.04 LTS with the iperf3 installed via 'sudo apt-get install iperf3'. It exercised both Ethernet ports on the 7D at the same time, and there was no packet loss.

 

Can you specify what may be different between a stock setup and your setup?

 

Are they using CAT-6 cabling?

 

Are they running on a managed LAN? If so, they need to test on an unmanaged LAN.  On a managed LAN the UDP packets may be dropped because they are misidentified as a DDoS attack.

 

Is a network manager or a firewall running on either side? They need to disable the firewall and any network manager.

 

They need to be testing using our supported BSP.  4.1.15 is the latest NXP supported version.  Can they download it and test using 4.1.15 and report?


2,350 Views
jamesbone
NXP TechSupport

Hello Mike,

Regarding the packet loss that you are seeing on the i.MX6 device: this is unfortunately an erratum of the device, which limits the performance of the Gigabit Ethernet port to around 400-700 Mbps.  What does sound like a real problem is what you are seeing on the i.MX7.  Can you please provide some details on how to replicate the issue?  What version of the BSP are you using?  I need to test it on an i.MX7 EVK and replicate the problem.  Can you please explain the Ethernet scenario (how the board is attached to the host when you see the packet loss) and give us the iperf data that you mentioned?

The i.MX7 has an improved FEC port, which means that the Gigabit Ethernet erratum has been corrected. So it is weird that you are seeing the same scenario on both devices.

Have a nice day sir!

Jaime


2,350 Views
mikegilorma
Contributor II

I am not sure what version BSP I am using.  I am using the stock yocto images provided with the development boards that I received from Variscite.  How can I figure out the answer to your question?

Here is my test setup:

Variscite imx6 board running krogoth:
$ cpufreq-set -g performance
$ iperf3 -c 192.168.0.101 -u -b 80M -l 1470 -t 60 -P 5
Ubuntu-based desktop PC:
$ ./iperf3 -s -i 10
I downloaded version 3.1.6 of iperf3 so I could patch the code and monitor the dropped packets.  I added the following line to iperf_udp.c @ line 105:
 
            printf("wanted %i, got %i\n", sp->packet_count + 1, pcount);

 

 
This allowed me to track down the dropped packet in wireshark on the Ubuntu machine.  Iperf3 puts a packet counter in byte 50 of the packet which makes it easier to track down.
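
The same "wanted, got" check can be applied offline to captured frames. The sketch below is a minimal Python illustration, not the patch described above. It assumes the counter is a 32-bit big-endian value at frame offset 50 (which fits an untagged Ethernet header, an IPv4 header without options, a UDP header, and 8 bytes of iperf3 timestamp before the counter) and that counters start at 1. The function names are made up, and with -P 5 each stream carries its own counter, so frames would need to be grouped per UDP port first.

```python
import struct

# Offset taken from the post; see the assumptions in the text above.
COUNTER_OFFSET = 50

def packet_counter(frame):
    """Read the 32-bit big-endian packet counter from one captured frame."""
    (count,) = struct.unpack_from("!I", frame, COUNTER_OFFSET)
    return count

def find_gaps(counters):
    """Return (wanted, got) pairs wherever the counter jumps forward,
    in the spirit of the printf added to iperf_udp.c."""
    gaps = []
    last = 0  # highest counter seen so far
    for got in counters:
        wanted = last + 1
        if got > wanted:
            gaps.append((wanted, got))
        if got > last:
            last = got  # late or duplicate datagrams do not move the mark back
    return gaps
```

For example, `find_gaps([1, 2, 3, 5, 6])` reports a single gap where counter 4 was wanted and 5 arrived.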
NOTE:  These are not actually dropped packets but rather malformed packets that show up on the receive side in place of packets that should contain the iperf payload data.  The malformed packets always contain the following:

"Copyright (C) 2007-2013 Freescale Semiconductor, Inc. All Rights Reserved"

Another interesting note is that the iMX7 board fails with a slightly different malformed packet:

"Copyright (C) 2007-2015 Freescale Semiconductor, Inc. All Rights Reserved"


2,350 Views
felixradensky
Contributor IV

We can also reproduce this problem on a Freescale SabreSD evaluation board (automotive-grade i.MX6Q processor running at 792 MHz). The BSP version used during the tests is the NXP Community BSP, fsl-image-machine-test-imx6qsabresd-20170221-22.rootfs.sdcard.gz; the kernel version is 4.1.38. It usually takes 3-4 minutes to reproduce.

The commands used to reproduce the problem:

On SabreSD board:

iperf3 -u -c 192.168.1.101 -b 80M -l 1470 -t 360 -i 10  -P 5 -w 32M -A0

On Ubuntu 14.04 running kernel 4.4 and iperf 3.1.6

iperf3 -s -i 10 -A0
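
For reference, the load these options offer can be worked out with a little arithmetic. The Python sketch below assumes that iperf3 applies -b to each stream separately (the per-stream rates in the reports earlier in the thread are consistent with that) and that 80M means 80,000,000 bits per second. It counts UDP payload only, so the rate on the wire is somewhat higher. The function names are made up for illustration.

```python
def offered_load_mbps(per_stream_mbps, streams):
    """Total payload rate requested across all parallel streams."""
    return per_stream_mbps * streams

def datagrams_per_second(rate_mbps, payload_bytes):
    """Datagrams per second needed to carry rate_mbps of payload."""
    return rate_mbps * 1_000_000 / (payload_bytes * 8)

if __name__ == "__main__":
    total = offered_load_mbps(80, 5)              # -b 80M with -P 5
    per_stream = datagrams_per_second(80, 1470)   # -l 1470
    print(f"offered load: {total} Mbit/s, "
          f"about {per_stream:.0f} datagrams/s per stream")
```

The requested total of 400 Mbit/s sits at the lower end of the 400-700 range that NXP quoted earlier in the thread for the i.MX6 port, which may be worth keeping in mind when reading the loss figures.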


2,350 Views
felixradensky
Contributor IV

BTW, a FEC TX corruption bug with very similar symptoms is discussed in this thread:

Bug in drivers/net/ethernet/freescale/fec_main.c, TX is broken. In 4.0.0-rc3 
