Hello!
We have an issue when communicating over FEC Ethernet on the i.MX6. We would expect that, given a continuous stream of packets, the MX6 could transfer all of them without dropping any. Instead, we observe sporadic packet loss when both ends of the link operate in 1000 Mbps / full duplex mode. The packets are occasionally not transmitted FROM the FEC Ethernet TOWARDS the host PC. We see the first few dropped packets after roughly 5 hours of continuous transfer. After that, we see a few dropped packets every 2-3 hours. We have put together exact steps to reproduce this packet loss; maybe someone has an idea? Thank you!
Steps:
HOSTPC: We have an Intel i7 820QM with Intel i82577LM Ethernet (e1000e driver) and an Intel i7 3970X with i82579LM Ethernet (e1000e driver).
TARGET: We have MX6Q SabreAuto, MX6Q SabreLite and two custom boards, one with MX6Solo and the other with MX6Dual. All use FEC Ethernet for this test.
Every combination of TARGET and HOSTPC above shows these symptoms. For your convenience, you can try with SabreAuto as the TARGET platform.
1) Connect the TARGET directly to the HOSTPC with a 50cm CAT6 Ethernet cable.
2) Boot Freescale Linux 3.0.35-4.1.0 (in the default imx6_defconfig configuration for sabreauto, in a slightly modified configuration for the custom mx6dual and mx6solo boards) on the TARGET.
3) Boot the TARGET into a userland on the SD card and install the "iperf" tool.
4) Make sure the link is in 1000/FD mode on HOSTPC:
$ ethtool -s eth0 speed 1000 duplex full
5) Make sure the link is in 1000/FD mode on TARGET:
$ ethtool -s eth0 speed 1000 duplex full
6) Disable any possibly interfering network managers etc. on both ends:
$ /etc/init.d/networking stop
$ /etc/init.d/network-manager stop
7) Bring up network interface on HOSTPC:
$ ifconfig eth0 192.168.1.1 netmask 255.255.255.0
8) Bring up network interface on TARGET:
$ ifconfig eth0 192.168.1.2 netmask 255.255.255.0
9) Start "iperf" on HOSTPC in UDP server mode:
$ iperf -u -s -l 4M -i 60
10) Start "iperf" on TARGET in UDP client mode:
$ iperf -u -c 192.168.1.1 -t 28800 -b 1000M -i 60
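Note that `ethtool -s` in steps 4 and 5 requests a link mode; it does not report the mode that was actually negotiated. A minimal sketch for checking the result on either end follows. It assumes the usual `Speed:` and `Duplex:` lines printed by `ethtool eth0`; the helper name `link_is_gigabit_fd` is made up for this example and is not part of ethtool.

```shell
#!/bin/sh
# Sketch: succeed only if the ethtool report on stdin shows 1000 Mb/s, full duplex.
# link_is_gigabit_fd is a hypothetical helper name, not an ethtool feature.
# Reading from stdin lets the check run on saved output as well as live output.
link_is_gigabit_fd() {
    awk '
        /Speed:/  { speed = $2 }    # e.g. "Speed: 1000Mb/s"
        /Duplex:/ { duplex = $2 }   # e.g. "Duplex: Full"
        END { exit !(speed == "1000Mb/s" && duplex == "Full") }
    '
}
```

Usage, with the interface name from the steps above:
$ ethtool eth0 | link_is_gigabit_fd && echo "link is 1000/FD" || echo "link is NOT 1000/FD"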
Hi,
The problem is that the FEC Ethernet swallows the packet; the packet is never emitted on the Ethernet link. We are trying to figure out where this comes from. Is it a bug in the FEC or in the software?
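One way to narrow this down is to compare the kernel's own TX counters on the TARGET before and after a run. The sketch below is only an illustration: it reads the standard per-interface counters under /sys/class/net/<iface>/statistics, and `snapshot_tx` and `STATS_ROOT` are names invented for this example. Whether the FEC driver in 3.0.35 updates every one of these counters accurately is an assumption that would need checking.

```shell
#!/bin/sh
# Sketch: print the kernel's TX counters for one interface, one per line,
# so that snapshots taken before and after a test run can be diffed.
# snapshot_tx and STATS_ROOT are hypothetical names used only in this example.
snapshot_tx() {
    iface="$1"
    dir="${STATS_ROOT:-/sys/class/net}/$iface/statistics"
    for c in tx_packets tx_dropped tx_errors tx_fifo_errors tx_carrier_errors; do
        # A counter file that cannot be read is reported as "n/a" instead of aborting.
        if [ -r "$dir/$c" ]; then
            printf '%s %s\n' "$c" "$(cat "$dir/$c")"
        else
            printf '%s n/a\n' "$c"
        fi
    done
}
```

Usage on the TARGET:
$ snapshot_tx eth0 > before.txt
(run the iperf test)
$ snapshot_tx eth0 > after.txt
$ diff before.txt after.txt
If tx_packets on the TARGET grows by more than the number of packets the HOSTPC receives while tx_dropped and tx_errors stay unchanged, the loss happens below the level the driver accounts for.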
Hi Marek,
Yes, I understand that. What I don't understand is why you consider it a problem. The probability is so tiny that a real network program simply resends the packet (if it was an important packet at all), and case closed.
A packet might get lost for many reasons:
- dog is chewing the cat6 network cable
- network plug is oxidized, point of contact is very bad
- bad weather: lightning and transient pulses cause trouble in your computer
- ethernet chip is overheated and starts to make errors
- software bug in the low-level network drivers and/or the operating system
- and so on... I could list them forever
Hardware (and software) makes errors; that is its nature. If your FEC sends millions of packets without a problem and then makes one error, that is not bad at all. Its error rate seems very low; actually, it is very good in my opinion. (You know what? Calculate its error rate and compare it to the official FEC catalogue data; you will be surprised.) And it is still not at all certain what the origin of the error is. It might be a faulty power supply (power-supply errors are very sly), for example, and not the FEC. Your real-life network (as opposed to this artificial test) will have many more and bigger error sources, I bet. And still it will work without any problem, with minor or negligible performance drops. I think you have nothing to worry about. If I were you, I would test the whole networking application I want to build and try to sift out the weakest link. Your FEC won't be the weakest link with this 1E-10 error probability, I'm perfectly sure.
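The error rate can be roughly estimated from the numbers in the first post. This is a back-of-the-envelope sketch, not a measurement. It assumes iperf's default UDP payload of 1470 bytes (the client command does not set -l), takes the achieved throughput as a parameter because the thread only gives the requested rate (-b 1000M), and reads "a few" lost packets as 3. The name `loss_rate` is made up for this example.

```shell
#!/bin/sh
# Sketch: estimate a per-packet loss rate.
# Arguments: throughput in Mbit/s, UDP payload in bytes, hours of transfer, packets lost.
# Every value passed in is an assumption chosen by the caller, not a measured figure.
loss_rate() {
    awk -v mbps="$1" -v payload="$2" -v hours="$3" -v lost="$4" 'BEGIN {
        pps  = mbps * 1000000 / (payload * 8)   # packets per second (payload bits only)
        sent = pps * hours * 3600               # packets sent over the whole interval
        printf "%.0f packets/s, %.3e packets sent, loss rate %.1e\n", pps, sent, lost / sent
    }'
}
```

Usage, with the first 5 hours of the test at the full requested rate:
$ loss_rate 1000 1470 5 3
Under these assumptions the per-packet loss rate comes out on the order of 1E-9. A lower achieved throughput means fewer packets sent and therefore a proportionally higher rate.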
I could be wrong though :)