
ls1012ardb - throughput measurement under high load fails, eth2 stops working

Question asked by Peter Vollmer on Apr 24, 2017
Latest reply on Oct 12, 2017 by Michele Jr De Candia

Hi,

I am currently evaluating the LS1012 processor using the ls1012ardb board and am trying to run throughput measurements with our SmartBits network performance analyzer. I tried both my own LS1012A-SDK-20161230-yocto build and a prebuilt OpenWrt image (Build vls1012a_1.2.1 for LS1012A), with the same results. So far I have not been able to complete a full measurement because one of the two interfaces apparently stops working. Here is what I found.


My setup is always the same for every board I test:


# cat setup.sh
#!/bin/sh

LAN=eth0

WAN=eth2
ip link set $LAN up
ifconfig $LAN 172.18.1.1 netmask 255.255.0.0 up
ip link set $WAN up
ifconfig $WAN 172.19.1.1 netmask 255.255.0.0 up
echo 1 > /proc/sys/net/ipv4/ip_forward


The SmartBits analyzer sends UDP packets (src port 5000, dst port 5000) bidirectionally through the LAN and WAN ports for 30 seconds and searches for the maximum throughput at which fewer than 0.5 % of the sent frames are lost:


smb (max 1Gbps) -> lan0 -> CPU -> wan -> smb
smb <- lan0 <- CPU <- wan <- smb (max 1 Gbps)


The iptables netfilter rules are empty (default policy ACCEPT); only the conntrack entries for the UDP connection are used for fast forwarding of the frames.


I checked that the network interfaces are set up correctly before the throughput measurement starts. ICMP in both the lan and wan subnets works. The ARP entries of my test peers look okay:

root@OpenWrt:/# ip neigh
172.19.1.101 dev eth0 lladdr 00:0c:be:01:58:46 REACHABLE
172.18.1.254 dev eth2 lladdr 00:10:18:bb:b4:da REACHABLE


The first measurement even reports a throughput of ~27400 frames per second (packet length 124 bytes; frames were sent at a rate of 370000 fps, and 92 percent were lost).
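As a sanity check on those figures, the loss rate and the forwarded bit rate can be recomputed from the frame rates. This is only a sketch: the helper names `loss_pct` and `mbps` are made up for illustration, and it assumes the 124 bytes are the full frame length as counted by the analyzer (preamble and inter-frame gap are not included).

```shell
#!/bin/sh
# loss_pct OFFERED_FPS FORWARDED_FPS
# Prints the percentage of offered frames that were not forwarded.
loss_pct() {
    awk -v o="$1" -v f="$2" 'BEGIN { printf "%.1f\n", (o - f) * 100 / o }'
}

# mbps FPS FRAME_BYTES
# Prints the bit rate in Mbit/s for a given frame rate and frame size.
mbps() {
    awk -v f="$1" -v b="$2" 'BEGIN { printf "%.1f\n", f * b * 8 / 1000000 }'
}

loss_pct 370000 27400   # prints 92.6
mbps 27400 124          # prints 27.2
```

With the numbers above, this gives a loss of about 92.6 % and a forwarded rate of roughly 27 Mbit/s.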


After (or during) the first measurement, however, the eth2 interface stops working. All ARP entries relating to eth2 are stale or failed, and ICMP to my test peer in the LAN subnet (172.18.1.254) no longer works:


root@OpenWrt:/# ip neigh
172.19.13.1 dev eth0 lladdr 00:00:08:00:00:01 REACHABLE
172.19.1.101 dev eth0 lladdr 00:0c:be:01:58:46 STALE
172.19.12.1 dev eth0 lladdr 00:00:07:00:00:01 REACHABLE
172.19.11.1 dev eth0 lladdr 00:00:06:00:00:01 REACHABLE
172.18.13.1 dev eth2 lladdr 00:00:04:00:00:01 STALE
172.19.10.1 dev eth0 lladdr 00:00:05:00:00:01 REACHABLE
172.18.12.1 dev eth2 lladdr 00:00:03:00:00:01 STALE
172.18.11.1 dev eth2 FAILED
172.18.10.1 dev eth2 lladdr 22:33:44:00:00:00 STALE
172.18.1.254 dev eth2 lladdr 00:10:18:bb:b4:da STALE
fe80::be30:5bff:fee5:f3cc dev eth0 lladdr bc:30:5b:e5:f3:cc STALE


The ARP request for this address is sent (checked with tcpdump on the peer) and the reply is sent, but it never arrives on the eth2 interface of the ls1012ardb board.

The RX counter of "ifconfig eth2" does not increase; however, ethtool -S shows increasing counters, indicating arriving ARP packets (rx_broadcast) that are apparently not handed over by the PFE:

 

root@OpenWrt:/# ethtool -S eth2 | grep "rx_"
rx_packets: 5575508
rx_broadcast: 1178
rx_multicast: 34
rx_crc_errors: 102296
rx_undersize: 0
rx_oversize: 0
rx_fragment: 6
rx_jabber: 0
rx_64byte: 1288
rx_65to127byte: 76
rx_128to255byte: 5574138
rx_256to511byte: 0
rx_512to1023byte: 0
rx_1024to2047byte: 0
rx_GTE2048byte: 0
rx_octets: 713761741
IEEE_rx_drop: 157
IEEE_rx_frame_ok: 4982309
IEEE_rx_crc: 102296
IEEE_rx_align: 0
IEEE_rx_macerr: 410986
IEEE_rx_fdxfc: 0
IEEE_rx_octets_ok: 637652233


root@OpenWrt:/# ethtool -S eth2 | grep "rx_"
rx_packets: 5575546
rx_broadcast: 1213
rx_multicast: 36
rx_crc_errors: 102296
rx_undersize: 0
rx_oversize: 0
rx_fragment: 6
rx_jabber: 0
rx_64byte: 1324
rx_65to127byte: 77
rx_128to255byte: 5574139
rx_256to511byte: 0
rx_512to1023byte: 0
rx_1024to2047byte: 0
rx_GTE2048byte: 0
rx_octets: 713764278
IEEE_rx_drop: 157
IEEE_rx_frame_ok: 4982346
IEEE_rx_crc: 102296
IEEE_rx_align: 0
IEEE_rx_macerr: 410986
IEEE_rx_fdxfc: 0
IEEE_rx_octets_ok: 637654706
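To make the difference between the two snapshots easier to see, the counters can be compared with a small awk helper. This is a minimal sketch: `counter_delta` is a made-up name, and it assumes both snapshots were saved to files in the `name: value` format that ethtool -S prints.

```shell
#!/bin/sh
# counter_delta BEFORE_FILE AFTER_FILE
# Prints "name delta" for every counter whose value differs between
# two saved "ethtool -S" snapshots. Unchanged counters are omitted.
counter_delta() {
    awk -F': *' '
        # Strip the leading indentation that ethtool -S normally prints.
        { name = $1; sub(/^[ \t]+/, "", name) }
        # First file: remember every counter value.
        NR == FNR { before[name] = $2 + 0; next }
        # Second file: report counters that changed.
        (name in before) && (($2 + 0) != before[name]) {
            print name, $2 - before[name]
        }
    ' "$1" "$2"
}

# Example usage on the board (not run here):
#   ethtool -S eth2 | grep "rx_" > /tmp/before
#   sleep 10
#   ethtool -S eth2 | grep "rx_" > /tmp/after
#   counter_delta /tmp/before /tmp/after
```

Applied to the two snapshots above, this reports rx_packets 38, rx_broadcast 35 and IEEE_rx_frame_ok 37, while rx_crc_errors, IEEE_rx_drop and IEEE_rx_macerr do not change.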

 

Any clues as to what is happening here? The setup is really nothing special, and I have successfully tested a number of other network devices with it.

 

Thanks and with best regards

Peter
