i.MX6Q Ethernet: Low RX performance

clemensgruber
Contributor III

Hi,

we are running Linux on an i.MX6Q with a Micrel KSZ9031RNX Gigabit Ethernet PHY but are experiencing performance problems when receiving data.

I read ERR004512 and think that this is a different issue. Transmitting data works fine, but when receiving, the throughput maxes out at 136 Mbit/s. Could this be an indicator that some RX buffers are overflowing? Is this a known issue and is there a workaround?

Do you have any tips on what I could do to improve the situation?

Transmit performance maxes out at about 397 Mbit/s at the moment (tested with iperf).

Kernel version: 3.19-rc7 (but we had similar results with version 3.16)


Might be related to: i.MX6Q ENET.REF_CLK input

Thanks.

Best regards,

Clemens Gruber

EDITED: Corrected the problem description. I made a mistake when measuring the throughput. The correct values are 136 Mbit/s RX and 397 Mbit/s TX.
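
One way to test the RX buffer overflow suspicion is to compare the interface's receive counters before and after an iperf run. The following is a minimal Python sketch, assuming the interface is named eth0 and that the standard netdev counters are exposed under /sys/class/net/<iface>/statistics. Which of these counters the FEC driver actually increments is an assumption and is not confirmed anywhere in this thread.

```python
from pathlib import Path

# Standard netdev statistics files. Whether the i.MX6 FEC driver updates
# each of them is an assumption, not something confirmed in this thread.
RX_COUNTERS = (
    "rx_packets",
    "rx_dropped",
    "rx_errors",
    "rx_fifo_errors",
    "rx_over_errors",
    "rx_missed_errors",
)


def read_rx_counters(iface="eth0", sysfs_root="/sys/class/net"):
    """Return {counter_name: value} for every listed counter that exists."""
    stats_dir = Path(sysfs_root) / iface / "statistics"
    counters = {}
    for name in RX_COUNTERS:
        path = stats_dir / name
        if path.is_file():
            counters[name] = int(path.read_text().strip())
    return counters


def counter_delta(before, after):
    """Per-counter increase between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}
```

Take one snapshot with read_rx_counters() before the iperf RX test and one after it, then pass both to counter_delta(). A growing rx_dropped, rx_fifo_errors or rx_over_errors value would point towards overflowing receive buffers; if all deltas stay at zero, the bottleneck is probably elsewhere.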

8 Replies

fabio_estevam
NXP Employee

Maybe you could report this on the netdev@vger.kernel.org mailing list.

clemensgruber
Contributor III

I did that 2 weeks ago. Dave Taht replied:

"I can think of a variety of things wrong, starting with aggressive power save on the board, too high napi settings, lack of BQL, and no fq_codel enabled."
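
Two of the items on that list, BQL and the queueing discipline, can be inspected from user space. The following is a minimal Python sketch, assuming the interface is named eth0, that the kernel exposes per-queue BQL state under /sys/class/net/<iface>/queues/tx-*/byte_queue_limits, and that the default qdisc sysctl is available at /proc/sys/net/core/default_qdisc. The helper names are made up for illustration.

```python
from pathlib import Path


def bql_limits(iface="eth0", sysfs_root="/sys/class/net"):
    """Map each TX queue name to its current BQL limit in bytes.

    The value is None when the queue has no byte_queue_limits directory,
    which is assumed here to mean the kernel was built without BQL.
    """
    queues_dir = Path(sysfs_root) / iface / "queues"
    limits = {}
    for queue in sorted(queues_dir.glob("tx-*")):
        limit_file = queue / "byte_queue_limits" / "limit"
        if limit_file.is_file():
            limits[queue.name] = int(limit_file.read_text().strip())
        else:
            limits[queue.name] = None
    return limits


def default_qdisc(proc_root="/proc"):
    """Return the name of the default qdisc, or None if the sysctl is missing."""
    path = Path(proc_root) / "sys" / "net" / "core" / "default_qdisc"
    return path.read_text().strip() if path.is_file() else None
```

A limit that stays at 0 while traffic is flowing would suggest that the driver does not feed BQL, although this interpretation is an assumption. Note also that the default qdisc only applies to newly created qdiscs; the one actually attached to the interface can be listed with `tc qdisc show dev eth0`.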

Fabio: Could you please share your guess on what could be the cause of these RX bottlenecks? Can you reproduce the 136 Mbit/s receive bottleneck on your i.MX6Q boards?

Do you have any tips on how to further debug this problem?

We first planned our design with the KSZ9031RNX from Micrel but due to some problems and unsolved errata we are now using a Marvell 88E1510.

Reaching the 400 Mbit/s from the Freescale erratum would be great. But 136 Mbit/s is just not good enough for our use case.

ranshalit
Senior Contributor I

Hi Clemens,

Thanks for the GitHub comment!

It seems I am already using a switch with flow control, but for some reason I get no increase in performance.

The switch is an HP ProCurve 1410-8G.

I also thought of using "enable_wait_mode=off", but it seems to work only with older kernels.

Maybe I should disable all power management on the i.MX? It is a pity there is no boot argument for this option.

Thank you for any idea,

Ran

Yuri
NXP TechSupport

Please look at my comments below.

1. It makes sense to use the recent Freescale BSP:

http://www.freescale.com/webapp/Download?colCode=L3.10.53_1.1.0_iMX6QDLS_Bundle&appType=license&loca...

http://www.freescale.com/webapp/Download?colCode=L3.10.53_1.1.0_LINUX_DOCS

http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=IMX6_SW

2. For some old BSP versions, the kernel boot parameter enable_wait_mode=off should be applied.

3. As for ENET_REF_CLK: it is recommended to use it as a clock input. An external source should be applied, the same one for both the i.MX6 and the KSZ9031.


Have a great day,
Yuri


clemensgruber
Contributor III

Hi,

thanks for your response. I repeated the measurements with the BSP and achieved the following results:

RX: 135 Mbit/s

TX: 364 Mbit/s

After that, I repeated the iperf measurements with our current 3.19-rc7 kernel and the results are similar: 136 Mbit/s RX, 397 Mbit/s TX

Bottom line: TX performance is OK and, according to ERR004512, pretty close to the hardware limits. But what is the bottleneck for the RX performance, and why does it max out at about 136 Mbit/s?
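
To put the four measurements side by side, here is a minimal Python sketch that expresses each one as a fraction of the roughly 400 Mbit/s figure that this thread attributes to ERR004512. Note that the erratum figure refers to Tx + Rx combined, while the measurements were taken one direction at a time, so the comparison is only a rough guide.

```python
# Figure this thread attributes to ERR004512 (Tx + Rx combined).
ERRATUM_LIMIT_MBIT = 400.0

# Single-direction iperf results reported above, in Mbit/s.
MEASUREMENTS = {
    "BSP RX": 135,
    "BSP TX": 364,
    "3.19-rc7 RX": 136,
    "3.19-rc7 TX": 397,
}


def fraction_of_limit(mbit_per_s, limit=ERRATUM_LIMIT_MBIT):
    """Return the measured rate as a fraction of the erratum figure."""
    return mbit_per_s / limit


def report(measurements=MEASUREMENTS):
    """Build one summary line per measurement."""
    return [
        f"{label}: {rate} Mbit/s is {fraction_of_limit(rate):.0%} of the erratum figure"
        for label, rate in measurements.items()
    ]
```

Printing the lines returned by report() shows both TX results above 90% of the erratum figure and both RX results at roughly a third of it.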

2. We use newer kernel versions, so enable_wait_mode=off should not be necessary. (I tried it anyway, but it did not improve the situation)

3. Thank you. However, the KSZ9031 has only a 25 MHz input and would then output the 125 MHz clock on the CLK125_NDO pin, which should be connected to ENET_REF_CLK. But because of the KSZ9031 erratum, which says that the CLK125_NDO signal has duty cycle variations on the falling edge, we can't use it for ENET_REF_CLK.

But they don't have to be synchronous, right? So would it be OK to use two separate oscillators: one 25 MHz oscillator for the KSZ9031 PHY and another one with 125 MHz and a frequency stability < 50 ppm for the ENET_REF_CLK on the i.MX6?
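
For a sense of scale, the ppm figures can be converted to absolute frequency error. The following is a minimal Python sketch; the 50 ppm tolerance for the 25 MHz PHY oscillator is a hypothetical value chosen for illustration, since only the 125 MHz part's stability is stated above.

```python
def ppm_to_hz(nominal_hz, ppm):
    """Absolute frequency error in Hz for a given tolerance in ppm."""
    return nominal_hz * ppm / 1_000_000


def worst_case_relative_offset_ppm(ppm_a, ppm_b):
    """Largest relative offset between two independent oscillators.

    The worst case occurs when one runs fast and the other runs slow,
    so the individual tolerances add up.
    """
    return ppm_a + ppm_b


ENET_REF_CLK_HZ = 125_000_000  # 125 MHz oscillator for the i.MX6
PHY_XI_HZ = 25_000_000         # 25 MHz oscillator for the KSZ9031
REF_CLK_PPM = 50               # stability mentioned above
PHY_PPM = 50                   # assumed for illustration only

ref_clk_error_hz = ppm_to_hz(ENET_REF_CLK_HZ, REF_CLK_PPM)
phy_error_hz = ppm_to_hz(PHY_XI_HZ, PHY_PPM)
relative_ppm = worst_case_relative_offset_ppm(REF_CLK_PPM, PHY_PPM)
```

With these numbers the 125 MHz clock may deviate by up to 6250 Hz, and the two independent clocks may drift apart by up to 100 ppm relative to each other. Whether such an offset matters for the RGMII link is exactly the open question here; the sketch only quantifies it.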

Do you have an idea what/where the bottleneck for the 136 Mbit/s RX performance is?

I am also reading through the following blog post, looks like they were having similar problems: http://boundarydevices.com/i-mx6-ethernet/

Yuri
NXP TechSupport

1. Strictly speaking, the reference clock should be the same for both MAC and PHY.

2. Generally, the i.MX6 ENET performance restriction is a system-level one and applies to the combined Rx+Tx throughput.

Regards,

Yuri.

clemensgruber
Contributor III

Hi,

Thanks for your answers.

1. The PHY datasheet suggests using a separate oscillator. gusarambula told me in Re: i.MX6Q ENET.REF_CLK input that ENET_REF_CLK does not have to be "in-sync" or the same as on the PHY.

The problem with using one clock signal for both the MAC and the KSZ9031 is that the PHY does not have a 125 MHz clock input. There is just one 25 MHz input (XI), and the PHY creates the 125 MHz clock internally from that (CLK125_NDO), although with duty cycle variations (see the errata sheet). That's why we use a separate oscillator.

Would you recommend using a different RGMII PHY with the i.MX6?

2. Can you confirm this 136 Mbit/s RX restriction on your test boards with the i.MX6Q? I was not testing RX and TX at the same time, therefore I hoped to achieve much more than just 136 Mbit/s when receiving data.


Regards,

Clemens

Yuri
NXP TechSupport

1. Using the same reference clock for both MAC and PHY really helps to avoid additional issues.

Perhaps you are right: it would be better to consider another PHY.

2. According to erratum ERR004512: "The actual measured performance (Tx + Rx) in an optimized environment is up to 400 Mbps".

The network performance of MX6 GbE

Gigabit Performance Ubuntu vs. Android JB4.3

Regards,

Yuri.