
'bad address' with i.MX25 and Linux 4.8 or later

Question asked by Timo Ketola on May 17, 2017
Latest reply on May 17, 2017 by igorpadykov

Dear i.MX Community,

 

We are trying to build a vanilla Linux 4.8.17 kernel for i.MX25. Almost everything seems to work, but under heavier network load the system starts to behave strangely: random processes that have nothing to do with networking die with a 'bad address' error.

 

The issue can be triggered quite easily with netcat and scp. We connect a couple of netcat processes to each other (see the nctest script below) and then push network traffic from outside with scp. The script occasionally runs for a few hundred rounds, but usually nc or usleep reports 'bad address' within a few tens of rounds. netcat is not required, though. This loop also triggers the issue:

# while true; do cat tiny.txt > /dev/null; done
-sh: cat: Bad address
-sh: cat: Bad address

...
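To put a number on how quickly the failure shows up, the loop above could be wrapped in a small helper that stops at the first failing round. This is only a sketch: `run_until_fail` is a hypothetical name, not part of our original test, and it assumes a POSIX shell such as the BusyBox sh on the target. The command's stderr is left alone so that the 'Bad address' message is still visible.

```shell
#!/bin/sh
# Hypothetical helper (not from the original test setup):
# run a command up to $1 times and report the first round in which it fails.
run_until_fail() {
    max=$1
    shift
    i=0
    while [ "$i" -lt "$max" ]; do
        # Discard stdout only; error messages on stderr stay visible.
        if ! "$@" > /dev/null; then
            echo "failed in round $((i + 1))"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max rounds"
    return 0
}
```

It would be called as, for example, `run_until_fail 100000 cat tiny.txt` while scp traffic is being pushed to the board.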

 

We have not (yet) seen this issue with the Ethernet cable disconnected, so the FEC driver appears to be involved.

 

The issue is present from kernel 4.8-rc1 onward. Kernels 4.7 and older are not affected.

 

We bisected the kernel source and found this commit to be the culprit:

ARM: save and reset the address limit when entering an exception
e6978e4bf181fb3b5f8cb6f71b4fe30fbf1b655c

When we revert this commit on 4.8.17, the issue goes away.

 

What now? What would be the proper fix?

 

What other implications might reverting the commit have?

 

nctest:

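#!/bin/sh
# Usage: nctest PORT

i=0

while true; do
   # Start a listener on the given port in the background
   nc -l -p "$1" > /dev/null &
   # Give the listener 25 ms to start
   usleep 25000
   # Send one line to the listener; stop at the first failure
   echo Test | nc localhost "$1" || break
   i=$((i+1))
done

# Report how many rounds completed before the failure
echo "$i rounds"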

 
