When the sja1105 SPI transfer times out, the cpu freezes and triggers the watchdog to restart


935 Views
guohangxiang
Contributor III

    Based on previous topic: SJA1105Q cannot forward eth packages form the rgmii interface of the cpu's eth1 

    I use the ts2phc tool to synchronize the sja1105 PHC to another PHC, and run ptp4l on the switch ports:

ts2phc -f /root/ethernet/ts2phc.cfg &

ptp4l -i sw0p0 -i sw0p1 -i sw0p2 -i sw0p3 -f /root/ethernet/automotive-master-5515.cfg --tx_timestamp_timeout 30 &

 

root@zynqmp_ubuntu:~/ethernet# cat ts2phc.cfg 
#
# ts2phc config file to get it to behave like syncd to align
# time stamper to PHC device's 1-PPS signal.
#
# Example:
# ./ts2phc -m -q -f ts2phc.cfg
#
[global]
clock_servo pi
first_step_threshold 0.000000001
step_threshold 0.000000001
logging_level 7
max_frequency 0
verbose 1
use_syslog 0
free_running 0
ts2phc.pulsewidth 990000000
# time stamper, slave device
[/dev/ptp0]
ts2phc.channel 0
ts2phc.extts_polarity both
ts2phc.extts_correction 1050
#[/dev/ptp1]
#ts2phc.channel 0
#ts2phc.extts_polarity both
#ts2phc.extts_correction 1050
[/dev/ptp2]
ts2phc.master 1
ts2phc.channel 0

 

After a few hours of normal operation, the kernel prints an SPI timeout error and then the CPU freezes.

 

35f15f64-3e94-4c02-9ccc-b2f908cd4e5f.jpg

This happens every time, unless I am not running the sja1105-related applications (ts2phc and ptp4l). When I don't run the sja1105 time synchronization applications, the system runs normally for more than 26 hours.

I tried looking for kernel patches and couldn't find a similar patch, so I can't tell where the problem is at the moment.

2 Replies

919 Views
vladimir_oltean
NXP Employee

The problem seems to start in the SPI controller driver (drivers/spi/spi-cadence.c for zynqmp, if I am not mistaken). The error "SPI transfer timed out" is printed from spi_transfer_wait(), when wait_for_completion_timeout(&ctlr->xfer_completion) times out. This will happen when the SPI driver does not call spi_finalize_current_transfer().
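To check which controller driver is actually bound and how often the error appears in the log, something like the following shell sketch may help. It assumes the controller is spi2 (inferred from the spi2.0 device name in this thread) and that sysfs exposes it under /sys/class/spi_master. The function names and the SYSFS override are hypothetical helpers, not part of any existing tool.

```shell
#!/bin/sh
# Print the name of the driver bound to SPI controller $1 (e.g. spi2).
# SYSFS is an illustrative override; it defaults to /sys.
spi_controller_driver() {
    link="${SYSFS:-/sys}/class/spi_master/$1/device/driver"
    [ -L "$link" ] || { echo "no driver link at $link" >&2; return 1; }
    basename "$(readlink "$link")"
}

# Count "SPI transfer timed out" lines in a kernel log read from stdin.
# grep -c exits non-zero when the count is 0, so the status is ignored.
count_spi_timeouts() {
    grep -c 'SPI transfer timed out' || true
}
```

Possible usage on the target: `spi_controller_driver spi2` and `dmesg | count_spi_timeouts`.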

By looking at the code, I do not currently see how this would be possible.

Please try to isolate the problem and remove the sja1105 driver from the equation, see if it is possible to reproduce the SPI transfer timeout using the generic spidev driver (which exports SPI read/write access to user space), and the spidev_test application from the Linux kernel (https://elixir.bootlin.com/linux/latest/source/tools/spi/spidev_test.c).

 

$ echo spi2.0 > /sys/bus/spi/drivers/sja1105/unbind
$ echo spidev > /sys/bus/spi/devices/spi2.0/driver_override
$ echo spi2.0 > /sys/bus/spi/drivers/spidev/bind
$ ./spidev_test --device /dev/spidev2.0 --bpw 8 --size 256 --cpha --iter 10000000 --speed 10000000

 

Using the arguments given to spidev_test, you can change parameters such as buffer size and frequency. Please let me know if the problem reproduces without the sja1105 driver as well.
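As a rough sketch of how such a parameter sweep could be scripted: the loop below builds one spidev_test command per buffer size and clock speed, using only the options already shown above. The size and speed lists, the iteration count, and the DRY_RUN switch are illustrative assumptions, so adjust them to your setup. By default the script only prints the commands.

```shell
#!/bin/sh
# Illustrative defaults (assumptions, not values from the driver or datasheet).
DEVICE="${DEVICE:-/dev/spidev2.0}"
SIZES="${SIZES:-32 256 1024}"
SPEEDS="${SPEEDS:-1000000 10000000}"
ITER="${ITER:-100000}"
DRY_RUN="${DRY_RUN:-1}"

# Print one spidev_test command line for buffer size $1 and speed $2.
build_cmd() {
    printf './spidev_test --device %s --bpw 8 --size %s --cpha --iter %s --speed %s\n' \
        "$DEVICE" "$1" "$ITER" "$2"
}

# Print (DRY_RUN=1) or run (DRY_RUN=0) every size/speed combination,
# stopping at the first command that fails.
sweep() {
    for size in $SIZES; do
        for speed in $SPEEDS; do
            cmd=$(build_cmd "$size" "$speed")
            if [ "$DRY_RUN" = 1 ]; then
                echo "$cmd"
            else
                echo "running: $cmd"
                $cmd || { echo "failed: $cmd" >&2; return 1; }
            fi
        done
    done
}

sweep
```

Run it with `DRY_RUN=0` from the directory containing spidev_test to execute the commands instead of printing them.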

880 Views
guohangxiang
Contributor III

I used the spidev_test parameters you provided to test the two SPI devices:

 

./spidev_test --device /dev/spidev2.0 --bpw 8 --size 256 --cpha --iter 10000000000 --speed 10000000
./spidev_test --device /dev/spidev2.1 --bpw 8 --size 256 --cpha --iter 10000000000 --speed 10000000

 

I ran the tests continuously for 25 hours and did not reproduce the problem.

My guess: could the SPI timeout cause repeated failures to acquire a mutex, putting the CPU into an uninterruptible sleep state?
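One way to look for evidence of this before the freeze would be to periodically list tasks in uninterruptible sleep (state D) together with the kernel function they are waiting in. The sketch below assumes a procps-style `ps` that accepts `-o pid,stat,wchan:32,comm`; the helper name list_d_state is hypothetical, and command names containing spaces would be cut at the first word.

```shell
#!/bin/sh
# Filter `ps` output on stdin (columns: PID STAT WCHAN COMMAND) and print
# "PID WCHAN COMMAND" for every task whose state starts with D.
list_d_state() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3, $4 }'
}
```

Possible usage, logged to a serial console or file while ts2phc and ptp4l are running: `ps -eo pid,stat,wchan:32,comm | list_d_state`.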
