lx2160ardb: kernel reports problems during heavy disk I/O with PCIe/SATA adapters

daniel_klauer
Contributor I

Hello,

Currently I'm trying to do some benchmarking with the LX2160ARDB and SATA SSDs. For this I have two PCIe x4 to 4-port SATA 3.0 adapter cards with Marvell 88SE9230 RAID controllers, plugged into the LX2160ARDB's PCIe slots (J23 and J28). Each card has 4 Samsung 860 EVO SSDs connected, for a total of 8 SSDs. The testing is done with Linux built from Yocto, a btrfs raid0 pool across all 8 SSDs (/dev/sd[a-h], mounted at /mnt/btrfspool) and fio; a rough sketch of the pool setup is shown below.
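
For reference, the raid0 pool is created along these lines (a rough sketch; the exact mkfs.btrfs options, e.g. the metadata profile, may differ from what I actually used):

mkfs.btrfs -f -d raid0 -m raid0 /dev/sd[a-h]   # stripe data (and, in this sketch, metadata) across all 8 SSDs
mount /dev/sda /mnt/btrfspool                  # mounting any member device brings up the whole multi-device pool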

Sometimes, while writing data to the disks with fio, I'm seeing various kernel errors and noticeable delays/hangs that interrupt the writing process for multiple seconds or even minutes. Some examples:

root@lx2160ardb:~# fio --name=x --directory=/mnt/btrfspool --numjobs=1 --size=3T --filesize=1G --nrfiles=3K --openfiles=1 --file_service_type=sequential --rw=write --ioengine=sync --bs=1M --direct=0 --fallocate=posix --zero_buffers=0 --write_bw_log=x
x: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=sync, iodepth=1
fio-3.12-dirty
Starting 1 process
x: Laying out IO files (3072 files / total 3145728MiB)
Jobs: 1 (f=1): [W(1)][0.6%][

[ 1204.410448] mmc1: Timeout waiting for hardware cmd interrupt.
[ 1204.416186] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 1204.422615] mmc1: sdhci: Sys addr:  0x5cc80000 | Version:  0x00002202
[ 1204.429043] mmc1: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000000
[ 1204.435471] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000033
[ 1204.441899] mmc1: sdhci: Present:   0x01f80008 | Host ctl: 0x0000003c
[ 1204.448327] mmc1: sdhci: Power:     0x00000007 | Blk gap:  0x00000000
[ 1204.454755] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000208
[ 1204.461183] mmc1: sdhci: Timeout:   0x0000000e | Int stat: 0x00000001
[ 1204.467610] mmc1: sdhci: Int enab:  0x037f100f | Sig enab: 0x037f100b
[ 1204.474038] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
[ 1204.480465] mmc1: sdhci: Caps:      0x34fa0000 | Caps_1:   0x0000af00
[ 1204.486893] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
[ 1204.493321] mmc1: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0xffffff8d
[ 1204.499748] mmc1: sdhci: Resp[2]:   0x320f5903 | Resp[3]:  0x00000900
[ 1204.506175] mmc1: sdhci: Host ctl2: 0x00000080
[ 1204.510607] mmc1: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x00000000f9c9820c
[ 1204.517729] mmc1: sdhci: ============================================

[...]

[ 3565.222434] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:   
[ 3565.228521] rcu:     (detected by 14, t=5252 jiffies, g=610161, q=2482)
[ 3565.234867] rcu: All QSes seen, last rcu_preempt kthread activity 5250 (4295783547-4295778297), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 3565.247284] kworker/u32:27  R  running task        0  3372      2 0x0000002a
[ 3565.254333] Workqueue: btrfs-endio-write btrfs_endio_write_helper
[ 3565.260414] Call trace:
[ 3565.262851]  dump_backtrace+0x0/0x158
[ 3565.266502]  show_stack+0x14/0x20
[ 3565.269805]  sched_show_task+0x13c/0x168
[ 3565.273716]  rcu_check_callbacks+0x7e0/0x850
[ 3565.277975]  update_process_times+0x2c/0x70
[ 3565.282147]  tick_sched_handle.isra.5+0x3c/0x50
[ 3565.286664]  tick_sched_timer+0x48/0x98
[ 3565.290487]  __hrtimer_run_queues+0x118/0x1a8
[ 3565.294831]  hrtimer_interrupt+0xe4/0x240
[ 3565.298829]  arch_timer_handler_phys+0x2c/0x38
[ 3565.303262]  handle_percpu_devid_irq+0x80/0x138
[ 3565.307779]  generic_handle_irq+0x24/0x38
[ 3565.311776]  __handle_domain_irq+0x60/0xb8
[ 3565.315860]  gic_handle_irq+0x7c/0x178
[ 3565.319596]  el1_irq+0xb0/0x128
[ 3565.322726]  queued_spin_lock_slowpath+0x230/0x2a8
[ 3565.327505]  queued_read_lock_slowpath+0x118/0x120
[ 3565.332285]  _raw_read_lock+0x44/0x48
[ 3565.335935]  btrfs_tree_read_lock+0x40/0x130
[ 3565.340193]  btrfs_search_slot+0x6d4/0x8a0
[ 3565.344278]  btrfs_lookup_csum+0x5c/0x188
[ 3565.348275]  btrfs_csum_file_blocks+0x214/0x580
[ 3565.352794]  add_pending_csums+0x64/0x98
[ 3565.356705]  btrfs_finish_ordered_io+0x2c0/0x810
[ 3565.361309]  finish_ordered_fn+0x10/0x18
[ 3565.365219]  normal_work_helper+0x228/0x240
[ 3565.369390]  btrfs_endio_write_helper+0x10/0x18
[ 3565.373909]  process_one_work+0x1e0/0x318
[ 3565.377907]  worker_thread+0x40/0x428
[ 3565.381557]  kthread+0x124/0x128
[ 3565.384772]  ret_from_fork+0x10/0x18
[ 3565.388337] rcu: rcu_preempt kthread starved for 5250 jiffies! g610161 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=7
[ 3565.398843] rcu: RCU grace-period kthread stack dump:
[ 3565.403881] rcu_preempt     R    0    10      2 0x00000028
[ 3565.409355] Call trace:
[ 3565.411789]  __switch_to+0xa0/0xe0
[ 3565.415179]  __schedule+0x1e0/0x5c0
[ 3565.418655]  schedule+0x38/0xa0
[ 3565.421784]  schedule_timeout+0x198/0x338
[ 3565.425783]  rcu_gp_kthread+0x428/0x800
[ 3565.429607]  kthread+0x124/0x128
[ 3565.432822]  ret_from_fork+0x10/0x18

[...]

[ 4510.522428] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 4510.528509] rcu:     (detected by 15, t=241578 jiffies, g=610161, q=96607)
[ 4510.535111] rcu: All QSes seen, last rcu_preempt kthread activity 241578 (4296019875-4295778297), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 4510.547701] swapper/15      R  running task        0     0      1 0x00000028

[...]

An example from another run:

[ 4261.539921] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 4261.546349] mmc1: sdhci: Sys addr:  0x5c580000 | Version:  0x00002202
[ 4261.552777] mmc1: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000008
[ 4261.559204] mmc1: sdhci: Argument:  0x00010000 | Trn mode: 0x00000033
[ 4261.565631] mmc1: sdhci: Present:   0x01f80008 | Host ctl: 0x0000003c
[ 4261.572059] mmc1: sdhci: Power:     0x00000007 | Blk gap:  0x00000000
[ 4261.578487] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000208
[ 4261.584914] mmc1: sdhci: Timeout:   0x0000000e | Int stat: 0x00000001
[ 4261.591342] mmc1: sdhci: Int enab:  0x037f100f | Sig enab: 0x037f100b
[ 4261.597770] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
[ 4261.604197] mmc1: sdhci: Caps:      0x34fa0000 | Caps_1:   0x0000af00
[ 4261.610625] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
[ 4261.617052] mmc1: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x0000000d
[ 4261.623480] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
[ 4261.629907] mmc1: sdhci: Host ctl2: 0x00000080
[ 4261.634339] mmc1: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x00000000f9c9820c
[ 4261.641459] mmc1: sdhci: ============================================
[ 4261.648429] mmc1: card 0001 removed
[ 4271.982233] ata11.00: exception Emask 0x0 SAct 0xffc00bff SErr 0x0 action 0x6 frozen
[ 4271.989975] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4271.995330] ata11.00: cmd 61/80:00:80:4a:9c/00:00:2f:00:00/40 tag 0 ncq dma 65536 out
[ 4271.995330]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4271.996397] sd 3:0:0:0: [sdd] tag#31 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
[ 4272.010534] ata11.00: status: { DRDY }
[ 4272.010537] ata11.00: failed command: WRITE FPDMA QUEUED

[ 4272.010542] ata11.00: cmd 61/00:08:00:45:9c/02:00:2f:00:00/40 tag 1 ncq dma 262144 out
[ 4272.010542]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

[ 4272.010544] ata11.00: status: { DRDY }

fio: io_u error on file /mnt/btrfspool/x.0.3042: Input/output error: write offset=195035136, buflen=1048576
[ 4272.010546] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.010550] ata11.00: cmd 61/00:10:00:48:9c/01:00:2f:00:00/40 tag 2 ncq dma 131072 out
[ 4272.010550]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010552] ata11.00: status: { DRDY }

[ 4272.010554] ata11.00: failed command: WRITE FPDMA QUEUED

[ 4272.010558] ata11.00: cmd 61/00:18:00:4b:9c/01:00:2f:00:00/40 tag 3 ncq dma 131072 out

fio: io_u error on file /mnt/btrfspool/x.0.3042: Input/output error: write offset=240123904, buflen=1048576
[ 4272.010558]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010559] ata11.00: status: { DRDY }
[ 4272.010561] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.010565] ata11.00: cmd 61/80:20:80:44:9c/00:00:2f:00:00/40 tag 4 ncq dma 65536 out
[ 4272.010565]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010566] ata11.00: status: { DRDY }
[ 4272.010568] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.010573] ata11.00: cmd 61/00:28:00:4c:9c/01:00:2f:00:00/40 tag 5 ncq dma 131072 out
[ 4272.010573]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010574] ata11.00: status: { DRDY }
[ 4272.010579] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.011649] sd 2:0:0:0: [sdc] tag#31 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
[ 4272.011654] sd 2:0:0:0: [sdc] tag#31 CDB: opcode=0x2a 2a 00 2f 9b e8 80 00 00 80 00
[ 4272.011657] print_req_error: I/O error, dev sdc, sector 798746752
[ 4272.011659] sd 1:0:0:0: [sdb] tag#31 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
[ 4272.011664] BTRFS error (device sda): bdev /dev/sdc errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 4272.011665] sd 1:0:0:0: [sdb] tag#31 CDB: opcode=0x2a 2a 00 2f 9b e9 80 00 00 80 00
[ 4272.011669] BTRFS warning (device sda): direct IO failed ino 4323 rw 1,34817 sector 0x2f9be900 len 0 err no 10
[ 4272.011671] print_req_error: I/O error, dev sdb, sector 798747008
[ 4272.011676] BTRFS error (device sda): bdev /dev/sdb errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 4272.011680] BTRFS warning (device sda): direct IO failed ino 4323 rw 1,34817 sector 0x2f9bea00 len 0 err no 10
[ 4272.011705] sd 2:0:0:0: [sdc] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
[ 4272.011708] sd 2:0:0:0: [sdc] tag#30 CDB: opcode=0x2a 2a 00 2f 9b d2 00 00 0a 00 00
[ 4272.011710] print_req_error: I/O error, dev sdc, sector 798740992
[ 4272.011713] BTRFS error (device sda): bdev /dev/sdc errs: wr 2, rd 0, flush 0, corrupt 0, gen 0

Some additional information that might be useful:

  • I started out with the Yocto state corresponding to LSDK-19.03 (as per the yocto_2.7 tag in the qoriq-components/yocto-sdk repo). However, this included the linux-qoriq kernel 4.19.26+gc0c214110624, which crashed at boot in the ahci-qoriq driver's fsl_sata_errata_379364() function. That was fixed by upgrading to kernel 4.19.46+g1a4cab2c597d from LSDK-19.06. (Some other components, such as U-Boot and the management-complex firmware, also had to be upgraded to the LSDK-19.06 state to make the new kernel work.)
  • I did not see such problems during disk I/O with only 4 SSDs connected directly to the board's SATA connectors. There were also no problems with 4 SSDs connected to a single PCIe card, in either slot, as opposed to using 2 PCIe cards.
  • The same problems appeared when using 2 PCIe cards with 2 SSDs each (total 4 SSDs instead of 8).
  • We tried powering half of the SSDs with an external power supply, in case the internal one was too weak for 8 SSDs, but it did not seem to make a difference: errors appeared either way.

Does anyone have an idea what the problem could be? I'm pretty much lost at this point. It would be really nice to get the 8 SSDs to work reliably, but so far, no luck. Probably the way forward is to do more tests to rule out hardware problems with the PCIe cards or SSDs themselves.

But I don't understand why doing disk I/O would cause mmc1 timeouts. Yes, I'm booting from an SD card, but that's mmc0; mmc1 is (as far as I know) the internal eMMC, which isn't being used at all (when mounting the btrfs raid0, only /dev/sd[a-h] are used).
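
For what it's worth, this is roughly how the two hosts can be told apart and checked for use (the card address under mmc1 is just an example and may differ):

cat /sys/class/mmc_host/mmc1/mmc1:*/type   # prints "MMC" for the eMMC, "SD" for an SD card
lsblk /dev/mmcblk*                         # shows whether any mmc block device/partition is mounted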

Thanks in advance for any hints.

mdecandia
Contributor III

Hi all, was this suggestion useful for solving the issue?

I have a similar issue with an LS1046 and an mSATA SSD, where we see sporadic I/O errors on the SATA bus, traced as follows:

---

Sep 27 12:22:01 OTN kernel: [3109812.437650] sd 0:0:0:0: [sda] tag#28 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
Sep 27 12:22:01 OTN kernel: [3109812.437664] sd 0:0:0:0: [sda] tag#28 CDB: opcode=0x2a 2a 00 01 50 a1 cd 00 0a 00 00
Sep 27 12:22:01 OTN kernel: [3109812.437676] blk_update_request: I/O error, dev sda, sector 22061517 op 0x1:(WRITE) flags 0x4000 phys_seg 46 prio class 0

 

....

 

This does not match the SSD's actual status: there are no bad blocks or errors on the disk.
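
For reference, one way to cross-check the drive health (assuming smartmontools is available on the target):

smartctl -H /dev/sda   # overall SMART health self-assessment
smartctl -A /dev/sda   # attributes such as reallocated/pending sectors and UDMA CRC error count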

 

Do you have any suggestions?

 

Thanks,

Michele

yipingwang
NXP TechSupport

Hello Daniel Klauer,

This issue could have many causes; it is likely related to NCQ (Native Command Queuing).

It is more likely to happen when the system is under heavy load.

You can try the following:

  1. Try another disk to check whether the problem is caused by the disk itself. Different disks may behave differently; some disks don't even support NCQ.
  2. Disable NCQ, either at runtime with echo 1 > /sys/block/sda/device/queue_depth (sda may need to be changed for your system), or by passing libata.force=noncq in your bootargs (a sketch of both options follows below).
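
A rough sketch of both options (device names are examples and need to be adapted to your system):

# Runtime: limit the queue depth to 1 on each SATA disk (here sda..sdh)
for d in /sys/block/sd[a-h]/device/queue_depth; do echo 1 > "$d"; done

# Boot time: append the following to the kernel command line (bootargs)
#   libata.force=noncq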

 

BTW:

  1. If NCQ is disabled, performance will be reduced.
  2. Even when such errors occurred, there was no data loss on the disk according to my tests.
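
Since your pool is btrfs, one way to verify that nothing was lost after such errors is a scrub (the mount point below is the one from your fio command):

btrfs scrub start -B /mnt/btrfspool   # -B: stay in the foreground and print a summary when finished
btrfs device stats /mnt/btrfspool     # per-device write/read/flush/corruption error counters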

Thanks,

Yiping
