Hello,
Currently I'm trying to do some benchmarking with the LX2160ARDB and SATA SSDs. For this I have two PCIe x4 to 4-port SATA 3.0 cards with Marvell 88SE9230 RAID controllers, plugged into the LX2160ARDB's PCIe slots (J23 and J28). Each card has four Samsung 860 EVO SSDs connected, for a total of eight SSDs. Testing is done with a Linux image built with Yocto, btrfs raid0, and fio.
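For reference, the pool was set up roughly like this. This is a sketch based on the description above, not the exact commands used; the device names and the metadata profile are assumptions (mkfs.btrfs defaults multi-device metadata to raid1 unless told otherwise):

```shell
# Assumed device names /dev/sda..sdh and mount point /mnt/btrfspool.
# Stripe both data and metadata across all eight SSDs:
mkfs.btrfs -f -d raid0 -m raid0 /dev/sd[a-h]

# Mounting any member device brings up the whole pool:
mkdir -p /mnt/btrfspool
mount /dev/sda /mnt/btrfspool

# Sanity check: all eight devices should be listed.
btrfs filesystem show /mnt/btrfspool
```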
Sometimes, while writing data to the disks with fio, I see various kernel errors and noticeable delays/hangs that interrupt the writing process for multiple seconds or even minutes. Some examples:
root@lx2160ardb:~# fio --name=x --directory=/mnt/btrfspool --numjobs=1 --size=3T --filesize=1G --nrfiles=3K --openfiles=1 --file_service_type=sequential --rw=write --ioengine=sync --bs=1M --direct=0 --fallocate=posix --zero_buffers=0 --write_bw_log=x
x: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=sync, iodepth=1
fio-3.12-dirty
Starting 1 process
x: Laying out IO files (3072 files / total 3145728MiB)
Jobs: 1 (f=1): [W(1)][0.6%][...]
[ 1204.410448] mmc1: Timeout waiting for hardware cmd interrupt.
[ 1204.416186] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 1204.422615] mmc1: sdhci: Sys addr: 0x5cc80000 | Version: 0x00002202
[ 1204.429043] mmc1: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000000
[ 1204.435471] mmc1: sdhci: Argument: 0x00010000 | Trn mode: 0x00000033
[ 1204.441899] mmc1: sdhci: Present: 0x01f80008 | Host ctl: 0x0000003c
[ 1204.448327] mmc1: sdhci: Power: 0x00000007 | Blk gap: 0x00000000
[ 1204.454755] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000208
[ 1204.461183] mmc1: sdhci: Timeout: 0x0000000e | Int stat: 0x00000001
[ 1204.467610] mmc1: sdhci: Int enab: 0x037f100f | Sig enab: 0x037f100b
[ 1204.474038] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
[ 1204.480465] mmc1: sdhci: Caps: 0x34fa0000 | Caps_1: 0x0000af00
[ 1204.486893] mmc1: sdhci: Cmd: 0x00000d1a | Max curr: 0x00000000
[ 1204.493321] mmc1: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0xffffff8d
[ 1204.499748] mmc1: sdhci: Resp[2]: 0x320f5903 | Resp[3]: 0x00000900
[ 1204.506175] mmc1: sdhci: Host ctl2: 0x00000080
[ 1204.510607] mmc1: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000f9c9820c
[ 1204.517729] mmc1: sdhci: ============================================[...]
[ 3565.222434] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 3565.228521] rcu: (detected by 14, t=5252 jiffies, g=610161, q=2482)
[ 3565.234867] rcu: All QSes seen, last rcu_preempt kthread activity 5250 (4295783547-4295778297), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 3565.247284] kworker/u32:27 R running task 0 3372 2 0x0000002a
[ 3565.254333] Workqueue: btrfs-endio-write btrfs_endio_write_helper
[ 3565.260414] Call trace:
[ 3565.262851] dump_backtrace+0x0/0x158
[ 3565.266502] show_stack+0x14/0x20
[ 3565.269805] sched_show_task+0x13c/0x168
[ 3565.273716] rcu_check_callbacks+0x7e0/0x850
[ 3565.277975] update_process_times+0x2c/0x70
[ 3565.282147] tick_sched_handle.isra.5+0x3c/0x50
[ 3565.286664] tick_sched_timer+0x48/0x98
[ 3565.290487] __hrtimer_run_queues+0x118/0x1a8
[ 3565.294831] hrtimer_interrupt+0xe4/0x240
[ 3565.298829] arch_timer_handler_phys+0x2c/0x38
[ 3565.303262] handle_percpu_devid_irq+0x80/0x138
[ 3565.307779] generic_handle_irq+0x24/0x38
[ 3565.311776] __handle_domain_irq+0x60/0xb8
[ 3565.315860] gic_handle_irq+0x7c/0x178
[ 3565.319596] el1_irq+0xb0/0x128
[ 3565.322726] queued_spin_lock_slowpath+0x230/0x2a8
[ 3565.327505] queued_read_lock_slowpath+0x118/0x120
[ 3565.332285] _raw_read_lock+0x44/0x48
[ 3565.335935] btrfs_tree_read_lock+0x40/0x130
[ 3565.340193] btrfs_search_slot+0x6d4/0x8a0
[ 3565.344278] btrfs_lookup_csum+0x5c/0x188
[ 3565.348275] btrfs_csum_file_blocks+0x214/0x580
[ 3565.352794] add_pending_csums+0x64/0x98
[ 3565.356705] btrfs_finish_ordered_io+0x2c0/0x810
[ 3565.361309] finish_ordered_fn+0x10/0x18
[ 3565.365219] normal_work_helper+0x228/0x240
[ 3565.369390] btrfs_endio_write_helper+0x10/0x18
[ 3565.373909] process_one_work+0x1e0/0x318
[ 3565.377907] worker_thread+0x40/0x428
[ 3565.381557] kthread+0x124/0x128
[ 3565.384772] ret_from_fork+0x10/0x18
[ 3565.388337] rcu: rcu_preempt kthread starved for 5250 jiffies! g610161 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=7
[ 3565.398843] rcu: RCU grace-period kthread stack dump:
[ 3565.403881] rcu_preempt R 0 10 2 0x00000028
[ 3565.409355] Call trace:
[ 3565.411789] __switch_to+0xa0/0xe0
[ 3565.415179] __schedule+0x1e0/0x5c0
[ 3565.418655] schedule+0x38/0xa0
[ 3565.421784] schedule_timeout+0x198/0x338
[ 3565.425783] rcu_gp_kthread+0x428/0x800
[ 3565.429607] kthread+0x124/0x128
[ 3565.432822] ret_from_fork+0x10/0x18[...]
[ 4510.522428] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 4510.528509] rcu: (detected by 15, t=241578 jiffies, g=610161, q=96607)
[ 4510.535111] rcu: All QSes seen, last rcu_preempt kthread activity 241578 (4296019875-4295778297), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 4510.547701] swapper/15 R running task 0 0 1 0x00000028[...]
An example from another run:
[ 4261.539921] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 4261.546349] mmc1: sdhci: Sys addr: 0x5c580000 | Version: 0x00002202
[ 4261.552777] mmc1: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000008
[ 4261.559204] mmc1: sdhci: Argument: 0x00010000 | Trn mode: 0x00000033
[ 4261.565631] mmc1: sdhci: Present: 0x01f80008 | Host ctl: 0x0000003c
[ 4261.572059] mmc1: sdhci: Power: 0x00000007 | Blk gap: 0x00000000
[ 4261.578487] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000208
[ 4261.584914] mmc1: sdhci: Timeout: 0x0000000e | Int stat: 0x00000001
[ 4261.591342] mmc1: sdhci: Int enab: 0x037f100f | Sig enab: 0x037f100b
[ 4261.597770] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00002202
[ 4261.604197] mmc1: sdhci: Caps: 0x34fa0000 | Caps_1: 0x0000af00
[ 4261.610625] mmc1: sdhci: Cmd: 0x00000d1a | Max curr: 0x00000000
[ 4261.617052] mmc1: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x0000000d
[ 4261.623480] mmc1: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
[ 4261.629907] mmc1: sdhci: Host ctl2: 0x00000080
[ 4261.634339] mmc1: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000f9c9820c
[ 4261.641459] mmc1: sdhci: ============================================
[ 4261.648429] mmc1: card 0001 removed
[ 4271.982233] ata11.00: exception Emask 0x0 SAct 0xffc00bff SErr 0x0 action 0x6 frozen
[ 4271.989975] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4271.995330] ata11.00: cmd 61/80:00:80:4a:9c/00:00:2f:00:00/40 tag 0 ncq dma 65536 out
[ 4271.995330] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4271.996397] sd 3:0:0:0: [sdd] tag#31 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
[ 4272.010534] ata11.00: status: { DRDY }
[ 4272.010537] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.010542] ata11.00: cmd 61/00:08:00:45:9c/02:00:2f:00:00/40 tag 1 ncq dma 262144 out
[ 4272.010542] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010544] ata11.00: status: { DRDY }
fio: io_u error on file /mnt/btrfspool/x.0.3042: Input/output error: write offset=195035136, buflen=1048576
[ 4272.010546] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.010550] ata11.00: cmd 61/00:10:00:48:9c/01:00:2f:00:00/40 tag 2 ncq dma 131072 out
[ 4272.010550] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010552] ata11.00: status: { DRDY }
[ 4272.010554] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.010558] ata11.00: cmd 61/00:18:00:4b:9c/01:00:2f:00:00/40 tag 3 ncq dma 131072 out
fio: io_u error on file /mnt/btrfspool/x.0.3042: Input/output error: write offset=240123904, buflen=1048576
[ 4272.010558] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010559] ata11.00: status: { DRDY }
[ 4272.010561] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.010565] ata11.00: cmd 61/80:20:80:44:9c/00:00:2f:00:00/40 tag 4 ncq dma 65536 out
[ 4272.010565] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010566] ata11.00: status: { DRDY }
[ 4272.010568] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.010573] ata11.00: cmd 61/00:28:00:4c:9c/01:00:2f:00:00/40 tag 5 ncq dma 131072 out
[ 4272.010573] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 4272.010574] ata11.00: status: { DRDY }
[ 4272.010579] ata11.00: failed command: WRITE FPDMA QUEUED
[ 4272.011649] sd 2:0:0:0: [sdc] tag#31 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
[ 4272.011654] sd 2:0:0:0: [sdc] tag#31 CDB: opcode=0x2a 2a 00 2f 9b e8 80 00 00 80 00
[ 4272.011657] print_req_error: I/O error, dev sdc, sector 798746752
[ 4272.011659] sd 1:0:0:0: [sdb] tag#31 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
[ 4272.011664] BTRFS error (device sda): bdev /dev/sdc errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 4272.011665] sd 1:0:0:0: [sdb] tag#31 CDB: opcode=0x2a 2a 00 2f 9b e9 80 00 00 80 00
[ 4272.011669] BTRFS warning (device sda): direct IO failed ino 4323 rw 1,34817 sector 0x2f9be900 len 0 err no 10
[ 4272.011671] print_req_error: I/O error, dev sdb, sector 798747008
[ 4272.011676] BTRFS error (device sda): bdev /dev/sdb errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 4272.011680] BTRFS warning (device sda): direct IO failed ino 4323 rw 1,34817 sector 0x2f9bea00 len 0 err no 10
[ 4272.011705] sd 2:0:0:0: [sdc] tag#30 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
[ 4272.011708] sd 2:0:0:0: [sdc] tag#30 CDB: opcode=0x2a 2a 00 2f 9b d2 00 00 0a 00 00
[ 4272.011710] print_req_error: I/O error, dev sdc, sector 798740992
[ 4272.011713] BTRFS error (device sda): bdev /dev/sdc errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
Some additional information that might be useful:
Does anyone have an idea what the problem could be? I'm pretty much lost at this point. It would be really nice to get the eight SSDs working reliably, but so far, no luck. The way forward is probably to run more tests to rule out hardware problems with the PCIe cards or the SSDs themselves.
But I don't understand why doing disk I/O would cause mmc1 timeouts. Yes, I'm booting from SD card, but that's mmc0; mmc1 is (as far as I know) the internal eMMC, which isn't being used at all (when the btrfs raid0 is mounted, only /dev/sd[a-h] are used).
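One way to double-check the mmc0/mmc1 mapping is to read it from sysfs. This is a generic sketch using the standard mmc core layout; the exact card names reported will depend on the board and kernel:

```shell
# List mmc hosts and the cards attached to them. The eMMC card
# reports type "MMC", while the SD-card slot reports type "SD".
for host in /sys/class/mmc_host/mmc*; do
    echo "== $host =="
    cat "$host"/mmc*/type 2>/dev/null
    cat "$host"/mmc*/name 2>/dev/null
done

# Cross-check which block devices each one provides:
lsblk -o NAME,TYPE,SIZE,MOUNTPOINT
```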
Thanks in advance for any hints.
Hi all, was this suggestion useful in solving the issue?
I have a similar issue with an LS1046 and an mSATA SSD, where we see sporadic I/O errors on the SATA bus, logged as follows:
---
Sep 27 12:22:01 OTN kernel: [3109812.437650] sd 0:0:0:0: [sda] tag#28 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x06
Sep 27 12:22:01 OTN kernel: [3109812.437664] sd 0:0:0:0: [sda] tag#28 CDB: opcode=0x2a 2a 00 01 50 a1 cd 00 0a 00 00
Sep 27 12:22:01 OTN kernel: [3109812.437676] blk_update_request: I/O error, dev sda, sector 22061517 op 0x1:(WRITE) flags 0x4000 phys_seg 46 prio class 0
....
This does not match the SSD's own status: there are no bad blocks or errors reported by the disk.
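For reference, one way to double-check the drive's own error accounting is smartctl from smartmontools (a generic sketch; attribute names vary by drive model):

```shell
# Overall SMART health verdict:
smartctl -H /dev/sda

# The error/defect counters most relevant here:
smartctl -A /dev/sda | grep -i -E 'pending|uncorrect|realloc|crc'

# UDMA_CRC_Error_Count is worth watching: it increments on SATA
# link-level CRC errors (cabling/signal problems), not media errors,
# so it can grow even when the disk surface itself is fine.
```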
Do you have any suggestions?
Thanks,
Michele
Hello Daniel Klauer,
This issue could have many causes; it is likely related to NCQ.
It is more likely to happen when the system is under heavy load.
You can try the following:
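As one illustration of an NCQ-related experiment (a common technique, not necessarily the exact suggestion made here), NCQ can be disabled to see whether the timeouts disappear:

```shell
# Option 1: disable NCQ for all libata ports at boot by adding this
# to the kernel command line (e.g. in the bootargs):
#   libata.force=noncq

# Option 2: at runtime, reduce the queue depth for a single disk.
# A depth of 1 effectively disables NCQ for that device:
echo 1 > /sys/block/sda/device/queue_depth
cat /sys/block/sda/device/queue_depth
```

If the errors stop with NCQ disabled, that points at the controller's or drive's queued-command handling rather than the media.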
BTW:
Thanks,
Yiping