QorIQ Community Articles


NXP Employee

SPDK on Layerscape

SPDK (Storage Performance Development Kit) is an optimized storage reference architecture initiated and developed by Intel.

SPDK provides a set of tools and libraries for writing high performance, scalable, user-mode storage applications. It achieves high performance by moving all of the necessary drivers into userspace and operating in a polled mode, like DPDK.


Background

  • Hard drive latency is dropping dramatically: HDD (SAS/SATA) ~10 ms → SSD (SATA) ~0.1 ms → SSD (NVMe) ~0.075 ms
  • Bus bandwidth and command queue depth are increasing: SAS/SATA 6 Gbps, 32 commands/queue → NVMe 24 Gbps, 64K commands/queue
  • Network bandwidth is increasing: 1 Gbps → 10 Gbps → 40 Gbps → 100 Gbps

All these changes make software latency the major contributor to overall storage latency today.


Architecture and subcomponents

Two key changes allow SPDK to reduce the latency caused by the software stack:

  • Poll-mode drivers: the application submits a read or write request and then goes off to do other work, checking back at some interval to see whether the I/O has completed. This avoids the latency and overhead of interrupts and lets the application improve I/O efficiency.
  • User-space data processing: avoiding kernel context switches and interrupts saves a significant amount of processing overhead, allowing more cycles to be spent actually storing the data.

Following is the software stack of SPDK:

[Figure: SPDK software stack]


Subcomponents

NVMe Driver

   lib/nvme

Provides direct, zero-copy data transfer to and from NVMe SSDs. It controls NVMe devices by directly mapping the PCI BAR into the local process and performing MMIO. I/O is submitted asynchronously via queue pairs.

NVMe over Fabrics Target

   lib/nvmf

User space application that presents block devices over the network using RDMA. It requires an RDMA-capable NIC with its corresponding OFED software package installed to run. 

iSCSI Target

   lib/iscsi

Implementation of the established specification for block traffic over Ethernet. Current version uses the kernel TCP/IP stack by default.

Block Device Abstraction Layer

   lib/bdev

This generic block device abstraction is the glue that connects the storage protocols to the various device drivers and block devices. It also provides flexible APIs for additional customer functionality (RAID, compression, deduplication, and so on) in the block layer.

It defines:

  • a driver module API for implementing bdev drivers
  • an application API for enumerating and claiming SPDK block devices and performing operations on them
  • bdev drivers for NVMe, malloc (ramdisk), Linux AIO and Ceph RBD

Blobstore

   lib/blob

A persistent, power-fail safe block allocator designed to be used as the local storage system backing a higher level storage service, typically in lieu of a traditional filesystem.

This is a virtual device that VMs or databases could interact with.

BlobFS

   lib/blobfs

Adds basic filesystem functionality like filenames on top of the blobstore.

vhost

   lib/vhost

It extends SPDK to present virtio storage controllers to QEMU-based VMs and process I/O submitted to devices attached to those controllers.

Event framework

   lib/event

A framework for writing asynchronous, polled-mode, shared-nothing server applications.

The event framework is intended to be optional; most other SPDK components are designed to be integrated into an application without specifically depending on the SPDK event library. The framework defines several concepts - reactors, events, and pollers.


Build and Test

General build and getting-started guides are available in the upstream SPDK documentation.

SPDK build/deployment is tested on LS2088.


Environment Setup

SW

  • OS: Ubuntu 18.04.2 LTS
  • SPDK: 43727fb7e5c@master branch
  • DPDK: 18.11

HW

  • LS2088A-RDB platform
  • INTEL SSDPED1D280GA NVMe SSD card with firmware version of E2010325


Build

DPDK

    # git clone git://dpdk.org/dpdk
    # export RTE_TARGET=arm64-dpaa2-linuxapp-gcc
    # export RTE_SDK=/code/dpdk
    # make T=arm64-dpaa2-linuxapp-gcc CONFIG_RTE_KNI_KMOD=n CONFIG_RTE_LIBRTE_PPFE_PMD=n CONFIG_RTE_EAL_IGB_UIO=n install -j 4

SPDK

    # git clone https://github.com/spdk/spdk
    # cd spdk
    # sudo ./scripts/pkgdep.sh
    # ./configure --with-dpdk=/code/dpdk/arm64-dpaa2-linuxapp-gcc
    # make -j8


Deploy

Check NVMe status

    # sudo lspci -vn | sed -n '/NVM Express/,/^$/p'

You should see output like the following:

[Screenshot: lspci output showing the NVMe controller]

Deploy SPDK

UIO

   # modprobe uio
   # modprobe uio_pci_generic
   # echo -n "8086 2700 8086 3900" > /sys/bus/pci/drivers/uio_pci_generic/new_id
   # echo -n "0000:01:00.0" > /sys/bus/pci/drivers/nvme/unbind
   # echo -n "0000:01:00.0" > /sys/bus/pci/drivers/uio_pci_generic/bind

VFIO

   # modprobe vfio-pci

   # cd <SPDK_ROOT_DIR>

   # ./scripts/setup.sh
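
To confirm what setup.sh did (hugepage reservation and rebinding of the NVMe device to a user-space driver such as vfio-pci or uio_pci_generic), you can query its status:

   # ./scripts/setup.sh status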


Test

   # sudo ./examples/nvme/identify/identify

This app should print detailed information about the attached NVMe devices.

   # sudo ./examples/nvme/perf/perf -q 128 -s 4096 -w write -t 60 -c 0xFF -o 2048 -r 'trtype:PCIe traddr:0000:01:00.0'
This will give SPDK performance data.
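
The command above measures 2 KB sequential writes (-w write, -o 2048). The table below was presumably collected by varying -w and -o; as an illustration only (not necessarily the exact commands used), a 4 KB read run would look like:

   # sudo ./examples/nvme/perf/perf -q 128 -s 4096 -w read -t 60 -c 0xFF -o 4096 -r 'trtype:PCIe traddr:0000:01:00.0'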
With the HW/SW settings described above, the following throughput is achieved (in MB/s):

 

           512B     2K      4K      8K
Read        286    1082    1120    1461
Write       117     458    1445    1137

 

Benchmark

FIO


Build FIO with SPDK

   # git clone https://github.com/axboe/fio --branch fio-3.3
   # cd fio
   # make


Build SPDK with FIO plugin support

   # cd spdk
   # ./configure --with-fio=<path-to-fio-src> --enable-debug
   # make DPDK_CONFIG=arm64-armv8a-linuxapp-gcc


Run FIO

   # cd fio
   # LD_PRELOAD=../spdk/examples/nvme/fio_plugin/fio_plugin ./fio --name=nvme --numjobs=1 \
       --filename="trtype=PCIe traddr=0000.01.00.0 ns=1" --bs=4K --iodepth=1 \
       --ioengine=../spdk/examples/nvme/fio_plugin/fio_plugin --direct=1 --sync=0 --norandommap --group_reporting \
       --size=10% --runtime=3 --rwmixwrite=30 --thread=1 --rw=r

NXP Employee

  • Disable hw_prefetch (u-boot):

setenv hwconfig 'fsl_ddr:bank_intlv=auto;core_prefetch:disable=0xFE'

qixis reset altbank (resets the board; if you are using bank 0, run 'qixis reset' instead)

 

  • bootargs or othbootargs - add the parameters below to bootargs (u-boot). Make sure you see the same in 'cat /proc/cmdline' once the kernel is booted (a quick verification sketch follows this list):

- use 1G hugepages:

default_hugepagesz=1024m hugepagesz=1024m hugepages=6 (or any number)

- isolate CPUs for user space (the CPUs running DPDK, without kernel interference):

isolcpus=1-7

- make sure there are no RCU stalls and no watchdog prints:

nmi_watchdog=0 rcupdate.rcu_cpu_stall_suppress=1
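
Once the kernel is up, a quick way to verify that these settings took effect (a sketch; paths may vary slightly by kernel version):

cat /proc/cmdline
grep Huge /proc/meminfo
cat /sys/devices/system/cpu/isolated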

 

  • Run enable performance script (kernel) – this will enable running all DPDK applications at RT priorities.

source /usr/local/dpdk/enable_performance_script.sh

(please make sure that you are not using core 0 in the DPDK coremask/lcores, i.e. the core that is also running the Linux OS services)

 

  • In case you are also using some of the DPAA2 interfaces with the kernel, affine all the DPIO portal interrupts to core 0 so that no interrupts interfere with the user-space threads (kernel).

cat /proc/interrupts (search for dpio interrupts and their corresponding IRQ numbers)

echo 0x1 > /proc/irq/<irq number>/smp_affinity (directs interrupt handling for that DPIO to core 0)

Run the above command for all the DPIO portals; a short loop that does this is sketched below.
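
A sketch of such a loop, assuming the DPIO portal lines in /proc/interrupts contain the string 'dpio':

for irq in $(grep dpio /proc/interrupts | awk -F: '{print $1}'); do echo 0x1 > /proc/irq/$irq/smp_affinity; done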

  • To achieve higher performance on a single interface, use multiple Rx queues with packet distribution enabled across cores.

e.g. for running testpmd in multiqueue mode, use the CLI option '--rxq=<x>' to create x Rx queues. For 2 queues, use --rxq=2:

./testpmd -c 0x3 -n 1 -- -i --nb-cores=1 --portmask=0x10 --port-topology=chained --rxq=2

  Note 1: the default l2fwd example application does not support multiple queues with packet distribution.

  Note 2: with multiple queues, use an adequate number of flows per port (e.g. 1K flows per port) so that flows distribute evenly across the cores.

NXP Employee

How to U-Boot...  I thought I would write this up, as many developers using Layerscape, QorIQ and Qonverge devices will start with a boot loader as the first access to their own newly minted hardware. There are two paths here. The first is to get our SDK, find the U-Boot source in it, and modify it as needed. This is time consuming, as you need to build an image to have Yocto pull the source code, and you need to jump through some hoops to rebuild with Yocto after making your own custom U-Boot.

The second is to go straight to the git repo, pull it, and build with the cross-compiler toolchain that seems most appropriate. This is generally easier... To do this:

Step 1. Install your tools! 

Cross-compiler

For example when compiling for ARM: 

Go to the Linaro releases page below, find a specific version, then download and extract it:

https://releases.linaro.org/components/toolchain/binaries/latest/ 

or:

$ sudo apt-get install libc6-armel-cross libc6-dev-armel-cross binutils-arm-linux-gnueabi libncurses5-dev
$ sudo apt-get install gcc-arm-linux-gnueabi

For example when compiling for Power Architecture: 

$ sudo apt-get install gcc-4.8-powerpc-linux-gnu g++-4.8-powerpc-linux-gnu binutils-4.8-powerpc-linux-gnu

Device Tree Compiler (U-Boot builds also use this)
$ sudo apt-get install device-tree-compiler

Or
$ git clone git://git.kernel.org/pub/scm/utils/dtc/dtc.git
$ cd dtc
$ make

Step 2. Pull the latest U-Boot source from the repo:

$  git clone -b master git://git.denx.de/u-boot.git

Step 3. Set up your build environment

Add the DTC tools to your path (the path below shows where they are in a Yocto install; yours will be different):

$ export PATH=$PATH:/home/michelle/Work/QorIQ-SDK-V2.0-20160527-yocto/build_ls2080ardb/tmp/sysroots/x86_64-linux/usr/bin/

Add the cross compiler location to your path (below shows a typical path, your path will depend where you install the tools):

$ export PATH=$PATH:/opt/fsl/gcc-linaro-4.9-2016.02-x86_64_aarch64-linux-gnu/bin
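
A quick sanity check that both tools are now found on the PATH (version numbers will differ on your system):

$ which aarch64-linux-gnu-gcc && aarch64-linux-gnu-gcc --version
$ which dtc && dtc --version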

Step 4. Customize your U-Boot source and configuration as needed

Set a custom configuration, customize DDR settings, add or remove board peripherals (if there is interest I can post a step by step for these items also)

Step 5. Start building

Set your configuration:
$ make CROSS_COMPILE=/opt/fsl/gcc-linaro-4.9-2016.02-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu- ls2080aqds_qspi_defconfig

(Note that in this case I am picking up a default U-Boot configuration for QSPI boot on an LS208x; the check below shows how to list the other available defconfigs.)
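
If you are not sure which defconfig matches your board and boot source, you can list the candidates shipped in the U-Boot configs/ directory (a sketch for LS208x boards):

$ ls configs/ | grep -i ls208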

Build u-boot


$ make CROSS_COMPILE=/opt/fsl/gcc-linaro-4.9-2016.02-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu- -j8

(Note that here I am building the configuration; -j8 runs up to 8 parallel compile tasks to speed up the build on multicore hosts.)

The output should include u-boot.bin and the u-boot ELF, along with other image files (a quick check is shown below).
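
A quick check that the main images were produced (a sketch; the exact file list depends on the configuration):

$ ls -l u-boot u-boot.bin u-boot.map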

I hope this is helpful to someone! 
