Reduce core workload with the DMA module during SPI communication

Customer requirement: A K60 100 MHz device acts as SPI master and reads from an external SPI Flash device (when its data is ready, the SPI Flash device triggers a K60 GPIO interrupt). Within 130~150 us the K60 must read data from 9 discontinuous addresses, 4 bytes per address, at an SPI baud rate of 5 MHz. Each read requires the master to send a read command (1 byte) followed by the address (2 bytes). In other words, the K60 sends 7 bytes per read (1-byte read command + 2-byte address + 4 dummy bytes to clock the data out), 9 times. The core workload must also be reduced.

Implementation: Use the DMA module. DMA CH1 loads the SPI TX data (its trigger source is the SPI TFFF flag: the TX FIFO can be filled) and DMA CH0 stores the SPI RX data (its trigger source is the SPI RFDF flag: the RX FIFO is not empty). The DMA engine starts the SPI transfers automatically, so the core intervenes less during the SPI transfer and its workload drops. The SPI module itself works in interrupt mode.
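The request routing described above can be sketched as the value written to the DSPI request select/enable register (SPIx_RSER). This is only a sketch: the bit positions are written out by hand and should be checked against the K60 reference manual or replaced by the masks from the device header, the helper name dspi_rser_value is hypothetical, and it assumes the SPI interrupt mentioned in this post is the end-of-queue (EOQ) interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SPIx_RSER bits, written out by hand for illustration (assumption:
 * verify against the reference manual or the device header). */
#define RSER_EOQF_RE   (1u << 28) /* end-of-queue request enable        */
#define RSER_TFFF_RE   (1u << 25) /* TX FIFO fill request enable        */
#define RSER_TFFF_DIRS (1u << 24) /* 1: TFFF raises a DMA request       */
#define RSER_RFDF_RE   (1u << 17) /* RX FIFO drain request enable       */
#define RSER_RFDF_DIRS (1u << 16) /* 1: RFDF raises a DMA request       */

/* Hypothetical helper: RSER value for the two TX methods in this post.
 * RX is always drained by DMA; EOQ is left as a core interrupt. */
static uint32_t dspi_rser_value(bool tx_by_dma)
{
    uint32_t v = RSER_EOQF_RE | RSER_RFDF_RE | RSER_RFDF_DIRS;

    if (tx_by_dma) {
        v |= RSER_TFFF_RE | RSER_TFFF_DIRS;
    }

    /* A DIRS bit without its request enable would be a configuration slip. */
    assert(!(v & RSER_TFFF_DIRS) || (v & RSER_TFFF_RE));
    return v;
}
```

On the target, the returned value would be written to the RSER register of SPI2, with the two DMA channels routed to the SPI2 TX and RX request sources in the DMAMUX.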

Test platform: TWR-K60D100M, TWR-MEM, IAR ARM Workbench V6.60

The TWR-MEM board carries an AT26DF081A SPI Flash device, which the TWR-K60D100M accesses through its SPI2 module.

Test scenario 1: Read device ID

The AT26DF081A provides a read device ID command (0x9F) that returns 4 bytes of device ID information (0x1F, 0x45, 0x01, 0x00). The K60, as SPI master, sends the read ID command and then performs 4 dummy writes to clock the device ID out. In this test the SPI TX/RX frame size is set to 1 byte (8 bits). The DSPI module has 4-entry TX and RX FIFOs, and two ways of sending the SPI data are tested: in one, DMA CH1 sends the read ID command and the 4 dummy writes; in the other, software (which needs core intervention) sends them. SPI RX uses DMA CH0 in both cases. To show whether using the DMA module reduces the core workload, the main while loop toggles a GPIO pin (PTD7) continuously while the DSPI communication runs.
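As an illustration of what the TX side has to feed the DSPI for one read ID operation, here is a minimal sketch that builds the five 32-bit command words a DMA channel (or software) would write to the PUSHR register. The PUSHR field positions are written out by hand and should be checked against the reference manual; the chip-select mask, the dummy value 0x00 and the helper name build_read_id_queue are assumptions, not taken from the original test code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* DSPI PUSHR fields, written out by hand (assumption: verify against the
 * reference manual or use the device header masks). */
#define PUSHR_CONT      (1u << 31) /* keep chip select asserted after frame */
#define PUSHR_EOQ       (1u << 27) /* marks the last entry of the queue     */
#define PUSHR_PCS(mask) (((uint32_t)(mask) & 0x3Fu) << 16)

#define READ_ID_CMD   0x9Fu
#define READ_ID_WORDS 5u /* 1 command byte + 4 dummy bytes */

/* Hypothetical helper: fill the queue with one read ID transaction.
 * pcs_mask selects the chip-select line; which line the TWR-MEM flash
 * uses is not stated in this post. */
static void build_read_id_queue(uint32_t queue[READ_ID_WORDS], uint8_t pcs_mask)
{
    assert(queue != NULL);

    for (size_t i = 0; i < READ_ID_WORDS; i++) {
        uint32_t word = PUSHR_PCS(pcs_mask) | (i == 0 ? READ_ID_CMD : 0x00u);

        if (i + 1 < READ_ID_WORDS) {
            word |= PUSHR_CONT; /* hold CS until the last byte */
        } else {
            word |= PUSHR_EOQ;  /* last byte: CS is released, EOQ is flagged */
        }
        queue[i] = word;
    }
}
```

Because the queue holds five words and the TX FIFO only four entries, the fifth word can only be written once a FIFO slot has become free, which is what the TFFF request signals.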

device ID info.jpg

Test flow chart (Way 1: DMA CH1 does the SPI TX):

Test result:

One read ID operation takes 12.96 us, of which the core spends (2.56 + 2.72) = 5.28 us handling interrupts.

The customer's frame (send 3 bytes, then receive 4 bytes) is 2 bytes longer than the 5-byte read ID transaction, which adds 3.2 us at 5 MHz. With this method, the core workload ratio during SPI communication is therefore 5.28 / 16.16 = 32.7%.

Test flow chart (Way 2: software does the SPI TX):

Test result:

One read ID operation takes 11.6 us, of which the core spends (2.48 + 1.40) = 3.88 us handling interrupts.

The customer's frame (send 3 bytes, then receive 4 bytes) is 2 bytes longer than the 5-byte read ID transaction, which adds 3.2 us at 5 MHz. With this method, the core workload ratio during SPI communication is therefore 3.88 / 14.80 = 26.2%.
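The two workload ratios for scenario 1 can be reproduced with a few lines of arithmetic. The sketch below assumes that the denominators 16.16 us and 14.80 us are the measured 5-byte transaction times plus two extra byte times at 5 MHz (this matches the numbers exactly but is inferred, not stated), and that the interrupt time does not change with the frame length. The function names are illustrative.

```c
#include <assert.h>

#define SPI_BAUD_HZ 5000000.0

/* Time on the wire for n bytes of 8 bits each, in microseconds. */
static double byte_time_us(unsigned n_bytes)
{
    return (double)n_bytes * 8.0 * 1e6 / SPI_BAUD_HZ;
}

/* Core workload in percent after scaling a measured transaction of
 * measured_bytes to a frame of target_bytes. */
static double scaled_core_load_percent(double measured_total_us,
                                       double isr_us,
                                       unsigned measured_bytes,
                                       unsigned target_bytes)
{
    assert(target_bytes >= measured_bytes);

    double total_us = measured_total_us
                    + byte_time_us(target_bytes - measured_bytes);
    return 100.0 * isr_us / total_us;
}
```

Calling scaled_core_load_percent(12.96, 5.28, 5, 7) gives the Way 1 figure and scaled_core_load_percent(11.6, 3.88, 5, 7) the Way 2 figure.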

Test scenario 2: Read data from 9 discontinuous addresses of the AT26DF081A

The AT26DF081A provides a read array command (0x0B) that reads out a continuous stream of bytes. Following the customer requirement, the test reads data from 9 discontinuous addresses, 4 bytes per address. According to the AT26DF081A datasheet, the read array command must be followed by 3 address bytes and 1 dummy byte, after which the device returns data. So to read 4 bytes of data, the K60 as SPI master sends 9 bytes (1-byte read array command + 3-byte address + 1 dummy byte + 4 dummy bytes to clock the data out). The test uses two DMA channels for SPI TX and RX. The two channels work alternately, and DMA CH0 (SPI RX) has higher priority than DMA CH1 (SPI TX). After the data from all 9 addresses has been received, the SPI interrupt is entered; it clears the EOQ flag and refreshes the DMA CH0/CH1 settings for the next round of 9 reads. To show whether using the DMA module reduces the core workload, the main while loop toggles a GPIO pin (PTD7) continuously while the DSPI communication runs.
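The 9-byte frame described above can be sketched as follows. The helper name build_read_array_frame and the dummy value 0x00 are assumptions; the byte order (command, then address high byte first, then one dummy byte) follows the description in this post.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define READ_ARRAY_CMD   0x0Bu
#define READ_DATA_BYTES  4u
/* 1 command + 3 address + 1 dummy + 4 dummy bytes that clock the data out */
#define READ_FRAME_BYTES (1u + 3u + 1u + READ_DATA_BYTES)

/* Hypothetical helper: fill one TX frame for a 4-byte read at addr. */
static void build_read_array_frame(uint32_t addr, uint8_t frame[READ_FRAME_BYTES])
{
    assert(addr <= 0xFFFFFFu);           /* the address field is 24 bits   */

    memset(frame, 0x00, READ_FRAME_BYTES); /* dummy bytes default to 0x00  */
    frame[0] = READ_ARRAY_CMD;
    frame[1] = (uint8_t)(addr >> 16);    /* address, most significant first */
    frame[2] = (uint8_t)(addr >> 8);
    frame[3] = (uint8_t)addr;
}
```

On the RX side the first 5 received bytes of each frame carry no data; the 4 data bytes are the last 4 bytes received. Nine such frames, one per address, make up one round.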

read array timing.jpg

Test flow chart:

Test result:

Reading the data at the 9 discontinuous addresses of the AT26DF081A takes 132.32 us, of which the core spends 2.6 us handling interrupts.

The customer's frame (send 3 bytes, then receive 4 bytes, repeated 9 times) is 2 bytes shorter per read than the 9-byte frame used in this test, which saves 28.8 us in total at 5 MHz. With this method, the core workload ratio during SPI communication is therefore 2.6 / 103.52 = 2.5%.

The DMA module provides a dynamic scatter/gather feature, which automatically loads a new transfer control descriptor (TCD) into a DMA channel. When SPI transfers have to be run many times in succession, using this feature reduces the core workload further.
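To illustrate how scatter/gather descriptors hang together, here is a sketch of an in-memory TCD chain in which each descriptor points to the next one and only the last raises an interrupt. The struct mirrors the eDMA TCD register layout and the CSR bit positions as I understand them from the reference manual; treat both as assumptions to verify. How the original test code chains its descriptors is not shown in this post, so link_tcd_chain is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory image of one eDMA TCD (32 bytes, little-endian layout). */
typedef struct {
    uint32_t saddr;     /* source address                         */
    int16_t  soff;      /* source offset per minor-loop access    */
    uint16_t attr;      /* transfer sizes and modulo              */
    uint32_t nbytes;    /* bytes per minor loop                   */
    int32_t  slast;     /* source adjustment after major loop     */
    uint32_t daddr;     /* destination address                    */
    int16_t  doff;      /* destination offset                     */
    uint16_t citer;     /* current major iteration count          */
    uint32_t dlast_sga; /* next TCD address when ESG is set       */
    uint16_t csr;       /* control and status                     */
    uint16_t biter;     /* beginning major iteration count        */
} tcd_t;

#define CSR_INTMAJOR (1u << 1) /* interrupt when the major loop completes */
#define CSR_DREQ     (1u << 3) /* disable the request when done           */
#define CSR_ESG      (1u << 4) /* enable scatter/gather                   */

/* Hypothetical helper: set only the link fields of n descriptors.
 * On the target the array must be 32-byte aligned; on a 64-bit host the
 * cast below truncates the pointer, so this runs only as a model there. */
static void link_tcd_chain(tcd_t *chain, size_t n)
{
    assert(chain != NULL && n > 0);

    for (size_t i = 0; i + 1 < n; i++) {
        chain[i].dlast_sga = (uint32_t)(uintptr_t)&chain[i + 1];
        chain[i].csr = (uint16_t)((chain[i].csr | CSR_ESG) & ~CSR_INTMAJOR);
    }
    /* The last descriptor ends the round and notifies the core once. */
    chain[n - 1].csr = (uint16_t)((chain[n - 1].csr & ~CSR_ESG)
                                  | CSR_INTMAJOR | CSR_DREQ);
}
```

With a chain like this the core is interrupted once per round instead of reloading the channel itself, which is consistent with the lower interrupt time reported for the scatter/gather test.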

Test result (using the DMA dynamic scatter/gather feature):

Reading the data at the 9 discontinuous addresses of the AT26DF081A takes 130.68 us, of which the core spends 0.76 us handling interrupts.

The customer's frame (send 3 bytes, then receive 4 bytes, repeated 9 times) is 2 bytes shorter per read than the 9-byte frame used in this test, which saves 28.8 us in total at 5 MHz. With this method, the core workload ratio during SPI communication is therefore 0.76 / 101.88 = 0.75%.

Test conclusion

How the DMA module is used during SPI communication makes a clear difference to how much the core workload drops. In general, when SPI transmits and receives large amounts of data, using the DMA module reduces the core workload effectively. For this customer's requirement, the method of test scenario 2 lowers the core workload substantially.

Why must the SPI chip select be deasserted between read operations?

According to the AT26DF081A datasheet, deasserting the CS pin terminates the current read operation and puts the SO pin into a high-impedance state. This applies to both the read ID command and the read array command. In other words, before a new read operation can start, the previous one has to be ended by deasserting CS.

For the customer requirement, the interval taken by one round of read commands is the actual SPI communication time (sending 7 bytes 9 times at a 5 MHz baud rate takes 100.8 us) plus the time the core spends handling interrupts.
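The 100.8 us figure is simply the number of bits on the wire divided by the baud rate. A minimal sketch of that calculation (the function name is illustrative, and chip-select setup and hold delays between frames are ignored):

```c
#include <assert.h>
#include <stdint.h>

/* Pure wire time in nanoseconds for `reads` frames of `bytes_per_read`
 * 8-bit bytes at `baud_hz`. Delays between frames are not included. */
static uint64_t spi_wire_time_ns(uint32_t bytes_per_read, uint32_t reads,
                                 uint32_t baud_hz)
{
    assert(baud_hz > 0);

    uint64_t bits = (uint64_t)bytes_per_read * 8u * reads;
    return bits * 1000000000ull / baud_hz;
}
```

For 7 bytes, 9 reads and 5 MHz this is 504 bits at 200 ns each, i.e. 100800 ns, which leaves roughly 29 us of the 130 us lower bound for chip-select gaps and interrupt handling.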

Test source code

The test source code is based on the [spi_demo] project in KINETIS512_V2_SC (the Kinetis 100MHz Rev2 example projects). Replace the original <spi_demo.c> and <isr.h> files with the test code.

Kinetis K Series MCUs