i.MX RT Crossover MCUs Knowledge Base

Discussions

i.MX RT6xx The RT6xx is a crossover MCU family is a breakthrough product combining the best of MCU and DSP functionality for ultra-low power secure Machine Learning (ML) / Artificial Intelligence (AI) edge processing, performance-intensive far-field voice and immersive 3D audio playback applications. Fig 1 is the block diagram for the i.MX RT600. It consists of a Cortex-M33 core that runs up to 300 MHz with 32KB FlexSPI cache and an optional HiFi4 DSP that runs up to 600MHz with 96KB DSP cache and 128KB DSP TCM. It also contains a cryptography engine and DSP/Math accelerator in the PowerQuad co-processor. The device has 4.5MB on-chip SRAM. Key features include the rich audio peripherals, the high-speed USB with PHY and the advanced on-chip security. There is a Flexcomm peripheral that supports the configuration of numerous UARTs, SPI, I2C, I2S, etc. Fig 1 Create a eIQ (TensorFlow Lite library) demo In the latest version of SDK for the i.MX RT600, it still doesn't contain the demos about the Machine Learning (ML) / Artificial Intelligence (AI), so it needs the developers to create this kind of demo by themself. To implement it, port the eIQ demos cross from i.MX RT1050/1060 to i.MX RT685 is the quickest way. The below presents the steps of creating a eIQ (TensorFlow Lite library) demo. Greate a new C++ project Install SDK library Fig 2 Create a new C++ project using installed SDK Part In the MCUXpresso IDE User Guide, Chapter 5 Creating New Projects using installed SDK Part Support presents how to create a new project, please refer to it for details Porting tensorflow-lite Copy the tensorflow-lite library to the target project Copy the TensorFlow-lite library corresponding files to the target project Fig 3 Add the paths for the above files Fig 4 Fig 5 Fig 6 Porting main code The main() code is from the post: The “Hello World” of TensorFlow Lite Testing On the MIMXRT685 EVK Board (Fig 7), we record the input data: x_value and the inferenced output data: y_value via the Serial Port (Fig 8). Fig 7 Fig 8 In addition, we use Excel to display the received data against our actual values as the below figure shows. Fig 9 In general, In general, it has replicated the result of the The “Hello World” of TensorFlow Lite Troubleshoot In default, the created project doesn't support print float, so it needs to enable this feature by adding below symbols (Fig 10). Fig 10 When a neural network is executed, the results of one layer are fed into subsequent operations and so must be kept around for some time. The lifetimes of these activation layers vary depending on their position in the graph, and the memory size needed for each is controlled by the shape of the array that a layer writes out. These variations mean that it’s necessary to calculate a plan over time to fit all these temporary buffers into as small an area of memory as possible. Currently, this is done when the model is first loaded by the interpreter, so if the area is not big enough, you’ll see a crash event happen. Regard to this application demo, the default heap size is 4 KB, obviously, it's not big enough to store the model’s input, output, and intermediate tensors, as the codes will be stuck at hard-fault interrupt function (Fig 11). Fig 11 So, how large should we allocate the heap area? That’s a good question. Unfortunately, there’s not a simple answer. Different model architectures have different sizes and numbers of input, output, and intermediate tensors, so it’s difficult to know how much memory we’ll need. The number doesn’t need to be exact—we can reserve more memory than we need—but since microcontrollers have limited RAM, we should keep it as small as possible so there’s space for the rest of our program. We can do this through trial and error. For this application demo, the code works well after increasing ten times than the previous heap size (Fig 12). Fig 12

RT600 MCUXpresso JLINK debug QSPI flash 1 Introduction MIMXRT600-EVK is the NXP official board, which onboard flash is the external octal flash, the octal flash is connected to the RT685 flexSPI portB. In practical usage, the customer board may use other flash types, eg QSPI flash, and connect to the FlexSPI A port. Recently, nxp published one RT600 customer flash application note: https://www.nxp.com/docs/en/application-note/AN13386.pdf This document mainly gives the CMSIS DAP related flash algorithm usage, which modifies the option data to generate the new flash algo for the different flash types. Some customer’s own board may use the RT600 QSPI flash+MCUXPresso+JLINK to debug the application code. Recently, one of the customers find on his own customer board, when they use debugger JLINK associated with the MCUXPresso download code to the RT600 QSPI flash, they meet download issues, but when using the CMSIS DAP as a debugger and the related QSPI cfx file, they can download OK. So this document mainly gives the experience of how to use the RT600, MCUXpresso IDE, and JLINK to download and debug the code which is located in the external QSPI flash. 2 JLINK driver prepare and test MCUXpresso IDE use the JLINK download, it will call the JLINK driver related script and the flash algorithm, but to RT600, the JLINK driver will use the RT600 EVK flexSPI port B octal flash in default, so, if the customer board changes to other flexSPI port and to QSPI flash, they need to provide the related QSPI flash algorithm and script file, otherwise, even they can find the ARM CM33 core, the download will be still failed. If customers want to use the MCUXpresso IDE and the JLINK, they need to make sure the JLINK driver attached tool can do the external flash operation, eg, erase, read, write successfully at first. Now, give the JLINK driver related tool how to add the RT600 QSPI flash driver and script file. 2.1 JLINK driver install Download the Segger JLINK driver from the following link: https://www.segger.com/downloads/jlink/JLink_Windows_V754b_x86_64.exe This document will use the jlink v7.54b to test, other version is similar. Install the driver, the default driver install path is: C:\Program Files\SEGGER 2.2 Universal flashloader RT-UFL RT-UFL v1.0 is a universal flashloader, which uses one .FLM file for all i.MXRT chips, and the different external flash, it is mainly used for the Segger JLINK debugger. RT-UFL v1.0 downoad link: https://github.com/JayHeng/RT-UFL/archive/refs/tags/v1.0.zip Now, to the RT600 QSPI, give the related flash algo file patch. Copy the following path file: \RT-UFL-1.0\algo\SEGGER\JLink_Vxxx To the JLINK install path: \SEGGER\JLink Then copy the content in file: RT-UFL-master\test\SEGGER\JLink_Vxxx\Devices\NXP\iMXRT6xx\archive2\evkmimxrt685.JLinkScript To replace the content in: C:\Program Files\SEGGER\JLink\Devices\NXP\iMXRT_UFL\iMXRT6xx_CortexM33.JLinkScript Otherwise, the MCUXpresso IDE debug reset button function will not work. So, need to add the JLINKScript code for ResetTarget, which will reset the external flash. pic1 The RT-UFL provide 3 types download flash algo: MIMXRT600_UFL_L0, MIMXRT600_UFL_L1, MIMXRT600_UFL_L2. Pic 2 _L0 used for the QSPI Flash and Octal Flash(page size 256 Bytes, sector size 4KB), _L1/2 used for the hyper flash(Page size 512 Bytes，Sector size 4KB/64KB). The JLINKDevices.xml content also can get the detail information. Different name will call different .FLM, the .FLM is the flash algorithm file, the source code can be found in RT-UFL v1.0, it will use different option0 option1 to configure the different external memory when the memory chip can support SFDP. <Device> <ChipInfo Vendor="NXP" Name="MIMXRT600_UFL_L0" WorkRAMAddr="0x00000000" WorkRAMSize="0x00480000" Core="JLINK_CORE_CORTEX_M33" JLinkScriptFile="Devices/NXP/iMXRT_UFL/iMXRT6xx_CortexM33.JLinkScript" Aliases="MIMXRT633S; MIMXRT685S_M33"/> <FlashBankInfo Name="Octal Flash" BaseAddr="0x08000000" MaxSize="0x08000000" Loader="Devices/NXP/iMXRT_UFL/MIMXRT_FLEXSPI_UFL_256B_4KB.FLM" LoaderType="FLASH_ALGO_TYPE_OPEN" /> </Device>  <Device> <ChipInfo Vendor="NXP" Name="MIMXRT600_UFL_L1" WorkRAMAddr="0x00000000" WorkRAMSize="0x00480000" Core="JLINK_CORE_CORTEX_M33" JLinkScriptFile="Devices/NXP/iMXRT_UFL/iMXRT6xx_CortexM33.JLinkScript" Aliases="MIMXRT633S; MIMXRT685S_M33"/> <FlashBankInfo Name="Octal Flash" BaseAddr="0x08000000" MaxSize="0x08000000" Loader="Devices/NXP/iMXRT_UFL/MIMXRT_FLEXSPI_UFL_512B_4KB.FLM" LoaderType="FLASH_ALGO_TYPE_OPEN" /> </Device>  <Device> <ChipInfo Vendor="NXP" Name="MIMXRT600_UFL_L2" WorkRAMAddr="0x00000000" WorkRAMSize="0x00480000" Core="JLINK_CORE_CORTEX_M33" JLinkScriptFile="Devices/NXP/iMXRT_UFL/iMXRT6xx_CortexM33.JLinkScript" Aliases="MIMXRT633S; MIMXRT685S_M33"/> <FlashBankInfo Name="Octal Flash" BaseAddr="0x08000000" MaxSize="0x08000000" Loader="Devices/NXP/iMXRT_UFL/MIMXRT_FLEXSPI_UFL_512B_64KB.FLM" LoaderType="FLASH_ALGO_TYPE_OPEN" /> </Device> 2.3 JLINK commander test Please note, the device need to select as MIMXRT600_UFL_L0 when using the QSPI flash. Pic 3 pic 4 Pic 5 We can find, the JLINK command can realize the external QSPI flash read, erase function. 2.4 Jflash Test Operation steps: Target->connect->production programming Pic 6 We can find, the Jflash also can realize the RT600 external QSPI flash erase and program. Please note, not all the JLINK can support JFLASH, this document is using Segger JLINK plus. 3 MCUXpresso configuration and test MCUXpresso: v11.4.0 SDK_2_10_0_EVK-MIMXRT685 MCUXPresso IDE import the SDK project, eg. Helloworld or led_output. 3.1 QSPI FCB configuration FCB is located from the flash offset address 0X08000400, which is used for the FlexSPI Nor boot configuration, the detailed content of the FCB can be found from the RT600 user manual Table 997. FlexSPI flash configuration block. Different external Flash, the configuration is different, if need to use the QSPI flash, the FCB should use the QSPI related configuration and its own LUT table. Modify SDK project flash_config folder flash_config.c and flash_config.h, LUT contains fast read, status read, write enable, sector erase, block erase, page program, erase the whole chip. If the external QSPI flash command is different, the LUT command should be modified by following the flash datasheet mentioned related command. const flexspi_nor_config_t flexspi_config = { .memConfig = { .tag = FLASH_CONFIG_BLOCK_TAG, .version = FLASH_CONFIG_BLOCK_VERSION, .readSampleClksrc=kFlexSPIReadSampleClk_LoopbackInternally, .csHoldTime = 3, .csSetupTime = 3, .columnAddressWidth = 0, .deviceModeCfgEnable = 0, .deviceModeType = 0, .waitTimeCfgCommands = 0, .deviceModeSeq = {.seqNum = 0, .seqId = 0,}, .deviceModeArg = 0, .configCmdEnable = 0, .configModeType = {0}, .configCmdSeqs = {0}, .configCmdArgs = {0}, .controllerMiscOption = (0), .deviceType = 1, .sflashPadType = kSerialFlash_4Pads, .serialClkFreq = kFlexSpiSerialClk_133MHz, .lutCustomSeqEnable = 0, .sflashA1Size = BOARD_FLASH_SIZE, .sflashA2Size = 0, .sflashB1Size = 0, .sflashB2Size = 0, .csPadSettingOverride = 0, .sclkPadSettingOverride = 0, .dataPadSettingOverride = 0, .dqsPadSettingOverride = 0, .timeoutInMs = 0, .commandInterval = 0, .busyOffset = 0, .busyBitPolarity = 0, .lookupTable = { #if 0 [0] = 0x08180403, [1] = 0x00002404, [4] = 0x24040405, [12] = 0x00000604, [20] = 0x081804D8, [36] = 0x08180402, [37] = 0x00002080, [44] = 0x00000460, #endif // Fast Read [4*0+0] = FLEXSPI_LUT_SEQ(CMD_SDR , FLEXSPI_1PAD, 0xEB, RADDR_SDR, FLEXSPI_4PAD, 0x18), [4*0+1] = FLEXSPI_LUT_SEQ(MODE4_SDR, FLEXSPI_4PAD, 0x00, DUMMY_SDR , FLEXSPI_4PAD, 0x09), [4*0+2] = FLEXSPI_LUT_SEQ(READ_SDR , FLEXSPI_4PAD, 0x04, STOP_EXE , FLEXSPI_1PAD, 0x00), //read status [4*1+0] = FLEXSPI_LUT_SEQ(CMD_SDR , FLEXSPI_1PAD, 0x05, READ_SDR, FLEXSPI_1PAD, 0x04), //write Enable [4*3+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x06, STOP_EXE, FLEXSPI_1PAD, 0), // Sector Erase byte LUTs [4*5+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x20, RADDR_SDR, FLEXSPI_1PAD, 0x18), // Block Erase 64Kbyte LUTs [4*8+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0xD8, RADDR_SDR, FLEXSPI_1PAD, 0x18), //Page Program - single mode [4*9+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x02, RADDR_SDR, FLEXSPI_1PAD, 0x18), [4*9+1] = FLEXSPI_LUT_SEQ(WRITE_SDR, FLEXSPI_1PAD, 0x04, STOP_EXE, FLEXSPI_1PAD, 0x0), //Erase whole chip [4*11+0]= FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x60, STOP_EXE, FLEXSPI_1PAD, 0), }, }, .pageSize = 0x100, .sectorSize = 0x1000, .ipcmdSerialClkFreq = 1, .isUniformBlockSize = 0, .blockSize = 0x10000, }; This code has been tested on the RT685+ QSPI flash MT25QL128ABA1ESE, the code boot is working. 3.2 Debug configuration Configure the JLINK options in the MCUXpresso IDE as the JLINK driver: JLinkGDBServerCL.exe Windows->preferences Pic 7 Press debug, generate .launch file. Pic 8 Run->Debug configurations Pic 9 Choose the device as MIMXRT600_UFL_L0, if the SWD wire is long and not stable, also can define the speed as the fixed low frequency. 3.3 Download and debug test Before download, need to check the RT685 ISP mode configuration, as this document is using the 4 wire QSPI and connect to the FlexSPI A port, so the ISP boot mode should be FlexSPI boot from Port A: ISP2 PIO1_17 low, ISP1 PIO1_16 high, ISP0 PIO1_15 high Click debug button, we can see the code enter the debug mode, and enter the main function, the code address is located in the flexSPI remap address. Pic 10 Click run, we can find the RT685 pin P0_26 is toggling, and the UART interface also can printf information. The application code is working. 4 External SPI flash operation checking To the customer designed board, normally we will use the JLINK command to check whether it can find the ARM core or not at first, make sure the RT chip can work, then will check the external flash operation or not. 4.1 SDK IAP flash code test We can use the SDK related code to test the external flash operation or not at first, the SDK code path is: SDK_2_10_0_EVK-MIMXRT685\boards\evkmimxrt685\driver_examples\iap\iap_flash Then, check the external flash, and modify the code’s related option0, option1 to match the external flash. About the option 0 and option1 definition, we can find it from the RT600 user manual Table 1004.Option0 definition and Table 1005.Option1 definition Pic 11 Pic 12 To the external QSPI flash which is connected to the FLexSPI portA, we can modify the option to the following code: option.option0.U = 0xC0000001;//EXAMPLE_NOR_FLASH; option.option1.U = 0x00000000;//EXAMPLE_NOR_FLASH_OPTION1; Then burn the IAP_flash project to the RT685 internal RAM, debug to run it. Pic 13 We can find, the external QSPI flash initialization, erase, read and write all works, and the memory also can find the correct data. 4.2 MCUBootUtility test Chip enter the ISP mode, then use the MCUBootUtility tool to connect the RT685 and QSPI flash, to do the application code program and read test. ISP mode：ISP2:high, ISP1: high ISP0 low Configure FlexSPI NOR Device Configuration as QSPI, we can use the template: ISSI_IS25LPxxxA_IS25WPxxxA. Pic 14 Click connect to ROM button, check whether it can recognize the external flash: Pic 15 After connection, we can use the tool attached RT685 image to download: NXP-MCUBootUtility-3.3.1\apps\NXP_MIMXRT685-EVK_Rev.E\led_blinky_0x08001000_fdcb.srec Pic 16 We can find, the connection, erase, program and read are all work, it also indicates the RT685+external QSPI flash is working. Then can go to debug it with IDE and debugger.

1. Abstract This article aims to implement the simultaneous input of 4 groups of 48Khz 32bit 2ch audio data on the RT685 platform, and then assemble the received data into a 48Khz 32bit 8ch audio and output it through I2S. This solution is also done at the request of customers, because there are always harmonic problems when customers make it. After analyzing the customer's situation, it is found that the customer has two main problems: (1) Harmonic problem: After receiving 4 channels of 8 bytes, it is directly copied to the sending buffer. This will cause timing problems. It does not take into account the problem of buffering data in the audio data storage pool. The time required to receive enough audio data is at least greater than the time required for copying and sending. Therefore, the problem is reflected in the problem that the customer found harmonic problems when testing the output audio waveform. (2) Audio synchronization problem: After the customer received 4 channels of audio data, he tested the receiving buffer and found that the 4 channels of data were out of sync. Therefore, in order to help customers, I helped customers make this application demo directly, and made a matching test audio source to send a set of 48Khz sampling rate 32bit dual-channel, fixed increment audio data in a loop, such as 0X00-0XFF in a loop. The following is the block diagram of this application platform: Figure 1 System Block Diagram In the above figure, a MIMXRT685-EVK implements the function of outputting 48Khz, 32bit*2ch, and sends data in a loop: 0X00, 0X01….0XFF. Another MIMXRT685-EVK is the focus of this article, which implements 4 groups of I2S to receive data at a sampling rate of 48khz and 32bit*2ch, and then assembles the received data into audio data with a sampling rate of 48Khz and 32bit*8ch and sends it out. In the above figure, in order to reduce the connection of external lines, for BCLK and WS signals, only one group is directly connected to I2S3, and the other I2S2, I2S4, and I2S5 share the I2S3 signal internally. Then, for DATA data, a line is made externally with 4 heads, and connected to the data pins of each group of audio interfaces respectively. The following is a detailed description of this solution. 2. Hardware platform establish The pinouts of the two boards are given below. Because the platform uses many pins, specific allocation is required. 2.1 Audio source board A MIMXRT685-EVK is used as an audio source, and the pinouts for sending 48Khz 32bit*2ch are as follows: Figure 2 Pinout of audio source board 2.2 Audio transceiver board Another MIMXRT685-EVK is used as an audio transceiver board to receive 4-channel audio synchronization data sent by the audio source and assemble it into a 48Khz 32bit*8ch waveform for transmission. Figure 3 Pin assignment of audio transceiver board 2.3 Dual-board hardware connection The source and target connections of the two boards are as follows: Figure 4 Two board pin connection Figure 5 two board connection After the hardware is ready, the software solution and code are provided. 3. Software solution and software implementation In the process of writing the code, we tried many solutions, such as: (1) When receiving, directly assemble it into the TDM format buffer to be sent, and then send it. However, since it is assembled into TDM, one I2S needs to receive 8 byte per frame, and then do the offset to receive the next one. If the reception is carried out according to the 8-byte DMA, the callback of the 4 groups of I2S will enter frequently, resulting in a large CPU load, so this solution is abandoned. (2) The 4 groups of I2S are connected to each other, and the 10ms buffer is connected, and then the DMA method is used to copy from memory to memory. However, since the DMA of RT685 is relatively weak, it can only achieve a maximum offset of 32bit 4word=16byte, that is, a 16-byte offset. However, in fact, a group of audio data is 32bit*2, and 4 groups are 32bit*8=32byte offset, so DMA cannot meet the requirements. Therefore, the DMA memory-to-memory copy solution is abandoned and memcpy is used instead. (3) Use the I2S_RxTransferReceiveDMA function to perform DMA reception. However, in fact, when one group is called, it starts receiving directly, and waits until the next group of I2S interfaces calls I2S_RxTransferReceiveDMA. This has caused an asynchronous situation. Even if the I2S enable is turned off in I2S_RxTransferReceiveDMA, the 4th group of I2S is enabled after the several groups of I2S of I2S_RxTransferReceiveDMA are called. This method can only achieve the synchronization of the reception of the first group of data, because later, it is necessary to go to the callback to re-trigger the reception of the second frame of data. Therefore, the callback of the 4 groups of I2S calls I2S_RxTransferReceiveDMA, which will inevitably cause new synchronization problems. Therefore, this method is abandoned and it is considered to use two groups of DMA descriptors to do ping-pong. In this way, the reception will continue in a loop without the intervention of CPU code. 3.1 Solution Implementation Several solutions have been described above. Finally, we choose to use 4 audio channels to receive audio data and cache 10ms audio data buffer. The conversion from receiving buffer to sending buffer adopts memcpy method, and test whether this copy time can meet the actual needs, to ensure that the buffer pool of receiving buffer is greater than this copy time, which is enough to prepare the sending buffer. The solution for receiving data transfer is as follows: Figure 6 Data buffer transfer The above is 4 groups of I2S receiving their own 10ms data respectively. The buffer is actually prepared for 20ms. A single DMA receives a frame for 10ms, and the other 10ms is used for pingpong buffer. The sending buffer is used to copy the received 4 groups of I2S buffers into a 32bit*8ch array in TDM format, and then two groups of ping-pong buffers are also made. In fact, it is to cache 10ms data. The buffer prepares two groups of 10ms. When the first 10ms frame is received, the second buffer is used to receive it. At the same time, the data of the first buffer is copied to the first buffer of the sending buffer, and the first buffer is used for sending. After the sending is completed, it is transferred to the second buffer to receive and send. In this way, as long as the time is controlled well, there will be no data error problem. The data volume of 10ms is 3840Byte, because the receiving frequency is 48Khz, that is, there are 48000 frames in 1s, and each frame is 32bit*2=8Byte, then 10ms=>4800*8Byte=38400Byte. 3.2 Software code implementation The software code implementation part is mainly divided into 4 I2S receiving signal sharing, I2S DMA pingpong configuration, data transfer, sending I2S and other parts. The details are given below 3.2.1 4-way I2S receiving From the above, we can know that the 4 I2S receiving signal is not completely connected with wires, but adopts the method of sharing BCLK and WS signals and receiving DATA separately. I2S2, I2S4, I2S5 share the BCLK of I2S3, and the WS code is as follows: /* Set shared signal set 0: SCK, WS from Flexcomm1 */ I2S_BRIDGE_SetShareSignalSrc(kI2S_BRIDGE_ShareSet0, kI2S_BRIDGE_SignalSCK, kI2S_BRIDGE_Flexcomm3); I2S_BRIDGE_SetShareSignalSrc(kI2S_BRIDGE_ShareSet0, kI2S_BRIDGE_SignalWS, kI2S_BRIDGE_Flexcomm3); /* Set flexcomm3 SCK, WS from shared signal set 0 */ I2S_BRIDGE_SetFlexcommSignalShareSet(kI2S_BRIDGE_Flexcomm2, kI2S_BRIDGE_SignalSCK, kI2S_BRIDGE_ShareSet0); I2S_BRIDGE_SetFlexcommSignalShareSet(kI2S_BRIDGE_Flexcomm2, kI2S_BRIDGE_SignalWS, kI2S_BRIDGE_ShareSet0); I2S_BRIDGE_SetFlexcommSignalShareSet(kI2S_BRIDGE_Flexcomm4, kI2S_BRIDGE_SignalSCK, kI2S_BRIDGE_ShareSet0); I2S_BRIDGE_SetFlexcommSignalShareSet(kI2S_BRIDGE_Flexcomm4, kI2S_BRIDGE_SignalWS, kI2S_BRIDGE_ShareSet0); I2S_BRIDGE_SetFlexcommSignalShareSet(kI2S_BRIDGE_Flexcomm5, kI2S_BRIDGE_SignalSCK, kI2S_BRIDGE_ShareSet0); I2S_BRIDGE_SetFlexcommSignalShareSet(kI2S_BRIDGE_Flexcomm5, kI2S_BRIDGE_SignalWS, kI2S_BRIDGE_ShareSet0); 3.2.2 I2S DMA pingpong configuration In order to achieve 4-channel audio synchronization and receive 10ms audio buffer, two I2S DMA descriptors are used to implement the ping-pong function to collect data to two ping-pong buffers in turn. The code is as follows: #define I2S_BUFFER_SIZE 3840 //10ms SDK_ALIGN(static dma_descriptor_t I2S2_s_rxDmaDescriptors[2U], FSL_FEATURE_DMA_LINK_DESCRIPTOR_ALIGN_SIZE); SDK_ALIGN(static dma_descriptor_t I2S3_s_rxDmaDescriptors[2U], FSL_FEATURE_DMA_LINK_DESCRIPTOR_ALIGN_SIZE); SDK_ALIGN(static dma_descriptor_t I2S4_s_rxDmaDescriptors[2U], FSL_FEATURE_DMA_LINK_DESCRIPTOR_ALIGN_SIZE); SDK_ALIGN(static dma_descriptor_t I2S5_s_rxDmaDescriptors[2U], FSL_FEATURE_DMA_LINK_DESCRIPTOR_ALIGN_SIZE); SDK_ALIGN(static uint8_t I2S2_s_Buffer[2][I2S_BUFFER_SIZE], sizeof(uint32_t)); SDK_ALIGN(static uint8_t I2S3_s_Buffer[2][I2S_BUFFER_SIZE], sizeof(uint32_t)); SDK_ALIGN(static uint8_t I2S4_s_Buffer[2][I2S_BUFFER_SIZE], sizeof(uint32_t)); SDK_ALIGN(static uint8_t I2S5_s_Buffer[2][I2S_BUFFER_SIZE], sizeof(uint32_t)); static i2s_transfer_t I2S2_s_RxTransfer[2] = {{ .data = I2S2_s_Buffer[0], .dataSize = I2S_BUFFER_SIZE, }, { .data = I2S2_s_Buffer[1], .dataSize = I2S_BUFFER_SIZE, }}; static i2s_transfer_t I2S3_s_RxTransfer[2] = {{ .data = I2S3_s_Buffer[0], .dataSize = I2S_BUFFER_SIZE, }, { .data = I2S3_s_Buffer[1], .dataSize = I2S_BUFFER_SIZE, }}; static i2s_transfer_t I2S4_s_RxTransfer[2] = {{ .data = I2S4_s_Buffer[0], .dataSize = I2S_BUFFER_SIZE, }, { .data = I2S4_s_Buffer[1], .dataSize = I2S_BUFFER_SIZE, }}; static i2s_transfer_t I2S5_s_RxTransfer[2] = {{ .data = I2S5_s_Buffer[0], .dataSize = I2S_BUFFER_SIZE, }, { .data = I2S5_s_Buffer[1], .dataSize = I2S_BUFFER_SIZE, }}; I2S_RxGetDefaultConfig(&I2S2_s_RxConfig); I2S2_s_RxConfig.divider = DEMO_I2S_CLOCK_DIVIDER; I2S2_s_RxConfig.masterSlave = DEMO_I2S_TX_MODE;//DEMO_I2S_RX_MODE I2S_RxInit(DEMO_I2S2_RX, &I2S2_s_RxConfig); I2S_RxGetDefaultConfig(&I2S3_s_RxConfig); I2S3_s_RxConfig.divider = DEMO_I2S_CLOCK_DIVIDER; I2S3_s_RxConfig.masterSlave = DEMO_I2S_TX_MODE;//DEMO_I2S_RX_MODE I2S_RxInit(DEMO_I2S3_RX, &I2S3_s_RxConfig); I2S_RxGetDefaultConfig(&I2S4_s_RxConfig); I2S4_s_RxConfig.divider = DEMO_I2S_CLOCK_DIVIDER; I2S4_s_RxConfig.masterSlave = DEMO_I2S_TX_MODE;//DEMO_I2S_RX_MODE I2S_RxInit(DEMO_I2S4_RX, &I2S4_s_RxConfig); I2S_RxGetDefaultConfig(&I2S5_s_RxConfig); I2S5_s_RxConfig.divider = DEMO_I2S_CLOCK_DIVIDER; I2S5_s_RxConfig.masterSlave = DEMO_I2S_TX_MODE;//DEMO_I2S_RX_MODE I2S_RxInit(DEMO_I2S5_RX, &I2S5_s_RxConfig); DMA_Init(DEMO_DMA); DMA_EnableChannel(DEMO_DMA, DEMO_I2S2_RX_CHANNEL); DMA_SetChannelPriority(DEMO_DMA, DEMO_I2S2_RX_CHANNEL, kDMA_ChannelPriority1); DMA_CreateHandle(&I2S2_s_DmaRxHandle, DEMO_DMA, DEMO_I2S2_RX_CHANNEL); I2S_RxTransferCreateHandleDMA(DEMO_I2S2_RX, &I2S2_s_RxHandle, &I2S2_s_DmaRxHandle, I2S2_RxCallback, (void *)&I2S2_s_RxTransfer); DMA_EnableChannel(DEMO_DMA, DEMO_I2S3_RX_CHANNEL); DMA_SetChannelPriority(DEMO_DMA, DEMO_I2S3_RX_CHANNEL, kDMA_ChannelPriority1); DMA_CreateHandle(&I2S3_s_DmaRxHandle, DEMO_DMA, DEMO_I2S3_RX_CHANNEL); I2S_RxTransferCreateHandleDMA(DEMO_I2S3_RX, &I2S3_s_RxHandle, &I2S3_s_DmaRxHandle, I2S3_RxCallback, (void *)&I2S3_s_RxTransfer); DMA_EnableChannel(DEMO_DMA, DEMO_I2S4_RX_CHANNEL); DMA_SetChannelPriority(DEMO_DMA, DEMO_I2S4_RX_CHANNEL, kDMA_ChannelPriority1); DMA_CreateHandle(&I2S4_s_DmaRxHandle, DEMO_DMA, DEMO_I2S4_RX_CHANNEL); I2S_RxTransferCreateHandleDMA(DEMO_I2S4_RX, &I2S4_s_RxHandle, &I2S4_s_DmaRxHandle, I2S4_RxCallback, (void *)&I2S4_s_RxTransfer); DMA_EnableChannel(DEMO_DMA, DEMO_I2S5_RX_CHANNEL); DMA_SetChannelPriority(DEMO_DMA, DEMO_I2S5_RX_CHANNEL, kDMA_ChannelPriority2); DMA_CreateHandle(&I2S5_s_DmaRxHandle, DEMO_DMA, DEMO_I2S5_RX_CHANNEL); I2S_RxTransferCreateHandleDMA(DEMO_I2S5_RX, &I2S5_s_RxHandle, &I2S5_s_DmaRxHandle, I2S5_RxCallback, (void *)&I2S5_s_RxTransfer); I2S_TransferInstallLoopDMADescriptorMemory(&I2S2_s_RxHandle, I2S2_s_rxDmaDescriptors, 2U); I2S_TransferInstallLoopDMADescriptorMemory(&I2S3_s_RxHandle, I2S3_s_rxDmaDescriptors, 2U); I2S_TransferInstallLoopDMADescriptorMemory(&I2S4_s_RxHandle, I2S4_s_rxDmaDescriptors, 2U); I2S_TransferInstallLoopDMADescriptorMemory(&I2S5_s_RxHandle, I2S5_s_rxDmaDescriptors, 2U); if (I2S_TransferReceiveLoopDMA(DEMO_I2S2_RX, &I2S2_s_RxHandle, &I2S2_s_RxTransfer[0], 2U) != kStatus_Success) { assert(false); } if (I2S_TransferReceiveLoopDMA(DEMO_I2S3_RX, &I2S3_s_RxHandle, &I2S3_s_RxTransfer[0], 2U) != kStatus_Success) { assert(false); } if (I2S_TransferReceiveLoopDMA(DEMO_I2S4_RX, &I2S4_s_RxHandle, &I2S4_s_RxTransfer[0], 2U) != kStatus_Success) { assert(false); } if (I2S_TransferReceiveLoopDMA(DEMO_I2S5_RX, &I2S5_s_RxHandle, &I2S5_s_RxTransfer[0], 2U) != kStatus_Success) { assert(false); } I2S_Enable(DEMO_I2S2_RX); I2S_Enable(DEMO_I2S3_RX); I2S_Enable(DEMO_I2S4_RX); I2S_Enable(DEMO_I2S5_RX); Here, the code has been modified, mainly the I2S_TransferLoopDMA function in fsl_i2s_dma.c, which is blocked: I2S_Enable(base); In order to realize the function of 4-channel synchronous reception. 3.2.3 Audio data received and transferred Because when receiving, each audio interface takes turns to receive its own 2ch data, but when sending, it is necessary to send 4-channel received audio dual-channel data, that is, 32bit*8ch data, so after receiving ping, the ping data needs to be transferred to the sending ping buffer. The code for transfer is as follows: #define I2S_BUFFER_SIZE 3840 //10ms SDK_ALIGN(static uint8_t I2S2_s_Buffer[2][I2S_BUFFER_SIZE], sizeof(uint32_t)); SDK_ALIGN(static uint8_t I2S3_s_Buffer[2][I2S_BUFFER_SIZE], sizeof(uint32_t)); SDK_ALIGN(static uint8_t I2S4_s_Buffer[2][I2S_BUFFER_SIZE], sizeof(uint32_t)); SDK_ALIGN(static uint8_t I2S5_s_Buffer[2][I2S_BUFFER_SIZE], sizeof(uint32_t)); SDK_ALIGN(static uint8_t I2S1_s_Buffer[2][I2S_BUFFER_SIZE*4], sizeof(uint32_t)); if( s_pingpong == 1) { for(ch = 0;ch < 480; ch++) //480=I2S_BUFFER_SIZE(3840)/8 { memcpy(&I2S1_s_Buffer[0][0 + (32*ch)], &I2S2_s_Buffer[0][8*ch], 8); memcpy(&I2S1_s_Buffer[0][8 + (32*ch)], &I2S3_s_Buffer[0][8*ch], 8); memcpy(&I2S1_s_Buffer[0][16 + (32*ch)], &I2S4_s_Buffer[0][8*ch], 8); memcpy(&I2S1_s_Buffer[0][24 + (32*ch)], &I2S5_s_Buffer[0][8*ch], 8); } } else { for(ch = 0;ch < 480; ch++) { memcpy(&I2S1_s_Buffer[1][0 + (32*ch)], &I2S2_s_Buffer[1][8*ch], 8); memcpy(&I2S1_s_Buffer[1][8 + (32*ch)], &I2S3_s_Buffer[1][8*ch], 8); memcpy(&I2S1_s_Buffer[1][16 + (32*ch)], &I2S4_s_Buffer[1][8*ch], 8); memcpy(&I2S1_s_Buffer[1][24 + (32*ch)], &I2S5_s_Buffer[1][8*ch], 8); } } 3.2.4 Send TDM audio code The sending code also uses the I2S DMA method, but because there is no need to send multiple channels at the same time, only a single channel, there is no need to consider the synchronization problem, and no DMA descriptor is used. After the sending buffer is ready, the I2S_TxTransferSendDMA method is used. The code is as follows: I2S_TxGetDefaultConfig(&I2S1_s_TxConfig); I2S1_s_TxConfig.divider = DEMO_I2S1_CLOCK_DIVIDER; I2S1_s_TxConfig.masterSlave = kI2S_MasterSlaveNormalMaster; I2S1_s_TxConfig.wsPol = true; I2S1_s_TxConfig.mode = kI2S_ModeDspWsLong;//kI2S_ModeDspWsShort; I2S1_s_TxConfig.dataLength = 32U; I2S1_s_TxConfig.frameLength = 32 * 8U; I2S1_s_TxConfig.position = DEMO_TDM_DATA_START_POSITION; I2S1_s_TxConfig.pack48 = true; I2S_TxInit(DEMO_I2S1_TX, &I2S1_s_TxConfig); I2S_EnableSecondaryChannel(DEMO_I2S1_TX, kI2S_SecondaryChannel1, false, 64 + DEMO_TDM_DATA_START_POSITION); I2S_EnableSecondaryChannel(DEMO_I2S1_TX, kI2S_SecondaryChannel2, false, 128 + DEMO_TDM_DATA_START_POSITION); I2S_EnableSecondaryChannel(DEMO_I2S1_TX, kI2S_SecondaryChannel3, false, 192 + DEMO_TDM_DATA_START_POSITION); DMA_EnableChannel(DEMO_DMA, DEMO_I2S1_TX_CHANNEL); DMA_SetChannelPriority(DEMO_DMA, DEMO_I2S1_TX_CHANNEL, kDMA_ChannelPriority3); DMA_CreateHandle(&I2S1_s_DmaTxHandle, DEMO_DMA, DEMO_I2S1_TX_CHANNEL); I2S_TxTransferCreateHandleDMA(DEMO_I2S1_TX, &I2S1_s_TxHandle, &I2S1_s_DmaTxHandle, I2S1_TxCallback, (void *)&I2S1_s_TxTransfer); if( s_pingpong == 1) { I2S1_s_TxTransfer.data = I2S1_s_Buffer[0]; I2S1_s_TxTransfer.dataSize = I2S_BUFFER_SIZE*4; I2S_TxTransferSendDMA(DEMO_I2S1_TX, &I2S1_s_TxHandle, I2S1_s_TxTransfer); } else { I2S1_s_TxTransfer.data = I2S1_s_Buffer[1]; I2S1_s_TxTransfer.dataSize = I2S_BUFFER_SIZE*4; I2S_TxTransferSendDMA(DEMO_I2S1_TX, &I2S1_s_TxHandle, I2S1_s_TxTransfer); } 3.2.5 Send and receive I2S callback processing For the receiving I2S2, 3, 4, 5, there are 4 channels in total. Each time 10ms of data is received, a callback will be entered. In the callback, you only need to record the flag. When all 4 flags are recorded, it means that the 4 channels of the same 10ms data have been received, and the data can be copied to the sending buffer. Of course, in order to test whether the callback entry frequency is once every 10ms, this article makes a GPIO callback for testing. The following is the code for recording I2S callback static void I2S2_RxCallback(I2S_Type *base, i2s_dma_handle_t *handle, status_t completionStatus, void *userData) { s_allRXTriggerred |= 0x01; } static void I2S3_RxCallback(I2S_Type *base, i2s_dma_handle_t *handle, status_t completionStatus, void *userData) { s_allRXTriggerred |= 0x02; } static void I2S4_RxCallback(I2S_Type *base, i2s_dma_handle_t *handle, status_t completionStatus, void *userData) { s_allRXTriggerred |= 0x04; } static void I2S5_RxCallback(I2S_Type *base, i2s_dma_handle_t *handle, status_t completionStatus, void *userData) { /* Enqueue the same original buffer all over again */ s_allRXTriggerred |= 0x08; GPIO_PortToggle(GPIO, 1, 1<<0); if( s_pingpong == 0) { s_pingpong = 1; } else { s_pingpong = 0; } } static void I2S1_TxCallback(I2S_Type *base, i2s_dma_handle_t *handle, status_t completionStatus, void *userData) { GPIO_PortToggle(GPIO, 1, 1<<8); //__NOP(); } So far, all functions of a MIMXRT685-EVK for 4 I2S reception and 1 I2S TDM transmission have been completed. 3.2.6 Audio source code The audio source is made on another MIMXRT685-EVK to send 48Khz, 32bit*2ch audio data, and the data is sent in a loop from 0X00 to 0XFF. The code is as follows: int main(void) { BOARD_InitBootPins(); BOARD_InitBootClocks(); BOARD_InitDebugConsole(); BOARD_I3C_ReleaseBus(); BOARD_InitI3CPins(); CLOCK_EnableClock(kCLOCK_InputMux); /* attach main clock to I3C (500MHz / 20 = 25MHz). */ CLOCK_AttachClk(kMAIN_CLK_to_I3C_CLK); CLOCK_SetClkDiv(kCLOCK_DivI3cClk, 20); /* attach AUDIO PLL clock to FLEXCOMM1 (I2S1) */ CLOCK_AttachClk(kAUDIO_PLL_to_FLEXCOMM1); /* attach AUDIO PLL clock to FLEXCOMM3 (I2S3) */ CLOCK_AttachClk(kAUDIO_PLL_to_FLEXCOMM3); /* attach AUDIO PLL clock to MCLK */ CLOCK_AttachClk(kAUDIO_PLL_to_MCLK_CLK); CLOCK_SetClkDiv(kCLOCK_DivMclkClk, 1); SYSCTL1->MCLKPINDIR = SYSCTL1_MCLKPINDIR_MCLKPINDIR_MASK; wm8904Config.i2cConfig.codecI2CSourceClock = CLOCK_GetI3cClkFreq(); wm8904Config.mclk_HZ = CLOCK_GetMclkClkFreq(); /* Set shared signal set 0: SCK, WS from Flexcomm1 */ I2S_BRIDGE_SetShareSignalSrc(kI2S_BRIDGE_ShareSet0, kI2S_BRIDGE_SignalSCK, kI2S_BRIDGE_Flexcomm1); I2S_BRIDGE_SetShareSignalSrc(kI2S_BRIDGE_ShareSet0, kI2S_BRIDGE_SignalWS, kI2S_BRIDGE_Flexcomm1); /* Set flexcomm3 SCK, WS from shared signal set 0 */ I2S_BRIDGE_SetFlexcommSignalShareSet(kI2S_BRIDGE_Flexcomm3, kI2S_BRIDGE_SignalSCK, kI2S_BRIDGE_ShareSet0); I2S_BRIDGE_SetFlexcommSignalShareSet(kI2S_BRIDGE_Flexcomm3, kI2S_BRIDGE_SignalWS, kI2S_BRIDGE_ShareSet0); #if 1 PRINTF("Configure codec\r\n"); /* protocol: i2s * sampleRate: 48K * bitwidth:16 */ if (CODEC_Init(&codecHandle, &boardCodecConfig) != kStatus_Success) { PRINTF("codec_Init failed!\r\n"); assert(false); } /* Initial volume kept low for hearing safety. * Adjust it to your needs, 0-100, 0 for mute, 100 for maximum volume. */ if (CODEC_SetVolume(&codecHandle, kCODEC_PlayChannelHeadphoneLeft | kCODEC_PlayChannelHeadphoneRight, DEMO_CODEC_VOLUME) != kStatus_Success) { assert(false); } PRINTF("Configure I2S\r\n"); #endif /* * masterSlave = kI2S_MasterSlaveNormalMaster; * mode = kI2S_ModeI2sClassic; * rightLow = false; * leftJust = false; * pdmData = false; * sckPol = false; * wsPol = false; * divider = 1; * oneChannel = false; * dataLength = 16; * frameLength = 32; * position = 0; * watermark = 4; * txEmptyZero = true; * pack48 = false; */ I2S_TxGetDefaultConfig(&s_TxConfig); s_TxConfig.divider = DEMO_I2S_CLOCK_DIVIDER; s_TxConfig.masterSlave = DEMO_I2S_TX_MODE; I2S_TxInit(DEMO_I2S_TX, &s_TxConfig); DMA_Init(DEMO_DMA); DMA_EnableChannel(DEMO_DMA, DEMO_I2S_TX_CHANNEL); DMA_SetChannelPriority(DEMO_DMA, DEMO_I2S_TX_CHANNEL, kDMA_ChannelPriority3); DMA_CreateHandle(&s_DmaTxHandle, DEMO_DMA, DEMO_I2S_TX_CHANNEL); StartSoundPlayback(); while (1) { } } static void StartSoundPlayback(void) { PRINTF("Setup looping playback of sine wave\r\n"); s_TxTransfer.data = &g_Music[0]; s_TxTransfer.dataSize = sizeof(g_Music); I2S_TxTransferCreateHandleDMA(DEMO_I2S_TX, &s_TxHandle, &s_DmaTxHandle, TxCallback, (void *)&s_TxTransfer); /* need to queue two transmit buffers so when the first one * finishes transfer, the other immediatelly starts */ I2S_TxTransferSendDMA(DEMO_I2S_TX, &s_TxHandle, s_TxTransfer); I2S_TxTransferSendDMA(DEMO_I2S_TX, &s_TxHandle, s_TxTransfer); } static void TxCallback(I2S_Type *base, i2s_dma_handle_t *handle, status_t completionStatus, void *userData) { /* Enqueue the same original buffer all over again */ i2s_transfer_t *transfer = (i2s_transfer_t *)userData; I2S_TxTransferSendDMA(base, handle, *transfer); } Audio data buffer: Figure 7 Audio source sends buffer The corresponding test results are given: Figure 8 Audio source sending data test It can be seen that the data sent by the audio source is cyclical and can be sent in an increasing loop. 4. Test results There are several points to verify about the test results: (1) 4-channel audio receives pingpong buffer, whether a single buffer is 10ms, that is, a 10ms audio data pool. (2) How long is the data memory copy time, whether it will exceed the length of the receiving audio data pool. (3) Whether the received 4-channel data is synchronized, whether the assembled send buffer data is the 32bit*8ch data assembled from the corresponding 4-channel 2ch data. (4) Whether the sent audio waveform is the correct 32bit*8ch TDM data. The following are the verification test results for these points. 4.1 4 I2S audio 10ms data pool This verification is very simple. Define a pin GPIO, initialize the output to 0, and then reverse it in the received callback interrupt. This article chooses to reverse it in the I2S5 callback. The test results are as follows: Figure 9 ch1 10ms duration Channel 1 is the exact 10ms because of the callback reversal received. Here is a general picture of the test: Figure 10 Time test overview Ch1: I2S5 callback entry frequency Ch2: memory copy time Ch3: Send callback entry frequency It can be seen that the frequency of sending and receiving is 10ms, because the sending frequency is also 48Khz, but because it is 8ch, the data volume is 4 times that of receiving, and all the data of the 4 receiving channels need to be stuffed in. 4.2 Time consumption of receiving and copying to sending buffer For data copying, that is, assembling the data received from 4 I2S channels into 4 buffers into the sending buffer, this time test is on the second channel of the oscilloscope, and the results are as follows: Figure 11 copy data time It can be seen that the copying time is less than 500us, which is much shorter than the 10ms of the audio receiving data pool. Therefore, you can use memcpy casually without worrying about the copying time being too long. This also makes up for the regret that I wanted to use DMA for memory to memory copying before, but it could not be realized due to DMA performance issues. 4.3 Verification of the synchronization of the data received on the 4 I2S In order to verify the synchronization, this article closes the 4 I2S receiving channel after receiving 100times 10ms, and prints out the corresponding 4 I2S audio receiving buffer. The results of the 4 I2S buffer are as follows: Figure12 I2S2 receive buffer Figure 13 I2S3 receive buffer Figure 14 I2S4 receive buffer Figure 15 I2S5 receive buffer It can be seen that the receive buffer data of the 4 I2S are completely synchronized, and all start from 0XB8. 4.4 Send buffer corresponding to 4-channel audio TDM The send buffer is printed after 100 receptions, and then memcpy is performed to the send buffer, and the printout of the send buffer data is as follows: Figure 16 I2S1 transfer buffer It can be seen that the buffer also starts from 0XB8, and the 4 groups of received data are copied to the send buffer and assembled into 32bit*8ch data. It can be seen that the TDM send buffer is also correct. 4.5 Sending 48Khz 32bit 8ch audio data waveform Figure 17 Transmitting and receiving audio waveforms The upper group is the waveform of the audio source, and the lower group is the waveform of TDM transmission. Due to the limitation of the analysis software of the logic analyzer, it can only analyze 2ch 64bit data at most, so only part of the data can be seen here, but from the waveform, it can be seen that the waveform of sending TDM can achieve 32bit*8ch, and every 8byte data in a frame is the same, which also explains the synchronization of 4-channel audio reception. In the above figure, the data of ch2 is actually 00, 01, 02, 03, 04, 05, 06, 07, 4 groups of the same data in one frame, and the waveform can also be seen that there are 4 groups of the same data, and 4 groups of 2ch are enough to form 32bit 8ch TDM. Finally, here is another TDM waveform tested on the oscilloscope: Figure 18 Sending TDM waveform It can be seen that BCLK=12.28Mhz is consistent with the expected 48khz*32bit*8=12.288Mhz. The WS signal is also measured to be 48Khz, which meets the set 48Khz sampling rate. DATA is also transmitting with data changes, and it can be seen that the waveform pattern within a frame is repeated by about 4 groups, which also shows that the 4 groups of received data are synchronized. So far, the function of RT600 4-channel 48KHZ 32bit*2ch input and assembling into 48Khz 32bit*8ch output has been realized!

1. Abstract When using the RT600 SDK USB composite example, the customer found that the default HS interval was 125us, and the code was: mimxrt685audevk_dev_composite_hid_audio_unified_bm. However, in actual applications, the 125us packet interval for data transmission will cause a large interrupt load on the CPU, so the customer hopes to change the interval to a larger during, such as 1ms. After changing interval to 1ms, it is found that the data packet can be sent to the RT chip, but there is indeed a problem with the playback on the RT side, speaker no voice. Changing it to 500us in synchronous mode is possible work, but at 500us, the customer's CPU load still reaches more than 80%, which is not convenient for subsequent application code expansion, so it is still hoped to achieve a 1ms method. This article will give the transmission of different intervals of UAC and provide a solution for 1ms intervals. 2. Test situation Board：MIMXRT685-AUD-EVK SDK：SDK_2_15_000_MIMXRT685-AUD-EVK USB Analyzer：Lecroy USB-TOS2-A01-X 2.1 Platform connection situation First, given the connection between MIMXRT685-AUD-EVK, USB analyzer and audio source. This article mainly tests the RT685 UAC speaker function, that is, USB transmits audio source data to RT685, and plays the audio source through a player (which can be headphones), or speaker. The J7 USB port of EVK is connected to the A port of Lecroy, and the B port of Lecroy is connected to the audio source, which can be a PC or a mobile phone. If using a PC, the speaker needs to be selected as USB AUDIO+HID DEMO, because the name of the board after UAC enumeration is USB AUDIO+HID DEMO. The other end of Lecroy is connected to the PC for transmitting USB bus data. The specific connection diagram is as follows: Fig 1 EVK USB analyzer connection 2.2 Different audio interval data package situation This chapter gives the modifications under different intervals, as well as the results of USB analyzer packet capture. The clock system of the audio synchronous/isochronous audio endpoint can be synchronized through SOF. The USB 1.0 sampling rate must be locked to the 1ms SOF beat. The USB2.0 high-speed HS endpoint can be locked to the 125us SOF beat. If you want to change the interval, it is usually a multiple of 125us, that is, it can be 250us, 500us, 1ms, etc. The modification method is usually to directly modify the USB stack's usb_audio_config.h: #define HS_ISO_OUT_ENDP_INTERVAL (0x01) The relationship between HS_ISO_OUT_ENDP_INTERVAL and the real during time is: 125us*2^( HS_ISO_OUT_ENDP_INTERVAL -1) So: HS_ISO_OUT_ENDP_INTERVAL=1 : 125us HS_ISO_OUT_ENDP_INTERVAL=2 : 250us HS_ISO_OUT_ENDP_INTERVAL=3 : 500us HS_ISO_OUT_ENDP_INTERVAL=4 : 1000us The following are the four situations mentioned above respectively through USB analyzer packet capture test. 2.2.1 125us interval packet situation #define HS_ISO_OUT_ENDP_INTERVAL (0x01) Fig 2 125us interval It can be seen that when the interval is configured to 125us, the SOF beat interval in front of each OUT packet is 125us, and the length of the OUT data packet is 24Byte. The data in this 24Byte packet is the actual audio data. In this case, the playback is normal 2.2.2 250us interval packet situation #define HS_ISO_OUT_ENDP_INTERVAL (0x02) The packet capture configured as 250us is shown in the figure below. It can be seen that the SOF beat interval in front of the two OUT packets is 250us, and the length of the OUT data packet is 48 Byte. In other words, as the interval increases, the length of the data packet also increases proportionally, that is, the transmission time is longer, but a packet contains more data packets. At this time, the RT side also needs to provide more USB receiving buffers to receive data. Fig 3 250us interval This situation, the speaker play normally, have the audio sound. 2.2.3 500us interval packet situation #define HS_ISO_OUT_ENDP_INTERVAL (0x03) Fig 4 500us interval SYNC mode At this time, you can see that the data packet has become 96 Bytes, and the data content is also normal audio data, but the playback has problems and no sound can be heard. However, if you configure: #define USB_DEVICE_AUDIO_USE_SYNC_MODE (0U) It is not the SYNC mode, it can hear the audio sound, the packet situation is: Fig 5 500us interval no SYNC mode However, the asynchronous mode also has problems. After a long period of operation, it may become out of sync and the playback may become stuck. Therefore, it is still necessary to use the 1ms method in the synchronous mode. 2.2.4 1ms interval packet situation #define HS_ISO_OUT_ENDP_INTERVAL (0x04) Fig 6 interval 1ms It can be seen that the data packet length has become 192 bytes, and the SOF beat interval in front of the OUT packet has indeed become 1ms. The audio data packet also looks like normal audio data, so the audio data here is successfully transmitted from USB to RT. However, the playback is abnormal and no sound can be heard. 3. 1ms interval solution Now let's debug the project to find the problem. Because the USB analyzer can show that the audio data of the USB bus is actually transmitted, we must first check whether the USB receives the data packet in the kUSB_DeviceAudioEventStreamRecvResponse event in USB_DeviceAudioCompositeCallback of audio_unified.c, whether the data packet length is correct, and whether the data content looks like audio data. The following is the debug result: Fig 7 data packet received situation So from this point of view, the USB interface data packet is received, and the data looks like real audio data. If it is abnormal data, it is either all 0 or irregular data that changes, but the test result cannot be played. So now to check the I2S playback function, composite.h file, TxCallback function: Fig 8 I2S play data buffer We can see, the real data transfer to the I2S data buffer are always 0, and the reality should be g_composite.audioUnified.startPlayFlag none zero, and also use: s_TxTransfer.dataSize = g_composite.audioUnified.audioPlayTransferSize; s_TxTransfer.data=audioPlayDataBuff + g_composite.audioUnified.tdReadNumberPlay; But, after testing, audioPlayDataBuff data are also 0. So the issue should be the USB received side, receive the data, but when save the data to the audioPlayDataBuff have issues, go back to audio_unified.c Fig 9 USB_DeviceAudioCompositeCallback Fig10 USB_AudioSpeakerPutBuffer After testing, in the Fig 9, audioUnified.tdWriteNumberPlay never larger than g_deviceAudioComposite->audioUnified.audioPlayTransferSize * AUDIO_CLASS_2_0_HS_LOW_LATENCY_TRANSFER_COUNT=192*6 #define AUDIO_CLASS_2_0_HS_LOW_LATENCY_TRANSFER_COUNT \ (0x06U) /* 6 means 6 mico frames (6*125us), make sure the latency is smaller than 1ms for sync mode */ The low latency here is a delay buffer. Therefore, the USB cache data must at least be able to hold the delayed data packet. Unit 1 represents an interval frame. Now it is changed to an interval of 1ms, 6 delay units, which means that the delay here is 6ms. Here, the receiving buffer size needs to be larger, which is related to the following definition: Fig 11 USB device callback g_composite.audioUnified.audioPlayBufferSize = AUDIO_PLAY_BUFFER_SIZE_ONE_FRAME * AUDIO_SPEAKER_DATA_WHOLE_BUFFER_COUNT; #define AUDIO_PLAY_BUFFER_SIZE_ONE_FRAME AUDIO_OUT_TRANSFER_LENGTH_ONE_FRAME #define AUDIO_OUT_FORMAT_CHANNELS (0x02U) #define AUDIO_OUT_FORMAT_SIZE (0x02) #define AUDIO_OUT_SAMPLING_RATE_KHZ (48) /* transfer length during 1 ms */ #define AUDIO_OUT_TRANSFER_LENGTH_ONE_FRAME \ (AUDIO_OUT_SAMPLING_RATE_KHZ * AUDIO_OUT_FORMAT_CHANNELS * AUDIO_OUT_FORMAT_SIZE) #define AUDIO_SPEAKER_DATA_WHOLE_BUFFER_COUNT \ (2U) /* 2 units size buffer (1 unit means the size to play during 1ms) */ Here, try to set larger AUDIO_SPEAKER_DATA_WHOLE_BUFFER_COUNT, let it should be at least can put 6ms data, as 1 unit is 1ms play data size, and it needs to have more than 6ms, so, set to 12 unit, the modified definition: #define AUDIO_SPEAKER_DATA_WHOLE_BUFFER_COUNT 12 Build the project and download it, now capture the USB bus data again: Fig 12 1ms interval data packet play successfully This data waveform is the data that can be played normally. Below is the video of the MIMXRT685-AUD-EVK test. The attachment provides two demos of RT685 and RT1170, both modified to 1ms interval, and the modification method of RT1170 is exactly the same 4. Summarize Whether it is RT600, RT1170, or more precisely the UAC code of the entire RT series, if you need to modify the interval, the main focus is on the following macros in usb_audio_config.h, the following is for 1ms interval: #define HS_ISO_OUT_ENDP_INTERVAL (0x04)//(0x01)//(0X04) #define AUDIO_CLASS_2_0_HS_LOW_LATENCY_TRANSFER_COUNT \ (0x06U) /* 6 means 16 mico frames (6*125us), make sure the latency is smaller than 1ms for ehci high speed */ #define AUDIO_SPEAKER_DATA_WHOLE_BUFFER_COUNT \ (12U)//(2U) /* 2 units size buffer (1 unit means the size to play during 1ms) */ Test video:

LittleFS is a file system used for microcontroller internal flash and external NOR flash. Since it is more suitable for small embedded systems than traditional FAT file systems, more and more people are using it in their projects. So in addition to NOR/NAND flash type storage devices, can LittleFS be used in SD cards? It seems that it is okay too. This article will use the littlefs_shell and sdcard_fatfs demo project in the i.mxRT1050 SDK to make a new littefs_shell project for reading and writing SD cards. This experiment uses MCUXpresso IDE v11.7, and the SDK uses version 2.13. The littleFS file system has only 4 files, of which the current version shown in lfs.h is littleFS 2.5. The first step, of course, is to add SD-related code to the littlefs_shell project. The easiest way is to import another sdcard_fatfs project and copy all of the sdmmc directories into our project. Then copy sdmmc_config.c and sdmmc_config.h in the /board directory, and fsl_usdhc.c and fsl_usdhc.h in the /drivers directory. The second step is to modify the program to include SD card detection and initialization, adding a bridge from LittleFS to SD drivers. Add the following code to littlefs_shell.c. extern sd_card_t m_sdCard; status_t sdcardWaitCardInsert(void) { BOARD_SD_Config(&m_sdCard, NULL, BOARD_SDMMC_SD_HOST_IRQ_PRIORITY, NULL); /* SD host init function */ if (SD_HostInit(&m_sdCard) != kStatus_Success) { PRINTF("\r\nSD host init fail\r\n"); return kStatus_Fail; } /* wait card insert */ if (SD_PollingCardInsert(&m_sdCard, kSD_Inserted) == kStatus_Success) { PRINTF("\r\nCard inserted.\r\n"); /* power off card */ SD_SetCardPower(&m_sdCard, false); /* power on the card */ SD_SetCardPower(&m_sdCard, true); // SdMmc_Init(); } else { PRINTF("\r\nCard detect fail.\r\n"); return kStatus_Fail; } return kStatus_Success; } status_t sd_disk_initialize() { static bool isCardInitialized = false; /* demostrate the normal flow of card re-initialization. If re-initialization is not neccessary, return RES_OK directly will be fine */ if(isCardInitialized) { SD_Deinit(&m_sdCard); } if (kStatus_Success != SD_Init(&m_sdCard)) { SD_Deinit(&m_sdCard); memset(&m_sdCard, 0U, sizeof(m_sdCard)); return kStatus_Fail; } isCardInitialized = true; return kStatus_Success; } In main(), add these code if (sdcardWaitCardInsert() != kStatus_Success) { return -1; } status = sd_disk_initialize(); Next, create two new c files, lfs_sdmmc.c and lfs_sdmmc_bridge.c. The call order is littlefs->lfs_sdmmc.c->lfs_sdmmc_bridge.c->fsl_sd.c. lfs_sdmmc.c and lfs_sdmmc_bridge.c acting as intermediate layers that can connect the LITTLEFS and SD upper layer drivers. One of the things that must be noted is the mapping of addresses. The address given by littleFS is the block address + offset address. See figure below. This is a read command issued by the ‘mount’ command. The block address refers to the address of the erased sector address in SD. The read and write operation uses the smallest read-write block address (BLOCK) of SD, as described below. Therefore, in lfs_sdmmc.c, the address given by littleFS is first converted to the byte address. Then change the SD card read-write address to the BLOCK address in lfs_sdmmc_bridge.c. Since most SD cards today exceed 4GB, the byte address requires a 64-bit variable. Finally, the most important step is littleFS parameter configuration. There is a structure LittlsFS_config in peripherals.c, which contains not only the operation functions of the SD card, but also the read and write sectors and cache size. The setup of this structure is critical. If the setting is not good, it will not only affect the performance, but also cause errors in operation. Before setting it up, let's introduce some of the general ideal of SD card and littleFS. The storage unit of the SD card is BLOCK, and both reading and writing can be carried out according to BLOCK. The size of each block can be different for different cards. For standard SD cards, the length of the block command can be set with CMD16, and the block command length is fixed at 512 bytes for SDHC cards. The SD card is erased sector by sector. The size of each sector needs to be checked in the CSD register of the SD card. If the CSD register ERASE_BLK_EN = 0, Sector is the smallest erase unit, and its unit is "block". The value of sector size is equal to the value of the SECTOR_SIZE field in the CSD register plus 1. For example, if SECTOR_SIZE is 127, then the minimum erase unit is 512*(127+1)=65536 bytes. In addition, sometimes there are doubts, many of the current SD cards actually have wear functions to reduce the loss caused by frequent erasing and writing, and extend the service life. So in fact, delete operations or read and write operations are not necessarily real physical addresses. Instead, it is mapped by the SD controller. But for the user, this mapping is transparent. So don't worry about this affecting normal operation. LittleFS is a lightweight file system that has power loss recovery and dynamic wear leveling compared to FAT systems. Once mounted, littleFS provides a complete set of POSIX-like file and directory functions, so it can be operated like a common file system. LittleFS has only 4 files in total, and it basically does not need to be modified when used. Since the NOR/NAND flash to be operated by LittleFS is essentially a block device, in order to facilitate use, LittleFS is read and written in blocks, and the underlying NOR/NAND Flash interface drivers are carried out in blocks. Let's take a look at the specific content of LittleFS configuration parameters. const struct lfs_config LittleFS_config = { .context = (void*)0, .read = lfs_sdmmc_read, .prog = lfs_sdmmc_prog, .erase = lfs_sdmmc_erase, .sync = lfs_sdmmc_sync, .read_size = 512, .prog_size = 512, .block_size = 65536, .block_count = 128, .block_cycles = 100, .cache_size = 512, .lookahead_size = LITTLEFS_LOOKAHEAD_SIZE }; Among them, the first item (.context) is not used in this project, and is used in the original project to save the offset of the file system stored in Flash. Items two (.read) through five (.sync) point to the handlers for each operation. The sixth item (.read_size) is the smallest unit of read operation. This value is roughly equal to the BLOCK size of the SD card. In the SD card driver, this size has been fixed to 512. So for convenience, it is also set to 512. The seventh item (.prog_size) is the number of bytes written each time, which is 512 bytes like .read_size. The eighth item is .block_size. This can be considered to be the smallest erase block supported by the SD card when performing an erase operation. Here the default value is not important, you need to set it in the program according to the actual value after the SD card is initialized. The card used in this experiment is 64k bytes as an erase block, so 65536 is used directly here. Item 9 (.block_count) is used to indicate how many erasable blocks there are. Multiply the .block_size to get the size of the card. If the card is replaceable, it needs to be determined according to the parameters after the SD card is initialized. The tenth item (.block_cycles) is the number of erase cycles per block. Item 11 (.cache_size) is about the cache buffer. It feels like bigger is better, but actually modifies this value won't work. So still 512. Item 12 (lookahead_size), littleFS uses a lookahead buffer to manage and allocate blocks. A lookahead buffer is a fixed-size bitmap that records information about block allocations within an area. The lookahead buffer only records the information of block allocations in one area, and when you need to know the allocation of other regions, you need to scan the file system to find allocated blocks. If there are no free blocks in the lookahead buffer, you need to move the lookahead buffer to find other free blocks in the file system. The lookahead buffer position shifts one lookahead_size at a time. Use the original value here. That’s all for the porting work. We can test the project now. You can see it works fine. The littleFS-SD project can read/write/create folder and erase. And it also support append to an exist file. But after more testing, a problem was found, if you repeatedly add->-close->-add-> close a file, the file will open more and more slowly, even taking a few seconds. This is what should be added and is not written directly in the last block of the file, but will apply for a new block, regardless of whether the previous block is full or not. See figure below. The figure above prints out all the read, write, and erase operations used in each write command. You can see that each time in the lfs_file_open there is one more read than the last write operation. In this way, after dozens or hundreds of cycles, a file will involve many blocks. It is very time-consuming to read these blocks in turn. Tests found that more than 100 read took longer than seconds. To speed things up, it is recommended to copy the contents of one file to another file after adding it dozens of times. In this way, the scattered content will be consolidated to write a small number of blocks. This can greatly speed up reading and writing.

配置RT600开发环境 RT600开发入门培训视频。 https://www.nxp.com/document/guide/getting-started-with-i-mx-rt600-evaluation-kit:GS-MIMXRT685-EVK?&tid=vanGS-MIMXRT685-EVK#title2.1 下载I.MX RT600 SDK。下载链接: https://mcuxpresso.nxp.com/en/select?device=EVK-MIMXRT685 下载MCUXpresso IDE。注意需要安装MCUXpresso IDE 11.1.1及最新版本。https://www.nxp.com/webapp/swlicensing/sso/downloadSoftware.sp?catid=MCUXPRESSO 下载安装LPCScrypt，可以将默认板载的CMSIS-DAP固件升级改为J-LINK。通过J-LINK，可以下载调试HiFi4 DSP固件。下载链接https://www.nxp.com/design/microcontrollers-developer-resources/lpc-microcontroller-utilities/lpcscrypt-v2-1-1:LPCSCRYPT?&tab=Design_Tools_Tab 下载安装J-LINK驱动。下载链接https://www.segger.com/downloads/jlink/ 下载安装Cadence HiFi 4 DSP IDE for MIMXRT600。第一次下载，注册用户https://tensilicatools.com/register/。国内用户注册时，如果页面没有出现下面人机身份验证，说明IP被GW Firewall屏蔽了。需要通过代理或者其他特殊手段，否则用户注册将无法成功提交。下载HiFi DSP Development Tools for i.MX RT600开发工具。 https://tensilicatools.com/download/rt600-download-page/ 申请License for RT600 SDK。注意输入绑定网卡MAC地址时，需要去除中间‘：’等字符，否则提示失败。申请成功后，可以下载License文件。启动Xplorer 8.0.13后，在菜单Help -- Xplorer License Keys安装License文件。安装成功后显示如下： Xplorer下载调试器配置。将xt-ocd.exe所在目录加入到系统Path环境变量。使能”Use XOCD Manager”，指定Topology File 设置Download binary为Always，取消每次下载前都弹出提示框，节省下载时间。通过J-Link下载HiFi4 DSP固件，可以单步调试代码。

Source code: https://github.com/JayHeng/NXP-MCUBootUtility 【v2.0.0】 Features: > 1. Support i.MXRT5xx A0, i.MXRT6xx A0 > 支持i.MXRT5xx A0, i.MXRT6xx A0 > 2. Support i.MXRT1011, i.MXRT117x A0 > 支持i.MXRT1011, i.MXRT117x A0 > 3. [RTyyyy] Support OTFAD encryption secure boot case (SNVS Key, User Key) > [RTyyyy] 支持基于OTFAD实现的安全加密启动（唯一SNVS key，用户自定义key） > 4. [RTxxx] Support both UART and USB-HID ISP modes > [RTxxx] 支持UART和USB-HID两种串行编程方式（COM端口/USB设备自动识别） > 5. [RTxxx] Support for converting bare image into bootable image > [RTxxx] 支持将裸源image文件自动转换成i.MXRT能启动的Bootable image > 6. [RTxxx] Original image can be a bootable image (with FDCB) > [RTxxx] 用户输入的源程序文件可以包含i.MXRT启动头 (FDCB) > 7. [RTxxx] Support for loading bootable image into FlexSPI/QuadSPI NOR boot device > [RTxxx] 支持下载Bootable image进主动启动设备 - FlexSPI/QuadSPI NOR接口Flash > 8. [RTxxx] Support development boot case (Unsigned, CRC) > [RTxxx] 支持用于开发阶段的非安全加密启动（未签名，CRC校验） > 9. Add Execute action support for Flash Programmer > 在通用Flash编程器模式下增加执行(跳转)操作 > 10. [RTyyyy] Can show FlexRAM info in device status > [RTyyyy] 支持在device status里显示当前FlexRAM配置情况 Improvements: > 1. [RTyyyy] Improve stability of USB connection of i.MXRT105x board > [RTyyyy] 提高i.MXRT105x目标板USB连接稳定性 > 2. Can write/read RAM via Flash Programmer > 通用Flash编程器里也支持读写RAM > 3. [RTyyyy] Provide Flashloader resident option to adapt to different FlexRAM configurations > [RTyyyy] 提供Flashloader执行空间选项以适应不同的FlexRAM配置 Bugfixes: > 1. [RTyyyy] Sometimes tool will report error "xx.bat file cannot be found" > [RTyyyy] 有时候生成证书时会提示bat文件无法找到，导致证书无法生成 > 2. [RTyyyy] Editing mixed eFuse fields is not working as expected > [RTyyyy] 可视化方式去编辑混合eFuse区域并没有生效 > 3. [RTyyyy] Cannot support 32MB or larger LPSPI NOR/EEPROM device > [RTyyyy] 无法支持32MB及以上容量的LPSPI NOR/EEPROM设备 > 4. Cannot erase/read the last two pages of boot device via Flash Programmer > 在通用Flash编程器模式下无法擦除/读取外部启动设备的最后两个Page

The i.MX RT600 MCU includes a Cadence® Tensilica® HiFi 4 DSP running at frequencies of up to 600 MHz.The XOS embedded kernel from Cadence is designed for efficient operation on embedded system built using the Xtensa architecture. Although various parts of XOS continue to be tuned for efficient performance on the Xtensa hardware, most of the code is written in standard C and is not Xtensa-specific. Click here to access the full application note.

i.MX RT Crossover MCUs Knowledge Base

i.MX RT Crossover MCUs Knowledge Base

Discussions

Create a eIQ (TensorFlow Lite library) demo for i.MX RT6xx

RT600 MCUXpresso JLINK debug QSPI flash

RT600 4 I2S input to 1 TDM output solution

RT600/1170 UAC change OUT endpoint interval methods

Use LittleFS as SD card file system

88W8987/88W8977/IW416 SDK drive code issue

使用恩智浦微控制器轻松开发GUI

Some tips on how to setup RT600 development environment

Introduction to the i.MX RT600 Crossover MCU

NXP Tech Session - Advanced Industrial ML Applications are Now Possible with NXP MCUs

Application Note (AN12749): I2S Transmit and Receive on i.MX RT600 MCUs with HiFi 4

One-stop secure boot tool: NXP-MCUBootUtility v2.0.0 supports RT1011/RT117x/RT5xx/RT6xx

Application Note (AN12765): DSP Enablement for i.MX RT600 MCUs - XOS Use Cases

AN12789 - i.MX RT600 Dual-Core Communication and Debugging

Application Note (AN12751): How to Enable Recovery Boot from QSPI Flash for i.MX RT600 MCU

NXP Tech Session - Bringing Low Power, High Performance Audio and Voice to Market on the i.MX RT600 Crossover MCU

Application Note (AN12762): Develop Audio Player In HiFi 4 for i.MX RT600 MCU