i.MX RT Knowledge Base

cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

i.MX RT Knowledge Base

Labels

Discussions

Sort by:
The i.MXRT1060 provides tightly coupled GPIOs to be accessed with high frequency. RT1060 provides two sets of GPIOs registers to control pad output. GPIO1 to GPIO3 are general GPIOs, and GPIO6 to GPIO8 are tightly GPIOs, but they share the same pad, which means the gpio pin can select from GPIO1/2/3 or GPIO6/7/8. The registers IOMUXC_GPR_GPR26, IOMUXC_GPR_GPR27, and IOMUXC_GPR_GPR28 are for GPIO selection. To select the gpio pin between GPIO1/2/3 or GPIO6/7/8 you can use MCUXpresso Config Tools. For example, if you select pin G10 you can select either GPIOI_IO11 for normal GPIO or GPIO6_IO11 for fast GPIO.  I made an example based on the SDK v2.7.0 to compare the speed of Normal GPIO and Fast GPIO. For this, I used pin G10 (GPIOI_IO11 and GPIO6_IO11). Firstly, I used the normal GPIO pin (GPIOI_IO11). I will toggle the pin by writing directly to the GPIO_DR register. Notice that you can access this pin through J22 pin 3 in the evaluation board, so you can measure the performance of the pin. Here are the results: With the normal GPIO pin, we reach a period of 160ns when writing directly to the GPIO_DR register. Now, if we change to the fast GPIO and use the same instructions we have the following results. As you can see when using the fast GPIO pin, the period of the signal it's almost one-third of the period when using a normal GPIO. Now, The A1 silicon of the RT1060 has a new GPIO toggle feature. If we toggle the pin with the new register DR_TOGGLE instead of the GPIO_DR we will get better performance with both pins, normal GPIO and fast. Here are the results of the normal GPIO with the DR_TOGGLE register. As you can see when using the register DR_TOGGLE along with the normal GPIO pin we get a period of around 53 ns while when writing to the GPIO_DR register we got 160 ns. When using the register DR_TOGGLE and the fast GPIO we will get the best performance of the pin. Results are shown below. Many thanks to @jorge_a_vazquez for his valuable help with this document. Hope this helps! Best regards, Victor.
View full article
Introduction i.MX RT ROM bootloader provides a wealth of options to enable user programs to start in various ways. In some cases, people want to copy application image from Flash or other storage device to SDRAM and run there. In this article, I record three ways to realize this. Section 2 and 3 shows load image from NOR flash. Section 4 shows load image from SD card. Software and Tools: MCUXpresso IDE v11.1 NXP-MCUBootUtility 2.2.0   MIMXRT1060-EVK   RT1060 SDK v2.7.0   Win10   Add DCD by MCUxpresso IDE If customers use MCUXpresso to develop the project, they can add DCD head by MCUXpresso. To show the work flow, we take evkmimxrt1020_iled_blinky as the example. Step 1: Add the following to Compiler options: XIP_BOOT_HEADER_DCD_ENABLE=1 SKIP_SYSCLK_INIT Step 2: Modify the Memory Configuration Step 3: Select link application to RAM Step 4. Compile the project. MCUXpresso will generate linker script automatically. Step 5. Since the code should be linked to RAM, MCUXpresso will not prefix IVT and DCD. We can add these link information to linker script manually. Add below code to .ld file’s head.     .boot_hdr : ALIGN(4)     {         FILL(0xff)         __boot_hdr_start__ = ABSOLUTE(.) ;         KEEP(*(.boot_hdr.conf))         . = 0x1000 ;         KEEP(*(.boot_hdr.ivt))         . = 0x1020 ;         KEEP(*(.boot_hdr.boot_data))         . = 0x1030 ;         KEEP(*(.boot_hdr.dcd_data))         __boot_hdr_end__ = ABSOLUTE(.) ;         . = 0x2000 ;     } >BOARD_SDRAM   Then deselect “Manage linker script” in last screenshot. Step 6. Recompile the project, IVT/DCD/BOOT_DATA will be add to your project. Then right click the project axf file->Binary Utilities->Create S-record.   After all these step, you can open MCUBootUtility and download the .s19 file to NOR flash.   Add DCD by MCUBootUtility We can also keep the linker script managed by IDE. MCUBootUtility can add head too. Sometimes it is more flexible than other manners. Step 1. This time BOARD_SDRAM location should be changed to 0x80002000 while the size should be 0x1cff000. This is because the start 8k space in bootable image is saved for IVT and DCD. Step 2. compile the project and generate the .s19 file. Step 3. Open MCUBootUtility. In MCUBootUtility, we should first set the Device Configuration Data. Here I use MIMXRT1060_EVK. So I select the DCD bin file in NXP-MCUBootUtility-2.2.0\src\targets\MIMXRT1062. After that, select the application image file and click All-in-one Action button. MCUBootUtility can do all the work without any manual operation.   Boot from SD card to SDRAM In some application, customer don’t want XIP. They want to use SD card to keep application image and run the code in RAM. But if the code size is bigger than OCRAM size, they have to copy image into SDRAM when startup. With MCUBootUtility’s help, this work is very easy too.  User just need to change the memory map which is located to 0x80001000.   In MCUBootUtility, select the Boot Device to “uSDHC SD” and insert SD card. Then connect the target board. If RT1060 can read the SD card, it will display the SD card information. Then same as last section, set the DCD file and application image file. Click the All-in-One Action button, MCUBootUtility will generate the bootable image and write it to SD card.   SD card has huge capacity. It's too wasteful to only store boot image. People may ask that can they also create a FAT32 system and store more data file in it? Yes, but you need tool's help. When booting from SD card, ROM code read IVT and DCD from SD card address 0x400. To FAT32, the first 512 bytes in SD is for MBR(MAIN BOOT RECORD). Data in address 0x1c6 in MBR reords the partition start address. If the space from MBR to partition start address is big enough to store boot image, then FAT32 system and boot image can live in peace. .   Conclusions:      To help Boot ROM initialize SDRAM, DCD must be placed at right place and indexed by IVT correctly. When our code seems not work, we should first check IVT and DCD.          The IVT offset from the base address for each boot device type is defined in the table below. The location of the IVT is the only fixed requirement by the ROM. The remainder or the image memory map is flexible and is determined by the contents of the IVT.
View full article
Introduction  This document is an extension of section 3.1.3, “Software implementation” from the application note AN12077, using the i.MX RT FlexRAM. It's important that before continue reading this document, you read this application note carefully.  Link to the application note.  Section 3.1.3 of the application note explains how to reallocate the FlexRAM through software within the startup code of your application. This document will go into further detail on all the implications of making these modifications and what is the best way to do it.  Prerequisites RT10xx-EVK  The latest SDK which you can download from the following link: Welcome | MCUXpresso SDK Builder MCUXpresso IDE Internal SRAM  The amount of internal SRAM varies depending on the RT. In some cases, not all the internal SRAM can be reallocated with the FlexRAM.  RT  Internal SRAM FlexRAM RT1010 Up to 128 KB Up to 128 KB RT1015 Up to 128 KB Up to 128 KB RT1020 Up to 256 KB Up to 256 KB RT1050 Up to 512 KB Up to 512 KB RT1060 Up to 1MB  Up to 512 KB RT1064 Up to 1MB Up to 512 KB   In the case of the RT106x, only 512 KB out of the 1MB of internal SRAM can be reallocated through the FlexRAM as DTCM, ITCM, and OCRAM. The remaining 512 KB are from OCRAM and cannot be reallocated. For all the other RT10xx you can reallocate the whole internal SRAM either as DTCM, ITCM, and OCRAM. Section 3.1.3.1 of the application note explains the limitations of the size when reallocating the FlexRAM. One thing that's important to mention is that the ROM bootloader in all the RT10xx parts uses the OCRAM, hence you should keep some  OCRAM when reallocating the FlexRAM, this doesn't apply to the RT106x since you will always have the 512 KB of OCRAM that cannot be reallocated. To know more about how many OCRAM each RT family needs please refer to section 2.1.1.1 of the application note. Implementation in MCUXpresso IDE First, you need to import any of the SDK examples into your MCUXpresso IDE workspace. In my case, I imported the igpio_led_output example for the RT1050-EVKB. If you compile this project, you will see that the default configuration for the FlexRAM on the RT1050-EVKB is the following:  SRAM_DTC 128 KB SRAM_ITC 128 KB SRAM_OC 256 KB   Now we need to go to the Reset handler located in the file startup_mimxrt1052.c. Reallocating the FlexRAM has to be done before the FlexRAM is configured, this is why it's done inside the Reset Handler.  The registers that we need to modify to reallocate the FlexRAM are IOMUXC_GPR_GPR16, and IOMUXC_GPR_GPR17. So first we need to have in hand the addresses of these three registers. Register Address IOMUXC_GPR_GPR16 0x400AC040 IOMUXC_GPR_GPR17 0x400AC044   Now, we need to determine how we want to reallocate the FlexRAM to see the value that we need to load into register IOMUXC_GPR_GPR17. In my case, I want to have the following configuration:  SRAM_DTC 256 KB SRAM_ITC 128 KB SRAM_OC 128 KB   When choosing the new sizes of the FlexRAM be sure that you choose a configuration that you can also apply through the FlexRAM fuses, I will explain the reason for this later. The configurations that you can achieve through the fuses are shown in the Fusemap chapter of the reference manual in the table named "Fusemap Descriptions", the fuse name is "Default_FlexRAM_Part".  Based on the following explanation of the IOMUXC_GPR_GPR17 register: The value that I need to load to the register is 0xAAAAFF55. Where the first  4 banks correspond to the 128KB of SRAM_OC, the next 4 banks correspond to the 128KB of SRAM_ITC and the last 8 banks are the 256KB of SRAM_DTC.  Now, that we have all the addresses and the values that we need we can start writing the code in the Reset handler. The first thing to do is load the new value into the register IOMUXC_GPR_GPR17. After, we need to configure register IOMUXC_GPR_GPR16 to specify that the FlexRAM bank configuration should be taken from register IOMUXC_GPR_GPR17 instead of the fuses. Then if in your new configuration of the FlexRAM either the SRAM_DTC or SRAM_ITC are of size 0, you need to disable these memories in the register IOMUXC_GPR_GPR16. At the end your code should look like the following:    void ResetISR(void) { // Disable interrupts __asm volatile ("cpsid i"); /* Reallocating the FlexRAM */ __asm (".syntax unified\n" "LDR R0, =0x400ac044\n"//Address of register IOMUXC_GPR_GPR17 "LDR R1, =0xaaaaff55\n"//FlexRAM configuration DTC = 265KB, ITC = 128KB, OC = 128KB "STR R1,[R0]\n" "LDR R0,=0x400ac040\n"//Address of register IOMUXC_GPR_GPR16 "LDR R1,[R0]\n" "ORR R1,R1,#4\n"//The 4 corresponds to setting the FLEXRAM_BANK_CFG_SEL bit in register IOMUXC_GPR_GPR16 "STR R1,[R0]\n" #ifdef FLEXRAM_ITCM_ZERO_SIZE "LDR R0,=0x400ac040\n"//Address of register IOMUXC_GPR_GPR16 "LDR R1,[R0]\n" "AND R1,R1,#0xfffffffe\n"//Disabling SRAM_ITC in register IOMUXC_GPR_GPR16 "STR R1,[R0]\n" #endif #ifdef FLEXRAM_DTCM_ZERO_SIZE "LDR R0,=0x400ac040\n"//Address of register IOMUXC_GPR_GPR16 "LDR R1,[R0]\n" "AND R1,R1,#0xfffffffd\n"//Disabling SRAM_DTC in register IOMUXC_GPR_GPR16 "STR R1,[R0]\n" #endif ".syntax divided\n"); #if defined (__USE_CMSIS) // If __USE_CMSIS defined, then call CMSIS SystemInit code SystemInit(); ...‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍   If you compile your project you will see the memory distribution that appears on the console is still the default configuration.  This is because we did modify the Reset handler to reallocate the FlexRAM but we haven't modified the linker file to match these new sizes. To do this you need to go to the properties of your project. Once in the properties, you need to go to C/C++ Build -> MCU settings. Once you are in the MCU settings you need to modify the sizes of the SRAM memories to match the new configuration.  When you make these changes click Apply and Close. After making these changes if you compile the project you will see the memory distribution that appears in the console is now matching the new sizes.  Now we need to modify the Memory Protection Unit (MPU) to match these new sizes of the memories. To do this you need to go to the function BOARD_ConfigMPU inside the file board.c. Inside this function, you need to locate regions 5, 6, and 7 which correspond to SRAM_ITC, SRAM_DTC, and SRAM_OC respectively. Same as for register IOMUXC_GPR_GPR14, if the new size of your memory is not 32, 64, 128, 256, or 512 you need to choose the next greater number. Your configuration should look like the following:    /* Region 5 setting: Memory with Normal type, not shareable, outer/inner write back */ MPU->RBAR = ARM_MPU_RBAR(5, 0x00000000U); MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 0, 0, 1, 1, 0, ARM_MPU_REGION_SIZE_128KB); /* Region 6 setting: Memory with Normal type, not shareable, outer/inner write back */ MPU->RBAR = ARM_MPU_RBAR(6, 0x20000000U); MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 0, 0, 1, 1, 0, ARM_MPU_REGION_SIZE_256KB); /* Region 7 setting: Memory with Normal type, not shareable, outer/inner write back */ MPU->RBAR = ARM_MPU_RBAR(7, 0x20200000U); MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 0, 0, 1, 1, 0, ARM_MPU_REGION_SIZE_128KB);‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍   We need to change the image entry address to the Reset handler. To do this, you need to go to the file fsl_flexspi_nor_boot.c inside the xip folder. You need to declare the ResetISR and change the entry address in the image vector table.  Finally, we need to place the stack at the start of the DTCM memory. To do this, we need to go to the properties of your project. From there, we have to C/C++ Build and Manage Linker Script.  From there, we will need to add two more assembly instructions in our ResetISR function. We have to add these two instructions at the beginning of our assembly code:  In the attached c file, you'll find all the assembly instructions mentioned above.  That's it, these are all the changes that you need to make to reallocate the FlexRAM during the startup.  Debug Session  To verify that all the modifications that we just did were correct we will launch the debug session. As soon as we reach the main, before running the application, we will go to the peripheral view to see registers IOMUXC_GPR_GPR16, and IOMUXC_GPR_GPR17 and verify that the values are the correct ones. In register IOMUXC_GPR_GPR16 as shown in the image below we configure the FLEXRAM_BANK_CFG_SEL as 1 to use the use register IOMUXC_GPR_GPR17 to configure the FlexRAM.  Finally, in register IOMUXC_GPR_GPR17 we can see the value 0xAAAAFF55 that corresponds to the new configuration.  Reallocating the FlexRAM through the Fuses  We just saw how to reallocate the FlexRAM through software by writing some code in the Reset Handler. This procedure works fine, however, it's recommended that you use this approach to test the different sizes that you can configure but once you find the correct configuration for your application we highly recommend that you configure these new sizes through the fuses instead of using the register IOMUXC_GPR_GPR17. There are lots of dangerous areas in reconfiguring the FlexRAM in code. It pretty much all boils down to the fact that any code/data/stack information written to the RAM can end up changing location during the reallocation.  This is the reason why once you find the correct configuration, you should apply it through the fuses. If you use the fuses to configure the FlexRAM, then you don't have the same concerns about moving around code and data, as the fuse settings are applied as a hardware default.  Keep in mind that once you burn the fuses there's no way back! This is why it's important that you first try the configuration through the software method. Once you burn the fuses you won't need to modify the Reset handler, you only need to modify the MPU to change the size of regions as we saw before and the MCU settings of your project to match the new memory sizes that you configured through the fuses.  The fuse in charge of the FlexRAM configuration is Default_FlexRAM_Part, the address of this fuse is 0x6D0[15:13]. You can find more information about this fuse and the different configurations in the Fusemap chapter of the reference manual.  To burn the fuses I recommend using either the blhost or the MCUBootUtility.  Link to download the blhost.  Link to the MCUBootUtility webpage.    I hope you find this document helpful!  Víctor Jiménez 
View full article
Introduction The RT1064 is the only Crossover MCU from the RT family that comes with an on-chip flash, it has a 4MB Winbond W25Q32JV memory. It's important to remark that it's not an internal memory but an on-chip flash connected to the RT1064 through the FlexSPI2 interface. Having this flash eliminates the need of an external memory. Although the intended use of the on-chip flash is to store your application and do XiP, there might be cases where you would like to use some space of this memory as NVM. This document will explain how to modify one of the SDK example projects to achieve this.  Prerequisites RT1064-EVK  The latest SDK which you can download from the following link: Welcome | MCUXpresso SDK Builder. In this document, I will use MCUXpresso IDE but you can use the IDE of your preference. The modifications that you need to make are the same despite the IDE that you are using.  Importing the SDK example  The RT1064-EVK comes with two external flash memories, 512 Mb Hyper Flash and 64 Mb QSPI Flash. The RT1064-SDK includes two example projects that demonstrate how to use any of these two external flash memories as NVM. In this document, we will take as a base one of these examples to modify it to use the on-chip flash as NVM. Once you downloaded and imported the SDK into MCUXpresso IDE, you will need to import the example flexspi_nor_polling_transfer into your workspace.    Modifying the SDK example The external NOR Flash memory of the EVK is connected to the RT1064 through the FlexSPI1 interface. Due to this, the example project that we just imported initializes the FlexSPI1 interface pins. In our case, we want to use the on-chip flash that is connected through the FlexSPI2 instance, since we will boot from this memory, the ROM bootloader will configure the pins of this FlexSPI interface. So, in the function BOARD_InitPins we can delete all the pins that were related to the FlexSPI1 interface. At the end your function should look like the following:  void BOARD_InitPins(void) { CLOCK_EnableClock(kCLOCK_Iomuxc); /* iomuxc clock (iomuxc_clk_enable): 0x03u */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B0_12_LPUART1_TX, /* GPIO_AD_B0_12 is configured as LPUART1_TX */ 0U); /* Software Input On Field: Input Path is determined by functionality */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B0_13_LPUART1_RX, /* GPIO_AD_B0_13 is configured as LPUART1_RX */ 0U); /* Software Input On Field: Input Path is determined by functionality */ /* Software Input On Field: Force input path of pad GPIO_SD_B1_11 */ IOMUXC_SetPinConfig( IOMUXC_GPIO_AD_B0_12_LPUART1_TX, /* GPIO_AD_B0_12 PAD functional properties : */ 0x10B0u); /* Slew Rate Field: Slow Slew Rate Drive Strength Field: R0/6 Speed Field: medium(100MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_AD_B0_13_LPUART1_RX, /* GPIO_AD_B0_13 PAD functional properties : */ 0x10B0u); /* Slew Rate Field: Slow Slew Rate Drive Strength Field: R0/6 Speed Field: medium(100MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ } ‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍ Function flexspi_nor_flash_init initializes the FlexSPI interface but in our case, the ROM bootloader already did this for us. So we need to make some modifications to this function as well. The only things that we will need from this function are to update the LUT and to do a software reset of the FlexSPI2 interface. At the end your function should look like the following:  void flexspi_nor_flash_init(FLEXSPI_Type *base) { /* Update LUT table. */ FLEXSPI_UpdateLUT(base, 0, customLUT, CUSTOM_LUT_LENGTH); /* Do software reset. */ FLEXSPI_SoftwareReset(base); } ‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍ Inside the file app.h, we need to change some macros since now we will use a different memory and a different FlexSPI interface. The following macros are the ones that you need to modify:  Macro Before After Observations EXAMPLE_FLEXPI FLEXSPI FLEXSPI2 On-chip flash is connected through FlexSPI2 interface  FLASH_SIZE 0x2000 0x200 The size of the on-chip flash is different from the external NOR flash  EXAMPLE_FLEXSPI_AMBA_BASE FlexSPI_AMBA_BASE FlexSPI2_AMBA_BASE On-chip flash is connected through FlexSPI2 interface  EXAMPLE_SECTOR 4 100 The size of the on-chip flash is less than the external NOR flash, hence we will decrease the sector size to avoid erasing our application  Everything else in this file remains the same.  Finally, there are two changes that we need to make in the customLUT. To understand these changes you need to analyze the datasheets of the memories. But in the end, the two modifications are the followings:  Erase sector command of the on-chip flash is 0x20 instead of 0xD7.  Exchange the FLEXSPI command index for NOR_CMD_LUT_SEQ_IDX_READ_FAST_QUAD and NOR_CMD_LUT_SEQ_IDX_READ_NORMAL to align with XIP settings. At the end your customLUT should look like the following:  const uint32_t customLUT[CUSTOM_LUT_LENGTH] = { /* Fast read quad mode - SDR */ [4 * NOR_CMD_LUT_SEQ_IDX_READ_FAST_QUAD] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x6B, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18), [4 * NOR_CMD_LUT_SEQ_IDX_READ_FAST_QUAD + 1] = FLEXSPI_LUT_SEQ( kFLEXSPI_Command_DUMMY_SDR, kFLEXSPI_4PAD, 0x08, kFLEXSPI_Command_READ_SDR, kFLEXSPI_4PAD, 0x04), /* Fast read mode - SDR */ [4 * NOR_CMD_LUT_SEQ_IDX_READ_FAST] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x0B, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18), [4 * NOR_CMD_LUT_SEQ_IDX_READ_FAST + 1] = FLEXSPI_LUT_SEQ( kFLEXSPI_Command_DUMMY_SDR, kFLEXSPI_1PAD, 0x08, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04), /* Normal read mode -SDR */ [4 * NOR_CMD_LUT_SEQ_IDX_READ_NORMAL] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x03, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18), [4 * NOR_CMD_LUT_SEQ_IDX_READ_NORMAL + 1] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0), /* Read extend parameters */ [4 * NOR_CMD_LUT_SEQ_IDX_READSTATUS] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x81, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04), /* Write Enable */ [4 * NOR_CMD_LUT_SEQ_IDX_WRITEENABLE] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x06, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0), /* Erase Sector */ [4 * NOR_CMD_LUT_SEQ_IDX_ERASESECTOR] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x20, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18), /* Page Program - single mode */ [4 * NOR_CMD_LUT_SEQ_IDX_PAGEPROGRAM_SINGLE] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x02, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18), [4 * NOR_CMD_LUT_SEQ_IDX_PAGEPROGRAM_SINGLE + 1] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_WRITE_SDR, kFLEXSPI_1PAD, 0x04, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0), /* Page Program - quad mode */ [4 * NOR_CMD_LUT_SEQ_IDX_PAGEPROGRAM_QUAD] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x32, kFLEXSPI_Command_RADDR_SDR, kFLEXSPI_1PAD, 0x18), [4 * NOR_CMD_LUT_SEQ_IDX_PAGEPROGRAM_QUAD + 1] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_WRITE_SDR, kFLEXSPI_4PAD, 0x04, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0), /* Read ID */ [4 * NOR_CMD_LUT_SEQ_IDX_READID] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x9F, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04), /* Enable Quad mode */ [4 * NOR_CMD_LUT_SEQ_IDX_WRITESTATUSREG] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x01, kFLEXSPI_Command_WRITE_SDR, kFLEXSPI_1PAD, 0x04), /* Enter QPI mode */ [4 * NOR_CMD_LUT_SEQ_IDX_ENTERQPI] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x35, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0), /* Exit QPI mode */ [4 * NOR_CMD_LUT_SEQ_IDX_EXITQPI] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_4PAD, 0xF5, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0), /* Read status register */ [4 * NOR_CMD_LUT_SEQ_IDX_READSTATUSREG] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0x05, kFLEXSPI_Command_READ_SDR, kFLEXSPI_1PAD, 0x04), /* Erase whole chip */ [4 * NOR_CMD_LUT_SEQ_IDX_ERASECHIP] = FLEXSPI_LUT_SEQ(kFLEXSPI_Command_SDR, kFLEXSPI_1PAD, 0xC7, kFLEXSPI_Command_STOP, kFLEXSPI_1PAD, 0), }; ‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍ These are all the changes that you need to make to the flexspi_nor_polling_transfer to use the on-chip flash as NVM instead of the external NOR flash.  Important notes It's important to mention that you cannot write, erase or read the on-chip flash while making XiP from this memory. So, you need to reallocate all the instructions that read, write and erase the flash to the internal FlexRAM. Fortunately, the example that we use as a base (flexspi_nor_polling_transfer) already does this. This is accomplished thanks to the files located in the linkscritps folder. To learn more about how this works, please refer to section 17.15 of the MCUXpresso IDE User Guide.  Additional resources Datasheet of the on-chip memory (Winbond W25Q32JV).  MCUXpresso IDE User Guide.  I hope that you find this document useful!  Regards,  Victor 
View full article
In the SDK_2.7.0_EVKB-IMXRT1050, it contains some eIQ machine learning demo projects, there's the tensorflow_lite_kws among them. It's a keyword spotting example that is based on Keyword spotting for Microcontrollers and it deploys a deepwise separable convolutional neural network called MobileNet in this demo project. It can classify a one-second audio clip as either silence, an unknown word, "yes", "no", "up", "down", "left", "right", "on", "off", "stop", or "go". Figure 1 shows the components that comprise it. Fig 1 Training Our New Model The model we are using is trained with the TensorFlow script which is designed to demonstrate how to build and train a model for audio recognition using TensorFlow. The script makes it very easy to train an audio recognition model. Among other things, it allows us to do the following: Download a dataset with audio featuring 20 spoken words. Choose which subset of words to train the model on. Specify what type of preprocessing to use on the audio. Choose from several different types of the model architecture. Optimize the model for microcontrollers using quantization. When we run the script, it downloads the dataset, trains a model, and outputs a file representing the trained model. We then use some other tools to convert this file into the correct form for TensorFlow Lite. Training in virtual machine (VM) Preparation Make sure the TensorFlow has been installed, and since the script downloads over 2GB of training data, it'll need a good internet connection and enough free space on the machine. Note that: The training process itself can take several hours, be patient. Training To begin the training process, use the following commands to clone ML-KWS-for-MCU. git clone https://github.com/ARM-software/ML-KWS-for-MCU.git‍‍‍‍‍‍ The training scripts are configured via a bunch of command-line flags that control everything from the model’s architecture to the words it will be trained to classify. The following command runs the script that begins training. You can see that it has a lot of command-line arguments: python ML-KWS-for-MCU/train.py --model_architecture ds_cnn --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 \ --wanted_words=zero, one, two, three, four, five, six, seven, eight, nine \ --dct_coefficient_count 10 --window_size_ms 40 \ --window_stride_ms 20 --learning_rate 0.0005,0.0001,0.00002 \ --how_many_training_steps 10000,10000,10000 \ --data_dir=./speech_dataset --summaries_dir ./retrain_logs --train_dir ./speech_commands_train ‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍ Some of these, like --wanted_words=zero, one, two, three, four, five, six, seven, eight, nine. By default, the selected words are yes, no, up, down, left, right, on, off, stop, go, but we can provide any combination of the following words, all of which appear in our dataset: Common commands: yes, no, up, down, left, right, on, off, stop, go, backward, forward, follow, learn Digits zero through nine: zero, one, two, three, four, five, six, seven, eight, nine Random words: bed, bird, cat, dog, happy, house, Marvin, Sheila, tree, wow Others set up the output of the script, such as --train_dir=/content/speech_commands_train, which defines where the trained model will be saved. Leave the arguments as they are, and run it. The script will start off by downloading the Speech Commands dataset (Figure 2), which consists of over 105,000 WAVE audio files of people saying thirty different words. This data was collected by Google and released under a CC BY license, and you can help improve it by contributing five minutes of your own voice. The archive is over 2GB, so this part may take a while, but you should see progress logs, and once it's been downloaded once you won't need to do this step again. You can find more information about this dataset in this Speech Commands paper. Fig 2 Once the downloading has completed, some more output will appear. There might be some warnings, which you can ignore as long as the command continues running. Later, you'll see logging information that looks like this (Figure 3). Fig 3 This shows that the initialization process is done and the training loop has begun. You'll see that it outputs information for every training step. Here's a break down of what it means: Step shows that we're on the step of the training loop. In this case, there are going to be 30,000 steps in total, so you can look at the step number to get an idea of how close it is to finishing. rate is the learning rate that's controlling the speed of the network's weight updates. Early on this is a comparatively high number (0.0005), but for later training cycles it will be reduced 5x, to 0.0001, then to 0.00002 at last. accuracy is how many classes were correctly predicted on this training step. This value will often fluctuate a lot, but should increase on average as training progresses. The model outputs an array of numbers, one for each label, and each number is the predicted likelihood of the input being that class. The predicted label is picked by choosing the entry with the highest score. The scores are always between zero and one, with higher values representing more confidence in the result. cross-entropy is the result of the loss function that we're using to guide the training process. This is a score that's obtained by comparing the vector of scores from the current training run to the correct labels, and this should trend downwards during training. checkpoint After a hundred steps, you should see a line like this: This is saving out the current trained weights to a checkpoint file (Figure 4). If your training script gets interrupted, you can look for the last saved checkpoint and then restart the script with --start_checkpoint=/tmp/speech_commands_train/best/ds_cnn_xxxx.ckpt-400 as a command line argument to start from that point . Fig 4 Confusion Matrix After four hundred steps, this information will be logged: The first section is a confusion matrix. To understand what it means, you first need to know the labels being used, which in this case are "silence", "unknown", "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", and "nine". Each column represents a set of samples that were predicted to be each label, so the first column represents all the clips that were predicted to be silence, the second all those that were predicted to be unknown words, the third "zero", and so on. Each row represents clips by their correct, ground truth labels. The first row is all the clips that were silence, the second clips that were unknown words, the third "zero", etc. This matrix can be more useful than just a single accuracy score because it gives a good summary of what mistakes the network is making. In this example you can see that all of the entries in the first row are zero (Figure 5), apart from the initial one. Because the first row is all the clips that are actually silence, this means that none of them were mistakenly labeled as words, so we have no false negatives for silence. This shows the network is already getting pretty good at distinguishing silence from words. If we look down the first column though, we see a lot of non-zero values. The column represents all the clips that were predicted to be silence, so positive numbers outside of the first cell are errors. This means that some clips of real spoken words are actually being predicted to be silence, so we do have quite a few false positives. A perfect model would produce a confusion matrix where all of the entries were zero apart from a diagonal line through the center. Spotting deviations from that pattern can help you figure out how the model is most easily confused, and once you've identified the problems you can address them by adding more data or cleaning up categories.                                                            Fig 5                                                             Validation After the confusion matrix, you should see a line like Figure 5 shows. It's good practice to separate your data set into three categories. The largest (in this case roughly 80% of the data) is used for training the network, a smaller set (10% here, known as "validation") is reserved for evaluation of the accuracy during training, and another set (the last 10%, "testing") is used to evaluate the accuracy once after the training is complete. The reason for this split is that there's always a danger that networks will start memorizing their inputs during training. By keeping the validation set separate, you can ensure that the model works with data it's never seen before. The testing set is an additional safeguard to make sure that you haven't just been tweaking your model in a way that happens to work for both the training and validation sets, but not a broader range of inputs. The training script automatically separates the data set into these three categories, and the logging line above shows the accuracy of model when run on the validation set. Ideally, this should stick fairly close to the training accuracy. If the training accuracy increases but the validation doesn't, that's a sign that overfitting is occurring, and your model is only learning things about the training clips, not broader patterns that generalize Training Finished In general, training is the process of iteratively tweaking a model’s weights and biases until it produces useful predictions. The training script writes these weights and biases to checkpoint files (Figure 6). Fig 6 A TensorFlow model consists of two main things: The weights and biases resulting from training A graph of operations that combine the model’s input with these weights and biases to produce the model’s output At this juncture, our model’s operations are defined in the Python scripts, and its trained weights and biases are in the most recent checkpoint file. We need to unite the two into a single model file with a specific format, which we can use to run inference. The process of creating this model file is called freezing—we’re creating a static representation of the graph with the weights frozen into it. To freeze our model, we run a script that is called as follows: python ML-KWS-for-MCU/freeze.py --model_architecture ds_cnn --model_size_info 5 64 10 4 2 2 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 64 3 3 1 1 \ --wanted_words=zero, one, two, three, four, five, six, seven, eight, nine \ --dct_coefficient_count 10 --window_size_ms 40 \ --window_stride_ms 20 --checkpoint ./speech_commands_train/best/ds_cnn_9490.ckpt-21600 \ --output_file=./ds_cnn.pb‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍ To point the script toward the correct graph of operations to freeze, we pass some of the same arguments we used in training. We also pass a path to the final checkpoint file, which is the one whose filename ends with the total number of training steps. The frozen graph will be output to a file named ds_cnn.pb. This file is the fully trained TensorFlow model. It can be loaded by TensorFlow and used to run inference. That’s great, but it’s still in the format used by regular TensorFlow, not TensorFlow Lite. Convert to TensorFlow Lite Conversion is a easy step: we just need to run a single command. Now that we have a frozen graph file to work with, we’ll be using toco, the command-line interface for the TensorFlow Lite converter. toco --graph_def_file=./ds_cnn.pb --output_file=./ds_cnn.tflite \ --input_shapes=1,49,10,1 --input_arrays=Reshape_1 --output_arrays='labels_softmax' \ --inference_type=QUANTIZED_UINT8 --mean_values=227 --std_dev_values=1 \ --change_concat_input_ranges=false \ --default_ranges_min=-6 --default_ranges_max =6‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍ In the arguments, we specify the model that we want to convert, the output location for the TensorFlow Lite model file, and some other values that depend on the model architecture. we also provide some arguments (inference_type, mean_values, and std_dev_values) that instruct the converter how to map its low-precision values into real numbers. The converted model will be written to ds_cnn.tflite, this a fully formed TensorFlow Lite model! Create a C array We’ll use the xxd command to convert a TensorFlow Lite model into a C array in the following. xxd -i ./ds_cnn.tflite > ./ds_cnn.h cat ./ds_cnn.h‍‍‍‍‍‍‍‍ The final part of the output is the file’s contents, which are a C array and an integer holding its length, as follows: Fig 7 Next, we’ll integrate this newly trained model with the tensorflow_lite_kws project. Using the Model in tensorflow_lite_kws Project To use the new model, we need to do two things: In source/ds_cnn_s_model.h, replace the original model data with our new model. Update the label names in source/kws.cpp with our new ''zero'', ''one'', ''two'', ''three'', ''four'', ''five'', ''six'', ''seven'', ''eight'' and ''nine'' labels. const std::string labels[] = {"Silence", "Unknown","zero", "one", "two", "three","four", "five", "six", "seven","eight", "nine"};‍‍‍ Before running the model in the EVKB-IMXRT1050 board (Figure 8), please refer to the readme.txt to do the preparation, in further, the file also demonstrates the steps of testing, please follow them. Fig 8 Figure 9 shows the testing I did, I've attached the model file, please give a try by yourself. Fig 9
View full article
1 Introduction RT1060 MCUXpresso SDK provide the ota_bootloader project, download link: https://mcuxpresso.nxp.com/en/welcome ota_bootloader path: SDK_2_10_0_EVK-MIMXRT1060\boards\evkmimxrt1060\bootloader_examples\ota_bootloader  this ota_bootloader can let the customer realize the on board app update, the RT1060 OTA bootloader mainly have the two functions: ISP method update APP Provide the API to the customer, can realize the different APP swap and rollback function ISP method is from the flashloader, the flashloader put the code in the internal RAM, when use it, normally RT chip need to enter the serial download mode, and use the sdphost download the flashloader to the internal RAM, then run it to realize the ISP function. But OTA_bootloader will put the code in the flash directly, RT chip can run it in the internal boot mode directly, each time after chip reset, the code will run the ota_bootloader at first, this time, customer can use the blhost to communicate the RT chip with UART/USB HID directly to download the APP code. So, to the ISP function in the OTA_bootloader, customer also can use it as the ISP secondary bootloader to update the APP directly. Then reset after the 5 seconds timeout, if the APP area is valid, code will jump to the APP and run APP. To the ota_bootloader swap and rollback function, customer can use the provided API in the APP directly to realize the APP update, run the different APP, swap and rollback to the old APP.    This document will give the details about how to use the ota_bootloader ISP function to update the APP, how to modify the customer application to match the ota_bootloader, how to resolve the customer APP issues which need to use the SDRAM with the ota_bootloader, how to use the ota_bootloader API to realize the swap and rollback function, and the related APP prepare. 2 OTA bootloader ISP usage Some customers need the ISP secondary bootloader function, as the flashloader need to put in the internal RAM, so customer can use the ota_bootloader ISP function realize the secondary bootloader requirement, to this method, customer mainly need to note two points: 1) APP need to add the ota_header to meet the ota_bootloader demand. 2) APP use the SDRAM, ota_bootloader need to add DCD memory  2.1 application modification OTA_bootloader located in the flash from 0x60000000, APP locate from 0x60040000. We can find this from the ota_bootloader bootloader_config.h file: #define BL_APP_VECTOR_TABLE_ADDRESS (0x60040000u) From 0x60040000, the first 0X400 should put the ota_header, then the real APP code is put from 0x60040400. Now, take the SDK evkmimxrt1060_iled_blinky project as an example, to modify it to match the ota_bootloader. Application code need to add these files: ota_bootloader_hdr.c, ota_bootloader_board.h, ota_bootloader_supp.c, ota_bootloader_supp.h These files can be found from the evkmimxrt1060_lwip_httpssrv_ota project, put the above files in the led_blinky source folder. 2.1.1 memory modification    APP memory start address modify from 0x60000000 to 0x60040000, which is the ota_bootloader defined address。 Fig 1 2.1.2 ota_hdr related files add   In the led_blinky project source folder, add the mentioned ota_bootloader related files. Fig 2 2.1.3 linker file modification In the evkmimxrt1060_iled_blinky_debug.ld, add boot_hdr, length is 0X400. We can use the MCUXpresso IDE linkscripts folder, add the .ldt files to modify the linker file. Here, in the project, add one linkscripts folder, add the boot_hdr_MIMXRT1060.ldt file, build . Fig 3 After the build, we can find the ld file already add the ota_header in the first 0x400 range. Fig 4 The above is the generated evkmimxrt1060_iled_blinky_ota_0x60040000.bin file, we can find already contains the boot_hdr. From offset 0x400, will put the real APP code. 2.2 ISP test related command Test board is MIMXRT1060-EVK, download the evkmimxrt1060_ota_bootloader project at first, can use the mcuxpresso IDE download it directly, then press the reset button, put the blhost and evkmimxrt1060_iled_blinky_ota_0x60040000.bin in the same folder, use the following command:   blhost.exe -t 50000 -u 0x15a2,0x0073 -j -- get-property 1 0 blhost.exe -t 2048000 -u 0x15a2,0x0073 -j -- flash-erase-region 0x60040000 0x6000 9 blhost.exe -t 5242000 -u 0x15a2,0x0073 -j -- write-memory 0x60040000 evkmimxrt1060_iled_blinky_ota_0x60040000.bin blhost.exe -t 5242000 -u 0x15a2,0x0073 -j -- read-memory 0x60040000 0x6000 flexspiNorCfg.dat 9 Fig 5 After the download is finished, press the reset, wait 5 seconds, it will jump to the app, we can find the MIMXRT1060-EVK on board led is blinking. This method realize the flash ISP bootloader downloading and the app working. The above command is using the USB HID to download, if need to use the UART, also can use command like this: blhost.exe -t 50000 -p COM45,19200 -j -- get-property 1 0 blhost.exe -t 50000 -p COM45,19200 -j -- flash-erase-region 0x60040000 0x6000 9 blhost.exe -t 50000 -p COM45,19200 -j -- write-memory 0x60040000 evkmimxrt1060_iled_blinky_ota_0x60040000.bin blhost.exe -t 50000 -p COM45,19200 -j -- read-memory 0x60040000 0x6000 flexspiNorCfg.dat 9 To the UART communication, as the ota_bootloader auto baud detect issues, it’s better to use the baudrate not larger than 19200bps, or in the ota_bootloader code, define the fixed baudrate, eg, 115200. 2.3 Bootloader DCD consideration Some customer use the ota_bootloader to associate with their own application, which is used the external SDRAM, and find although the downloading works, but after boot, the app didn’t run successfully. In the APP code, it add the DCD configuration, but as the ota_bootloader already put in the front of the QSPI, app will use the ota_hdr, which will delete the dcd part, so, to this situation, customer can add the dcd in the ota_bootloader. The SDK ota_bootloader didn’t add the dcd part in default. Now, modify the ota_bootloader at first, then prepare the sdram app and test it. 2.3.1 Ota_bootloader add dcd 2.3.1.1 add dcd files   ota_bootloader add dcd.c dcd.h, these two files can be found from the SDK hello_world project. Copy these two files to the ota_bootloader board folder. From the dcd.c code, we can see, dcd_data[] array is put in the “.boot_hdr.dcd_data” area, and it need to define the preprocessor: XIP_BOOT_HEADER_DCD_ENABLE=1 2.3.1.2 add dcd linker code   Modify the mcuxpresso IDE ld files, and add the dcd range.    .ivt : AT(ivt_begin) { . = 0x0000 ; KEEP(* (.boot_hdr.ivt)) /* ivt section */ . = 0x0020 ; KEEP(* (.boot_hdr.boot_data)) /* boot section */ . = 0x0030 ; KEEP(*(.boot_hdr.dcd_data)) __boot_hdr_end__ = ABSOLUTE(.) ; . = 0x1000 ; } > m_ivt Fig 6 0x60001000 : IVT 0x6000100c : DCD entry point 0x60001020 : boot data 0x60001030 : DCD detail data 2.3.1.3 add IVT dcd entry point   ota_bootloader->MIMXRT1062 folder->hardware_init_MIMXRT1062.c,modify the image_vector_table, fill the DCD address to the dcdc_data array address as the dcd entry point. Fig 7 Until now, we finish the ota_bootloader dcd add, build the project, generate the image, we can find, DCD already be added to the ota_bootloader image. Fig 8 DCD entry point in the IVT and the DCD data is correct, we can burn this modified ota_bootloader to the MIMXRT1060-EVK board. 2.3.2 SDRAM app prepare Still based on the evkmimxrt1060_iled_blinky project, just put some function in the SDRAM. From memory, we can find, the SRAM is in the RAM4: Fig 9 In the led_blinky.c, add this header: #include <cr_section_macros.h> Then, put the systick delay code to the RAM4 which is the SDRAM area. __RAMFUNC(RAM4) void SysTick_DelayTicks(uint32_t n) { g_systickCounter = n; while (g_systickCounter != 0U) { } } Now, we already finish the simple SDRAM app, we also can test it, and put the breakpoint in the SysTick_DelayTicks, we can find the address is also the SDRAM related address, generate the evkmimxrt1060_iled_blinky1_SDRAM_0x60040000.bin. If use the old ota_bootloader to download this .bin, we can find the led is not blinking. 2.3.3 Test result Command refer to chapter 2.2 ISP test related command, download the evkmimxrt1060_iled_blinky1_SDRAM_0x60040000.bin with the new modified ota_bootloader project, we can find, after reset, the led will blink. So the SDRAM app works with the modified ota_bootloader. 3 OTA bootloader swap and rollback OTA bootloader can realize the swap and rollback function, this part will test the ota_bootloader swap and rollback, prepare two apps and download to the different partition area, then use the UART input char to select the swap or rollback function, to check the which app is running. 3.1  memory map ota bootloader memory information. Fig 10 The above map is based on the external 8Mbyte QSPI flash. OTA bootloader: RT SDK ota bootloader code Boot meta 0: contains 3 partition start address, size etc. information, ISP peripheral information. Boot meta 1: contains 3 partition start address, size etc. information, ISP peripheral information. Swap meta 0: bootloader will use meta data to do the swap operation Swap meta 1: bootloader will use meta data to do the swap operation Partition 1: APP1 location Partition 2: APP2 location Scratch part: APP1 backup location, start point is before 0x60441000, which is enough to put APP1 and multiple sector size, eg, APP1 is 0X5410, sector size is 0x1000, APP1 need 6 sectors, so the scratch start address is 0x60441000-0x6000=0x43b000. User data: user used data area 3.2 swap and rollback basic Fig11  The APP1 and APP2 put in the partition1 and partition2 need to contains the ota_header which meet the bootloader demand, bootloader will check the APP CRC, if it passed, then will boot the app, otherwise, it will enter the ISP mode. Partition2 image need to has the valid header, otherwise, swap will be failed.   Swap function will erase partition2 scratch area, then put the partition1 code to the scratch, erase the partition 1 position, and write the partition 2 image to the partition1 position.   Rollback function will run the previous APP1, erase the partition 2 position, copy the parititon 1 image back to the partition 2, erase partition 1 position, copy partition 2 scratch image back to partition 1. 3.2.1 boot meta boot_meta 0: 0x0x6003c000 size: 0x20c boot_meta 1: 0x0x6003d000 size: 0x20c   OTA bootloader can read boot meta from the two different address, when the two address meta are valid(tag is 'B', 'L', 'M', 'T'), bootloader will choose the bigger version meta. If both meta is not valid, bootloader will copy default boot meta data to the boot meta 0 address.   Boot meta contains 3 partition start address, size information. SDK demo can call bootloader API to find the partition information, then do the image program.   Boot meta also contains the ISP peripheral information, the timeout(5s) information, the structure is: //!@brief Partition information table definitions typedef struct { uint32_t start; //!< Start address of the partition uint32_t size; //!< Size of the partition uint32_t image_state; //!< Active/ReadyForTest/UnderTest uint32_t attribute; //!< Partition Attribute - Defined for futher use uint32_t reserved[12]; //!< Reserved for future use } partition_t; //!@brief Bootloader meta data structure typedef struct { struct { uint32_t wdTimeout; uint32_t periphDetectTimeout; uint32_t enabledPeripherals; uint32_t reserved[12]; } features; partition_t partition[kPartition_Max];//16*4*7 bytes uint32_t meta_version; uint32_t patition_entries; uint32_t reserved0; uint32_t tag; } bootloader_meta_t;   3.2.2 swap meta Swap meta 0 : 0x6003e000, size 0x50 Swap meta 1 : 0x6003f000, size 0x50    OTA bootloader will read the swap meta from 2 different place, if it is not valid, bootloader will set the default data to swap meta 0(0x6003e000). If both image are valid, bootloader will choose the bigger version meta.    Bootloader will refer to the meta data to do the swap operation, sometimes, after reset, meta data will be modified automatically. It mainly relay on the swap_type: kSwapType_ReadyForTest :After reset, do swap operation. Modify meta swap_type to kSwapType_Test. Reset, as the meta data is kSwapType_Test, bootloader can do the rollback. kSwapType_Test : After reset, do rollback after operation, swap_type change to kSwapType_None.  kSwapType_Rollback bootloader will write kSwapType_Test to the meta, after reset, bootloader will refer to kSwapType_Test to do the operation.  kSwapType_Permanent After reset, modify meta data to kSwapType_Permanent, then the APP will boot with partition 1. Swap structure: //!@brief Swap progress definitions typedef struct { uint32_t swap_offset; //!< Current swap offset uint32_t scratch_size; //!< The scratch area size during current swapping uint32_t swap_status; // 1 : A -> B scratch, 2 : B -> A uint32_t remaining_size; //!< Remaining size to be swapped } swap_progress_t; typedef struct { uint32_t size; uint32_t active_flag; } image_info_t; //!@brief Swap meta information, used for the swapping operation typedef struct { image_info_t image_info[2]; //!< Image info table #if !defined(BL_FEATURE_HARDWARE_SWAP_SUPPORTED) || (BL_FEATURE_HARDWARE_SWAP_SUPPORTED == 0) swap_progress_t swap_progress; //!< Swap progress #endif uint32_t swap_type; //!< Swap type uint32_t copy_status; //!< Copy status uint32_t confirm_info; //!< Confirm Information uint32_t meta_version; //!< Meta version uint32_t reserved[7]; //!< Reserved for future use uint32_t tag; } swap_meta_t; 3.3 Common used API Bootloader provide the API for the customer to use it, the common used API are: 3.3.1 update_image_state   update swap meta data, before update, it will check the partition1 image valid or not, if image is not valid, swap meta data will not be updated, and return failure. 3.3.2 get_update_partition_info   get partition information, then define the image program address. 3.3.3 get_image_state   get the current boot image status. None/permanent/UnderTest 3.4  Swap rollback APP prepare Prepare two APPs: APP1 and APP2 bin file, and use the USB HID download to the partition1 and paritition 2. After reset, run APP1 in default, then use the COM input to select the swap and rollback function. Code is: int main(void) { char ch; status_t status; /* Board pin init */ BOARD_InitPins(); BOARD_InitBootClocks(); /* Update the core clock */ SystemCoreClockUpdate(); BOARD_InitDebugConsole(); PRINTF("\r\n------------------hello world + led blinky demo 2.------------------\r\n"); PRINTF("\r\nOTA bootloader test...\r\n" "1 - ReadyForTest\r\n" "3 - kSwapType_Permanent\r\n" "4 - kSwapType_Rollback\r\n" "5 - show image state\r\n" "6 - led blinking for 5times\r\n" "r - NVIC reset\r\n"); // show swap state in swap meta get_image_swap_state(); /* Set systick reload value to generate 1ms interrupt */ if (SysTick_Config(SystemCoreClock / 1000U)) { while (1) { } } while(1) { ch = GETCHAR(); switch(ch) { case '1': status = bl_update_image_state(kSwapType_ReadyForTest); PRINTF("update_image_state to kSwapType_ReadyForTest status: %i\n", status); if (status != 0) PRINTF("update_image_state(kSwapType_ReadyForTest): failed\n"); else NVIC_SystemReset(); break; case '3': status = bl_update_image_state(kSwapType_Permanent); PRINTF("update_image_state to kSwapType_Permanent status: %i\n", status); if (status != 0) PRINTF("update_image_state(kSwapType_Permanent): failed\n"); else NVIC_SystemReset(); break; case '4': status = bl_update_image_state(4); // PRINTF("update_image_state to kSwapType_Rollback status: %i\n", status); if (status != 0) PRINTF("update_image_state(kSwapType_Rollback): failed\n"); else NVIC_SystemReset(); break; case '5': // show swap state in swap meta get_image_swap_state(); break; case '6': Led_blink10times(); break; case 'r': NVIC_SystemReset(); break; } } } When download the APP, need to use the correct ota_header in the first 0x400 area, otherwise, swap will failed. const boot_image_header_t ota_header = { .tag = IMG_HDR_TAG, .load_addr = ((uint32_t)&ota_header) + BL_IMG_HEADER_SIZE, .image_type = IMG_TYPE_XIP, .image_size = 0, .algorithm = IMG_CHK_ALG_CRC32, .header_size = BL_IMG_HEADER_SIZE, .image_version = 0, .checksum = {0xFFFFFFFF}, }; This is a correct sample: Fig 12 Image size and checksum need to use the real APP image information, in the attached file, provide one image_header_padding.exe, it can input the none ota header image, then it will output the whole image which add the ota_header contains the image size and the image crc data in the first 0x400 range. 3.5 Test steps and result Prepare the none header APP1 evkmimxrt1060_APP1_0X60040400.bin, APP2 image evkmimxrt1060_APP2_0X60240400.bin, and put blhost.exe, image_header_padding.exe in the same folder. APP1 and APP2 just the printf version is different, one is version1, another is version2. Printf result: hello world + led blinky demo 1 hello world + led blinky demo 2 Attached OTAtest folder is used for test, but need to download the blhost.exe from the below link, and copy the blhost.exe to the OTAtest folder. https://www.nxp.com/webapp/sps/download/license.jsp?colCode=blhost_2.6.6&appType=file1&DOWNLOAD_ID=null Then run the following bat command: image_header_padding.exe evkmimxrt1060_APP1_0X60040400.bin 0x60040400 sleep 20 blhost.exe -t 50000 -u 0x15a2,0x0073 -j -- get-property 1 0 sleep 20 blhost.exe -u -t 1000000 -- flash-erase-region 0x6003c000 0x4000 sleep 50 blhost -u -t 5000 -- flash-erase-region 0x60040000 0x10000 sleep 50 blhost -u -t 5000 -- write-memory 0x60040000 boot_img_crc32.bin sleep 100 image_header_padding.exe evkmimxrt1060_APP2_0X60240400.bin 0x60040400 sleep 20 blhost -u -t 5000 -- flash-erase-region 0x60240000 0x10000 sleep 50 blhost -u -t 5000 -- flash-erase-region 0x6043b000 0x10000 sleep 50 blhost -u -t 5000 -- write-memory 0x60240000 boot_img_crc32.bin sleep 100 pause The function is to generate the APP1 with the correct ota_header, erase parititon 1,program APP1 to partition 1, generate the APP2 with the correct ota_header, erase partition 2 and scratch area, program APP2 to partition 2, Run the .bat file should in 5 seconds after reset, then use the ISP to connect it: blhost.exe -t 50000 -u 0x15a2,0x0073 -j -- get-property 1 0 ota_bootloader can use the ota_bootloader project download directly to the 0X60000000 area. After downloading, reset the chip, and wait for 5 seconds, the APP will run. This is the test result: Fig 13 From the test result, we can find, in the first time boot, APP1 running, image state: none Input 1, do the swap, will find the APP2 running, image state: undertest Input 3, select permanent, reset will find, still APP2 running, image state: permanent Input 4, choose rollback, reset will find APP1 running, image states: none Until now, finish the swap and rollback function. Input 6, will find the APP contains the SDRAM led blinky is working.          
View full article
This document describes the different source clocks and the main modules that manage which clock source is used to derive the system clocks that exists on the i.MX RT’s devices. It’s important to know the different clock sources available on our devices, modifying the default clock configuration may have different purposes since increasing the processor performance, achieving specific baud rates for serial communications, power saving, or simply getting a known base reference for a clock timer. The hardware used for this document is the following: i.MX RT: EVK-MIMXRT1060 Keep in mind that the described hardware and management clock modules in this document are a general overview of the different platforms and the devices listed above are used as a reference example, some terms and hardware modules functionality may vary between devices of the same platform. For more detailed information about the device hardware modules, please refer to your specific device Reference Manual. RT platforms The Clock Controller Module(CCM) facilitates the clock generation in the RT platforms, many clocking variations are possible and the maximum clock frequency for the i.MX RT1060 device is @600MHz.The following image shows a block diagram of the CCM, the three marked sub-modules are important to understand all the clock path from the clock generation(oscillators or crystals) to the clock management for all the peripherals of the board.    Figure 1. Clock Controller Module(CCM) Block Diagram        CCM Analog Submodule This submodule contains all the oscillators and several PLL’s that provide a clock source to the principal CMM module. For example, the i.MX RT1060 device supports 2 internal oscillators that combined with suitable external quartz crystal and external load capacitors provide an accurate clock source, another 2 internal oscillators are available for low power modes and as a backup when the system detects a loss of clock. These oscillators provide a fixed frequency for the several PLL’s inside this module. Internal Clock Sources with external components  Crystal Oscillator @24MHz Many of the serial IO modules depend on the fixed frequency of 24 MHz. The reference clock that generates this crystal oscillator provides an accurate clock source for all the PLL inputs.  Crystal Oscillator @32KHz Generally, RTC oscillators are either implemented with 32 kHz or 32.768 kHz crystals. This Oscillator should always be active when the chip is powered on. Internal Clock sources RC Oscillator @24MHz A lower-power RC oscillator module is available on-chip as a possible alternative to the 24 MHz crystal oscillator after a successful power-up sequence. The 24 MHz RC oscillator is a self-tuning circuit that will output the programmed frequency value by using the RTC clock as its reference. While the power consumption of this RC oscillator is much lower than the 24MHz crystal oscillator, one limitation of this RC oscillator module is that its clock frequency is not as accurate. Oscillator @32KHz The internal oscillator is automatically multiplexed in the clocking system when the system detects a loss of clock. The internal oscillator will provide clocks to the same on-chip modules as the external 32kHz oscillator. Also is used to be useful for quicker startup times and tampering prevention. Note. An external 32KHz clock source must be used since the internal oscillator is not precise enough for long term timekeeping. PLLs There are 7 PLLs in the i.MXRT1060 platform, some with specific functions, for example, create a reference clock for the ARM Core, USB peripherals, etc. Below these PLLs are listed. PLL1 - ARM PLL (functional frequency @600 MHz) PLL2 - System PLL (functional frequency @528 MHz)* PLL3 - USB1 PLL (functional frequency @480 MHz)* PLL4 - Audio PLL PLL5 - Video PLL PLL6 - ENET PLL PLL7 - USB2 PLL (functional frequency @480 MHz) * Two of these PLLs are each equipped with four Phase Fractional Dividers (PFDs) in order to generate additional frequencies for many clock roots.  Each PLLs configuration and control functions like Bypass, Output Enable/Disable, and Power Down modes are accessible individually through its PFDs and global configuration and status registers found at the CCM internal memory registers.        Clock Control Module(CCM) The Clock Control Module (CCM) generates and controls clocks to the various modules in the design and manages low power modes. This module uses the available clock sources(PLL reference clocks and PFDs) to generate the clock roots. There are two important sub-blocks inside the CCM listed below. Clock Switcher This sub-block provides the registers that control which PLLs and PFDs outputs are selected as the reference clock for the Clock Root Generator.  Clock Root Generator This sub-block provides the registers that control most of the secondary clock source programming, including both the primary clock source selection and the clock dividers. The clock roots are each individual clocks to the core, system buses, and all other SoC peripherals, among those, are serial clocks, baud clocks, and special function blocks. All of these clock references are delivered to the Low Power Clock Gating unit(LPCG).        Low Power Clock Gating unit(LPCG) The LPCG block receives the root clocks from CCM and splits them to clock branches for each peripheral. The clock branches are individually gated clocks. The following image shows a detailed block diagram of the CMM with the previously described submodules and how they link together. Figure 2. Clock Management System Example: Configure The ARM Core Clock (PLL1) to a different frequency. The Clock tools available in MCUXpresso IDE, allows you to understand and configure the clock source for the peripherals in the platform. The following diagram shows the default PLL1 mode configured @600MHz, the yellow path shows all the internal modules involved in the clock configuration.  Figure 3. Default PLL configuration after reset. From the previous image notice that PLL1 is attached from the 24MHz oscillator, then the PLL1 is configured with a pre-scaler of 50 to achieve a frequency @1.2GHz, finally, a frequency divider by 2 let a final frequency @600MHz. 1.1 Modify the PLL1 frequency For example, you can use the Clock tools to configure the PLL pre-scaler to 30, select the PLL1 block and then edit the pre-scaler value, therefore, the final clock frequency is @360MHz, these modifications are shown in the following figure.  Figure 4. PLL1 @720MHz, final frequency @360MHz    1.2 Export clock configuration to the project After you complete the clock configuration, the Clock Tool will update the source code in clock_config.c and clock_config.h, including all the clock functional groups that we created with the tool. This will include the clock source for specific peripherals. In the previous example, we configured the PLL1 (ARM PLL) to a functional frequency @360MHz; this is translated to the following structure in source code: “armPllConfig_BOARD_BootClockRUN” and it’s used by “CLOCK_InitArmPll();” inside the “BOARD_BootClockPLL150MRUN();” function.     Figure 5. PLL1 configuration struct  Figure 6. PLL configuration function Example: The next steps describe how to select a clock source for a specific peripheral. 1.1 Configure clock for specific peripheral For example, using the GPT(General Purpose Timer) the available clock sources are the following: Clock Source Off Peripheral Clock High-Frequency Reference Clock Clock Source from an external pin Low-Frequency Reference Clock Crystal Oscillator Figure 7. General Purpose Timer Clocks Diagram Using the available SDK example project “evkmimxrt1060_gpt_timer” a configuration struct for the peripheral “gptConfig” is called from the main initialization function inside the gpt_timer.c source file, the default configuration function with the configuration struct as a parameter, is shown in the following figure. Figure 8. Function that returns a GPT default configuration parameters The function loads several parameters to the configuration struct(gptConfig), one of the fields is the Clock Source configuration, modifying this field will let us select an appropriate clock source for our application, the following figure shows the default configuration parameters inside the “GPT_GetDefaultConfig();” function.  Figure 9. Configuration struct In the default GPT configuration struct, the Peripheral Clock(kGPT_CLockSource_Periph) is selected, the SDK comes with several macros located at “fsl_gpt.h” header file, that helps to select an appropriate clock source. The next figure shows an enumerated type of data that contains the possible clock sources for the GPT.  Figure 10. Available clock sources of the GPT. For example, to select the Low-Frequency Reference Clock the source code looks like the following figure.  Figure 11. Low-Frequency Reference Clock attached to GPT Notice that all the peripherals come with a specific configuration struct and from that struct fields the default clocking parameters can be modified to fit with our timing requirements. 1.2 Modify the Peripheral Clock frequency from Clock Tools One of the GPT clock sources is the “Peripheral Clock Source” this clock line can be modified from the Clock Tools, the following figure shows the default frequency configuration from Clock Tools view. Figure 12. GPT Clock Root inside CMM In the previous figure, the GPT clock line is @75MHz, notice that this is sourced from the primary peripheral clock line that is @600MHz attached to the ARM core clocks. For example, modify the PERCLK_PODF divider selecting it and changing the divider value to 4, the resulting frequency is @37.5Mhz, the following figure illustrates these changes.  Figure 13. GPT & PIT clock line @37.5MHz 1.3 Export clock configuration to the project After you complete the clock configuration, the Clock Tool will update the source code in clock_config.c and clock_config.h, including all the clock functional groups that we created with the tool. This will include the clock source for specific peripherals. In the previous example, we configured the GPT clock root divider by a dividing factor of 4 to achieve a 37.5MHz frequency; this is translated to the following instruction in source code: “CLOCK_SetDiv(kCLOCK_PerclkDiv,3);” inside the “BOARD_BootClockRUN();” function.                Figure 14. Frequency divider function References i.MX RT1060 Processor Reference Manual Also visit LPC's System Clocks  Kinetis System Clocks
View full article
As we know, the RT series MCUs support the XIP (Execute in place) mode and benefit from saving the number of pins, serial NOR Flash is most commonly used, as the FlexSPI module can high efficient fetch the code and data from the Serial NOR flash for Cortex-M7 to execute. The fetch way is implementing via utilizing the Quad IO Fast Read command, meanwhile, the serail NOR flash works in the SDR (Single Data transfer Rate) mode, it receives data on SCLK rise edge and transmits data on SCLK fall edge. Comparing to the SDR mode, the DDR (Dual Data transfer Rate) mode has a higher throughput capacity, whether it can provide better performance of XIP mode, and how to do that if we want the Serial NOR Flash to work in DDR (Dual Data transfer Rate) mode? SDR & DDR mode SDR mode: In SDR (Single Data transfer Rate) mode, data is only clocked on one edge of the clock (either the rising or falling edge). This means that for SDR to have data being transmitted at X Mbps, the clock bit rate needs to be 2X Mbps. DDR mode: For DDR (Dual Data transfer Rate) mode, also known as DTR (Dual Transfer Rate) mode, data is transferred on both the rising and falling edge of the clock. This means data is transmitted at X Mbps only requires the clock bit rate to be X Mbps, hence doubling the bandwidth (as Fig 1 shows).   Fig 1 Enable DDR mode The below steps illustrate how to make the i.MX RT1060 boot from the QSPI with working in DDR mode. Note: The board is MIMXRT1060, IDE is MCUXpresso IDE Open a hello_world as the template Modify the FDCB(Flash Device Configuration Block) a)Set the controllerMiscOption parameter to supports DDR read command. b) Set Serial Flash frequency to 60 MHz. c)Parase the DDR read command into command sequence. The following table shows a template command sequence of DDR Quad IO FAST READ instruction and it's almost matching with the FRQDTR (Fast Read Quad IO DTR) Sequence of IS25WP064 (as Fig 2 shows).   Fig2 FRQDTR Sequence d)Adjust the dummy cycles. The dummy cycles should match with the specific serial clock frequency and the default dummy cycles of the FRQDTR sequence command is 6 (as the below table shows).   However, when the serial clock frequency is 60MHz, the dummy cycle should change to 4 (as the below table shows).   So it needs to configure [P6:P3] bits of the Read Register (as the below table shows) via adding the SET READ PARAMETERS command sequence(as Fig 3 shows) in FDCB manually. Fig 3 SET READ PARAMETERS command sequence In further, in DDR mode, the SCLK cycle is double the serial root clock cycle. The operand value should be set as 2N, 2N-1 or 2*N+1 depending on how the dummy cycles defined in the device datasheet. In the end, we can get an adjusted FCDB like below. // Set Dummy Cycles #define FLASH_DUMMY_CYCLES 8 // Set Read register command sequence's Index in LUT table #define CMD_LUT_SEQ_IDX_SET_READ_PARAM 7 // Read,Read Status,Write Enable command sequences' Index in LUT table #define CMD_LUT_SEQ_IDX_READ 0 #define CMD_LUT_SEQ_IDX_READSTATUS 1 #define CMD_LUT_SEQ_IDX_WRITEENABLE 3 const flexspi_nor_config_t qspiflash_config = { .memConfig = { .tag = FLEXSPI_CFG_BLK_TAG, .version = FLEXSPI_CFG_BLK_VERSION, .readSampleClksrc=kFlexSPIReadSampleClk_LoopbackFromDqsPad, .csHoldTime = 3u, .csSetupTime = 3u, // Enable DDR mode .controllerMiscOption = kFlexSpiMiscOffset_DdrModeEnable | kFlexSpiMiscOffset_SafeConfigFreqEnable, .sflashPadType = kSerialFlash_4Pads, //.serialClkFreq = kFlexSpiSerialClk_100MHz, .serialClkFreq = kFlexSpiSerialClk_60MHz, .sflashA1Size = 8u * 1024u * 1024u, // Enable Flash register configuration .configCmdEnable = 1u, .configModeType[0] = kDeviceConfigCmdType_Generic, .configCmdSeqs[0] = { .seqNum = 1, .seqId = CMD_LUT_SEQ_IDX_SET_READ_PARAM, .reserved = 0, }, .lookupTable = { // Read LUTs [4*CMD_LUT_SEQ_IDX_READ] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0xED, RADDR_DDR, FLEXSPI_4PAD, 0x18), // The MODE8_DDR subsequence costs 2 cycles that is part of the whole dummy cycles [4*CMD_LUT_SEQ_IDX_READ + 1] = FLEXSPI_LUT_SEQ(MODE8_DDR, FLEXSPI_4PAD, 0x00, DUMMY_DDR, FLEXSPI_4PAD, FLASH_DUMMY_CYCLES-2), [4*CMD_LUT_SEQ_IDX_READ + 2] = FLEXSPI_LUT_SEQ(READ_DDR, FLEXSPI_4PAD, 0x04, STOP, FLEXSPI_1PAD, 0x00), // READ STATUS REGISTER [4*CMD_LUT_SEQ_IDX_READSTATUS] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x05, READ_SDR, FLEXSPI_1PAD, 0x01), [4*CMD_LUT_SEQ_IDX_READSTATUS + 1] = FLEXSPI_LUT_SEQ(STOP, FLEXSPI_1PAD, 0x00, 0, 0, 0), // WRTIE ENABLE [4*CMD_LUT_SEQ_IDX_WRITEENABLE] = FLEXSPI_LUT_SEQ(CMD_SDR,FLEXSPI_1PAD, 0x06, STOP, FLEXSPI_1PAD, 0x00), // Set Read register [4*CMD_LUT_SEQ_IDX_SET_READ_PARAM] = FLEXSPI_LUT_SEQ(CMD_SDR,FLEXSPI_1PAD, 0x63, WRITE_SDR, FLEXSPI_1PAD, 0x01), [4*CMD_LUT_SEQ_IDX_SET_READ_PARAM + 1] = FLEXSPI_LUT_SEQ(STOP,FLEXSPI_1PAD, 0x00, 0, 0, 0), }, }, .pageSize = 256u, .sectorSize = 4u * 1024u, .blockSize = 64u * 1024u, .isUniformBlockSize = false, }; Is DDR mode real better? According to the RT1060's datasheet, the below table illustrates the maximum frequency of FlexSPI operation, as the MIMXRT1060's onboard QSPI flash is IS25WP064AJBLE, it doesn't contain the MQS pin, it means set MCR0.RXCLKsrc=1 (Internal dummy read strobe and loopbacked from DQS) is the most optimized option. operation mode RXCLKsrc=0 RXCLKsrc=1 RXCLKsrc=3 SDR 60 MHz 133 MHz 166 MHz DDR 30 MHz 66 MHz 166 MHz In another word, QSPI can run up to 133 MHz in SDR mode versus 66 MHz in DDR mode. From the perspective of throughput capacity, they're almost the same. It seems like DDR mode is not a better option for IS25WP064AJBLE and the following experiment will validate the assumption. Experiment mbedtls_benchmark I use the mbedtls_benchmark as the first testing demo and I run the demo under the below conditions: 100MH, SDR mode; 133MHz, SDR mode; 66MHz, DDR mode; According to the corresponding printout information (as below shows), I make a table for comparison and I mark the worst performance of implementation items among the above three conditions, just as Fig 4 shows. SDR Mode run at 100 MHz. FlexSPI clock source is 3, FlexSPI Div is 6, PllPfd2Clk is 720000000 mbedTLS version 2.16.6 fsys=600000000 Using following implementations: SHA: DCP HW accelerated AES: DCP HW accelerated AES GCM: Software implementation DES: Software implementation Asymmetric cryptography: Software implementation MD5 : 18139.63 KB/s, 27.10 cycles/byte SHA-1 : 44495.64 KB/s, 12.52 cycles/byte SHA-256 : 47766.54 KB/s, 11.61 cycles/byte SHA-512 : 2190.11 KB/s, 267.88 cycles/byte 3DES : 1263.01 KB/s, 462.49 cycles/byte DES : 2962.18 KB/s, 196.33 cycles/byte AES-CBC-128 : 52883.94 KB/s, 10.45 cycles/byte AES-GCM-128 : 1755.38 KB/s, 329.33 cycles/byte AES-CCM-128 : 2081.99 KB/s, 279.72 cycles/byte CTR_DRBG (NOPR) : 5897.16 KB/s, 98.15 cycles/byte CTR_DRBG (PR) : 4489.58 KB/s, 129.72 cycles/byte HMAC_DRBG SHA-1 (NOPR) : 1297.53 KB/s, 448.03 cycles/byte HMAC_DRBG SHA-1 (PR) : 1205.51 KB/s, 486.04 cycles/byte HMAC_DRBG SHA-256 (NOPR) : 1786.18 KB/s, 327.70 cycles/byte HMAC_DRBG SHA-256 (PR) : 1779.52 KB/s, 328.93 cycles/byte RSA-1024 : 202.33 public/s RSA-1024 : 7.00 private/s DHE-2048 : 0.40 handshake/s DH-2048 : 0.40 handshake/s ECDSA-secp256r1 : 9.00 sign/s ECDSA-secp256r1 : 4.67 verify/s ECDHE-secp256r1 : 5.00 handshake/s ECDH-secp256r1 : 9.33 handshake/s   DDR Mode run at 66 MHz. FlexSPI clock source is 2, FlexSPI Div is 5, PllPfd2Clk is 396000000 mbedTLS version 2.16.6 fsys=600000000 Using following implementations: SHA: DCP HW accelerated AES: DCP HW accelerated AES GCM: Software implementation DES: Software implementation Asymmetric cryptography: Software implementation MD5 : 16047.13 KB/s, 27.12 cycles/byte SHA-1 : 44504.08 KB/s, 12.54 cycles/byte SHA-256 : 47742.88 KB/s, 11.62 cycles/byte SHA-512 : 2187.57 KB/s, 267.18 cycles/byte 3DES : 1262.66 KB/s, 462.59 cycles/byte DES : 2786.81 KB/s, 196.44 cycles/byte AES-CBC-128 : 52807.92 KB/s, 10.47 cycles/byte AES-GCM-128 : 1311.15 KB/s, 446.53 cycles/byte AES-CCM-128 : 2088.84 KB/s, 281.08 cycles/byte CTR_DRBG (NOPR) : 5966.92 KB/s, 97.55 cycles/byte CTR_DRBG (PR) : 4413.15 KB/s, 130.42 cycles/byte HMAC_DRBG SHA-1 (NOPR) : 1291.64 KB/s, 449.47 cycles/byte HMAC_DRBG SHA-1 (PR) : 1202.41 KB/s, 487.05 cycles/byte HMAC_DRBG SHA-256 (NOPR) : 1748.38 KB/s, 328.16 cycles/byte HMAC_DRBG SHA-256 (PR) : 1691.74 KB/s, 329.78 cycles/byte RSA-1024 : 201.67 public/s RSA-1024 : 7.00 private/s DHE-2048 : 0.40 handshake/s DH-2048 : 0.40 handshake/s ECDSA-secp256r1 : 8.67 sign/s ECDSA-secp256r1 : 4.67 verify/s ECDHE-secp256r1 : 4.67 handshake/s ECDH-secp256r1 : 9.00 handshake/s   Fig 4 Performance comparison We can find that most of the implementation items are achieve the worst performance when QSPI works in DDR mode with 66 MHz. Coremark demo The second demo is running the Coremark demo under the above three conditions and the result is illustrated below. SDR Mode run at 100 MHz. FlexSPI clock source is 3, FlexSPI Div is 6, PLL3 PFD0 is 720000000 2K performance run parameters for coremark. CoreMark Size : 666 Total ticks : 391889200 Total time (secs): 16.328717 Iterations/Sec : 2449.671999 Iterations : 40000 Compiler version : MCUXpresso IDE v11.3.1 Compiler flags : Optimization most (-O3) Memory location : STACK seedcrc : 0xe9f5 [0]crclist : 0xe714 [0]crcmatrix : 0x1fd7 [0]crcstate : 0x8e3a [0]crcfinal : 0x25b5 Correct operation validated. See readme.txt for run and reporting rules. CoreMark 1.0 : 2449.671999 / MCUXpresso IDE v11.3.1 Optimization most (-O3) / STACK   SDR Mode run at 133 MHz. FlexSPI clock source is 3, FlexSPI Div is 4, PLL3 PFD0 is 664615368 2K performance run parameters for coremark. CoreMark Size : 666 Total ticks : 391888682 Total time (secs): 16.328695 Iterations/Sec : 2449.675237 Iterations : 40000 Compiler version : MCUXpresso IDE v11.3.1 Compiler flags : Optimization most (-O3) Memory location : STACK seedcrc : 0xe9f5 [0]crclist : 0xe714 [0]crcmatrix : 0x1fd7 [0]crcstate : 0x8e3a [0]crcfinal : 0x25b5 Correct operation validated. See readme.txt for run and reporting rules. CoreMark 1.0 : 2449.675237 / MCUXpresso IDE v11.3.1 Optimization most (-O3) / STACK   DDR Mode run at 66 MHz. FlexSPI clock source is 2, FlexSPI Div is 5, PLL3 PFD0 is 396000000 2K performance run parameters for coremark. CoreMark Size : 666 Total ticks : 391890772 Total time (secs): 16.328782 Iterations/Sec : 2449.662173 Iterations : 40000 Compiler version : MCUXpresso IDE v11.3.1 Compiler flags : Optimization most (-O3) Memory location : STACK seedcrc : 0xe9f5 [0]crclist : 0xe714 [0]crcmatrix : 0x1fd7 [0]crcstate : 0x8e3a [0]crcfinal : 0x25b5 Correct operation validated. See readme.txt for run and reporting rules. CoreMark 1.0 : 2449.662173 / MCUXpresso IDE v11.3.1 Optimization most (-O3) / STACK   After comparing the CoreMark scores, it gets the lowest CoreMark score when QSPI works in DDR mode with 66 MHz. However, they're actually pretty close. Through the above two testings, we can get the DDR mode maybe not a better option, at least for the i.MX RT10xx series MCU.
View full article
Introduction NXP i.MXRT106x has two USB2.0 OTG instance. And the RT1060 EVK has both of the USB interface on the board. But the RT1060 SDK only has single USB host example. Although RT1060’s USB host stack support multiple devices, but we still need a USB HUB when user want to connect two device. This article will show you how to make both USB instance as host. RT1060 SDK has single host examples which support multiple devices, like host_hid_mouse_keyboard_bm. But this application don’t use these examples. Instead, MCUXpresso Config Tools is used to build the demo from beginning. The config tool is a very powerful tool which can configure clock, pin and peripherals, especially the USB. In this application demo, it can save 95% coding work. Hardware and software tools RT1060 EVK MCUXpresso 11.4.0 MIMXRT1060 SDK 2.9.1 Step 1 This project will support USB HID mouse and USB CDC. First, create an empty project named MIMXRT1062_usb_host_dual_port. When select SDK components, select “USB host CDC” and “”USB host HID” in Middleware label. IDE will select other necessary component automatically.     After creating the empty project, clock should be configured first. Both of the USB PHY need 480M clock.   Step 2 Next step is to configure USB host in peripheral config tool. Due to the limitation of config tool, only one host instance of the USB component is allowed. In this project, CDC VCOM is added first.   Step 3 After these settings, click “Update Code” in control bar. This will turn all the configurations into code and merge into project. Then click the “copy to clipboard” button. This will copy the host task call function. Paste it in the forever while loop in the project’s main(). Besides that, it also need to add BOARD_InitBootPeripherals() function call in main(). At this point, USB VCOM is ready. The tool will not only copy the file and configure USB, but also create basic implementation framework. If compile and download the project to RT1060 EVK, it can enumerate a USB CDC VCOM device on USB1. If characters are send from CDC device, the project can send it out to DAPLink UART port so that you can see the character on a terminal interface in computer. Step 4 To get USB HID mouse code, it need to create another USB HID project. The workflow is similar to the first project. Here is the screenshot of the USB HID configuration.   Click “Update code”, the HID mouse code will be generated. The config tool generate two files, usb_host_interface_0_hid_mouse.c and usb_host_interface_0_hid_mouse. Copy them to the “source” folder in dual host project.     Step 5 Next step is to modify some USB macro definitions. <usb_host_config.h> #define USB_HOST_CONFIG_EHCI 2 /*means there are two host instance*/ #define USB_HOST_CONFIG_MAX_HOST 2 /*The USB driver can support two ehci*/ #define USB_HOST_CONFIG_HID (1U) /*for mouse*/ Next step is merge usb_host_app.c. The project initialize USB hardware and software in USB_HostApplicationInit(). usb_status_t USB_HostApplicationInit(void) { usb_status_t status; USB_HostClockInit(kUSB_ControllerEhci0); USB_HostClockInit(kUSB_ControllerEhci1); #if ((defined FSL_FEATURE_SOC_SYSMPU_COUNT) && (FSL_FEATURE_SOC_SYSMPU_COUNT)) SYSMPU_Enable(SYSMPU, 0); #endif /* FSL_FEATURE_SOC_SYSMPU_COUNT */ status = USB_HostInit(kUSB_ControllerEhci0, &g_HostHandle[0], USB_HostEvent); status = USB_HostInit(kUSB_ControllerEhci1, &g_HostHandle[1], USB_HostEvent); /*each usb instance have a g_HostHandle*/ if (status != kStatus_USB_Success) { return status; } else { USB_HostInterface0CicVcomInit(); USB_HostInterface0HidMouseInit(); } USB_HostIsrEnable(); return status; } In USB_HostIsrEnable(), add code to enable USB2 interrupt.   irqNumber = usbHOSTEhciIrq[1]; NVIC_SetPriority((IRQn_Type)irqNumber, USB_HOST_INTERRUPT_PRIORITY); EnableIRQ((IRQn_Type)irqNumber); Then add and modify USB interrupt handler. void USB_OTG1_IRQHandler(void) { USB_HostEhciIsrFunction(g_HostHandle[0]); } void USB_OTG2_IRQHandler(void) { USB_HostEhciIsrFunction(g_HostHandle[1]); } Since both USB instance share the USB stack, When USB event come, all the event will call USB_HostEvent() in usb_host_app.c. HID code should also be merged into this function. static usb_status_t USB_HostEvent(usb_device_handle deviceHandle, usb_host_configuration_handle configurationHandle, uint32_t eventCode) { usb_status_t status1; usb_status_t status2; usb_status_t status = kStatus_USB_Success; /* Used to prevent from multiple processing of one interface; * e.g. when class/subclass/protocol is the same then one interface on a device is processed only by one interface on host */ uint8_t processedInterfaces[USB_HOST_CONFIG_CONFIGURATION_MAX_INTERFACE] = {0}; switch (eventCode & 0x0000FFFFU) { case kUSB_HostEventAttach: status1 = USB_HostInterface0CicVcomEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces); status2 = USB_HostInterface0HidMouseEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces); if ((status1 == kStatus_USB_NotSupported) && (status2 == kStatus_USB_NotSupported)) { status = kStatus_USB_NotSupported; } break; case kUSB_HostEventNotSupported: usb_echo("Device not supported.\r\n"); break; case kUSB_HostEventEnumerationDone: status1 = USB_HostInterface0CicVcomEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces); status2 = USB_HostInterface0HidMouseEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces); if ((status1 != kStatus_USB_Success) && (status2 != kStatus_USB_Success)) { status = kStatus_USB_Error; } break; case kUSB_HostEventDetach: status1 = USB_HostInterface0CicVcomEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces); status2 = USB_HostInterface0HidMouseEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces); if ((status1 != kStatus_USB_Success) && (status2 != kStatus_USB_Success)) { status = kStatus_USB_Error; } break; case kUSB_HostEventEnumerationFail: usb_echo("Enumeration failed\r\n"); break; default: break; } return status; } USB_HostTasks() is used to deal with all the USB messages in the main loop. At last, HID work should also be added in this function. void USB_HostTasks(void) { USB_HostTaskFn(g_HostHandle[0]); USB_HostTaskFn(g_HostHandle[1]); USB_HostInterface0CicVcomTask(); USB_HostInterface0HidMouseTask(); }   After all these steps, the dual USB function is ready. User can insert USB mouse and USB CDC device into any of the two USB port simultaneously. Conclusion All the RT/LPC/Kinetis devices with two OTG or HOST can support dual USB host. With the help of MCUXpresso Config Tool, it is easy to implement this function.
View full article
Overview ======== The LPUART example for FreeRTOS demonstrates the possibility to use the LPUART driver in the RTOS with hardware flow control. The example uses two instances of LPUART IP and sends data between them. The UART signals must be jumpered together on the board. Toolchain supported =================== - MCUXpresso 11.0.0 Hardware requirements ===================== - Mini/micro USB cable - MIMXRT1050-EVKB board - Personal Computer Board settings ============== R278 and R279 must be populated, or have pads shorted. These resistors are under the display opposite side of board from uSD connector. The following pins need to be jumpered together: --------------------------------------------------------------------------------- | | UART3 (UARTA) | UART8 (UARTB) | |---|-------------------------------------|-------------------------------------| | # | Signal | Function | Jumper | Jumper | Function | Signal | |---|---------------|----------|----------|----------|----------|---------------| | 1 | GPIO_AD_B1_07 | RX | J22-pin1 | J23-pin1 | TX | GPIO_AD_B1_10 | | 2 | GPIO_AD_B1_06 | TX | J22-pin2 | J23-pin2 | RX | GPIO_AD_B1_11 | | 3 | GPIO_AD_B1_04 | CTS | J23-pin3 | J24-pin5 | RTS | GPIO_SD_B0_03 | | 4 | GPIO_AD_B1_05 | RTS | J23-pin4 | J24-pin4 | CTS | GPIO_SD_B0_02 | --------------------------------------------------------------------------------- Prepare the Demo ================ 1. Connect a USB cable between the host PC and the OpenSDA USB port on the target board. 2. Open a serial terminal with the following settings: - 115200 baud rate - 8 data bits - No parity - One stop bit - No flow control 3. Download the program to the target board. 4. Either press the reset button on your board or launch the debugger in your IDE to begin running the demo. Running the demo ================ You will see status of the example printed to the console. Customization options =====================
View full article
RT10xx SAI basic and SDCard wave file play 1. Introduction NXP RT10xx's audio modules are SAI, SPDIF, and MQS. The SAI module is a synchronous serial interface for audio data transmission. SPDIF is a stereo transceiver that can receive and send digital audio, MQS is used to convert I2S audio data from SAI3 to PWM, and then can drive external speakers, but in practical usage, it still need to add the amplifier drive circuit. When we use the SAI module, it will be related to the audio file play and the data obtained. This article will be based on the MIMXRT1060-EVK board, give the RT10xx SAI module basic knowledge, PCM waveform format, the audio file cut, and conversion tool, use the MCUXpresso IDE CFG peripheral tool to create the SAI project, play the audio data, it will also provide the SDcard with fatfs system to read the wave file and play it. 2. Basic Knowledge and the tools Before entering the project details and testing, just provide some SAI module knowledge, wave file format information, audio convert tools. 2.1 SAI module basic RT10xx SAI module can support I2S, AC97, TDM, and codec/DSP interface. SAI module contains Transmitter and Receiver, the related signals:     SAI_MCLK: master clock, used to generate the bit clock, master output, slave input.     SAI_TX_BCLK: Transmit bit clock, master output, slave input     SAI_TX_SYNC: Transmit Frame sync, master output, slave input, L/R channel select     SAI_TX_DATA[4]:Transmit data line, 1-3 share with RX_DATA[1-3]     SAI_RX_BCLK: receiver bit clock     SAI_RX_SYNC: receiver frame sync     SAI_RX_DATA[4]: receiver data line SAI module clocks: audio master clock, bus clock, bit clock SAI module Frame sync has 3 modes:      1)Transmit and receive using its own BCLK and SYNC      2)Transmit async, receive sync: use transmit BCLK and SYNC, transmit enable at first, disable at last.      3)Transmit sync, receive async: use receive BCLK and SYNC, receiver enable at first, disable at last. Valid frame sync is also ignored (slave mode) or not generated (master mode) for the first four-bit clock cycles after enabling the transmitter or receiver. Pic 1 SAI module clock structure: Pic 2 SAI module 3 clock sources:  PLL3_PFD3, PLL5, PLL4 In the above picture, SAI1_CLK_ROOT, which can be used as the MCLK, the BCLK is: BCLK= master clock/(TCR2[DIV]+1)*2 Sample rate = Bitclockfreq /(bitwidth*channel) 2.2 waveform audio file format WAVE file is used to save the PCM encode data, WAVE is using the RIFF format, the smallest unit in the RIFF file is the CK struct, CKID is the data type, the value can be: “RIFF”,“LIST”,“fmt”, “data” etc. RIFF file is little-endian. RIFF structure: typedef unsigned long DWORD;//4B typedef unsigned char BYTE;//1B typedef DWORD FOURCC; // 4B typedef struct { FOURCC ckID; //4B DWORD ckSize; //4B union { FOURCC fccType; // RIFF form type 4B BYTE ckData[ckSize]; //ckSize*1B } ckData; } RIFFCK; Pic 3 Take a 16Khz 2 channel wave file as the example: Pic 4 Yellow: CKID  Green: data length   Purple: data The detailed analysis as follows: Pic 5 We can find, the real audio data, except the wave header, the data size is 1279860bytes. 2.3 Audio file convert In practical usage, the audio file may not the required channel and the sample rate configuration, or the format is not the wave, or the time is too long, then we can use some tool to convert it to your desired format. We can use the ffmpeg tool: https://ffmpeg.org/ About the details, check the ffmpeg document, normally we use these command: mp3 file converts to 16k, 16bit, 2 channel wave file: ffmpeg -i test.mp3 -acodec pcm_s16le -ar 16000 -ac 2 test.wav or: ffmpeg -i test.mp3 -aq 16 -ar 16000 -ac 2 test.wav test.wav, cut 35s from 00:00:00, and can convert save to test1.wav: ffmpeg -ss 00:00:00 -i test.wav -t 35.0 -c copy test1.wav Pic 6 Pic 7 2.4 Obtain wave L/R channel audio data Just like the SDK code, save the L/R audio data directly in the RT RAM array, so here, we need to obtain the audio data from the wav file. We can use the python readout the wav header, then get the audio data size, and save the audio data to one array in the .h files. The related Python code can be: import sys import wave def wav2hex(strWav, strHex): with wave.open(strWav, "rb") as fWav: wavChannels = fWav.getnchannels() wavSampleWidth = fWav.getsampwidth() wavFrameRate = fWav.getframerate() wavFrameNum = fWav.getnframes() wavFrames = fWav.readframes(wavFrameNum) wavDuration = wavFrameNum / wavFrameRate wafFramebytes = wavFrameNum * wavChannels * wavSampleWidth print("Channels: {}".format(wavChannels)) print("Sample width: {}bits".format(wavSampleWidth * 8)) print("Sample rate: {}kHz".format(wavFrameRate/1000)) print("Frames number: {}".format(wavFrameNum)) print("Duration: {}s".format(wavDuration)) print("Frames bytes: {}".format(wafFramebytes)) fWav.close() pass with open(strHex, "w") as fHex: # Print WAV parameters fHex.write("/*\n"); fHex.write(" Channels: {}\n".format(wavChannels)) fHex.write(" Sample width: {}bits\n".format(wavSampleWidth * 8)) fHex.write(" Sample rate: {}kHz\n".format(wavFrameRate/1000)) fHex.write(" Frames number: {}\n".format(wavFrameNum)) fHex.write(" Duration: {}s\n".format(wavDuration)) fHex.write(" Frames bytes: {}\n".format(wafFramebytes)) fHex.write("*/\n\n") # Print WAV frames fHex.write("uint8_t music[] = {\n") print("Transferring...") i = 0 while wafFramebytes > 0: if(wafFramebytes < 16): BytesToPrint = wafFramebytes else: BytesToPrint = 16 fHex.write(" ") for j in range(0, BytesToPrint): if j != 0: fHex.write(' ') fHex.write("0x{:0>2x},".format(wavFrames[i])) i+=1 j+=1 fHex.write("\n") wafFramebytes -= BytesToPrint fHex.write("};\n") fHex.close() print("Done!") wav2hex(sys.argv[1], sys.argv[2]) Take the music1.wave as an example: Pic 8 2.4 Audio data relationship with audio wave 16bit data range is: -32768 to 32767, the goldwave related value range is(-1~1).Use goldwave tool to open the example music1.wav, check the data in 1s position, the left channel relative data is -0.08227, right channel relative data is -0.2257. Pic 9                                                                          pic 10 Now, calculate the L/R real data, and find the position in the music1.h. Pic 11 From pic 8, we can know, the real wave R/L data from line 11, each line contains 16 bytes of data. So, from music1.wav related value, we can calculate the related data, and compare it with the real data in the array, we can find, it is totally the same. 3. SAI MCUXpresso project creation Based on SDK_2.9.2_EVK-MIMXRT1060, create one SAI DMA audio play project. The audio data can use the above music1.h. Create one bare-metal project: Drivers check: clock, common, dmamux, edma,gpio,i2c,iomuxc,lpuart,sai,sai_edma,xip_device Utilities check:       Debug_console,lpuart_adapter,serial_manager,serial_manager_uart Board components check:       Xip_board Abstraction Layer check:       Codec, codec_wm8960_adapter,lpi2c_adapter Software Components check:       Codec_i2c,lists,wm8960 After the creation of the project, open the clocks, configure the clock, core, flexSPI can use the default one, we mainly configure the SAI1 related clocks: Pic 12 Select the SAI1 clock source as PLL4, PLL4_MAIN_CLK configure as 786.48MHz. SAI1 clock configure as 6.144375MHz. After the configuration, update the code. Open Pins tool, configure the SAI1 related pins, as the codec also need the I2C, so it contains the I2C pin configuration. Pic 13 Update the code. Open peripherals, configure DMA, SAI, NVIC. Pic 14 Pic 15 DMA配置如下: pic16 After configuration, generate the code. In the above configuration, we have finished the SAI DMA transfer configuration, SAI master mode, 16bits, the sample rate is 16kHz, 2channel, DMA transfer, bit clock is 512Khz, the master clock is 6.1443Mhz. void callback(I2S_Type *base, sai_edma_handle_t *handle, status_t status, void *userData) { if (kStatus_SAI_RxError == status) { } else { finishIndex++; emptyBlock++; /* Judge whether the music array is completely transfered. */ if (MUSIC_LEN / BUFFER_SIZE == finishIndex) { isFinished = true; finishIndex = 0; emptyBlock = BUFFER_NUM; tx_index = 0; cpy_index = 0; } } } int main(void) { sai_transfer_t xfer; /* Init board hardware. */ BOARD_ConfigMPU(); BOARD_InitBootPins(); BOARD_InitBootClocks(); BOARD_InitBootPeripherals(); #ifndef BOARD_INIT_DEBUG_CONSOLE_PERIPHERAL /* Init FSL debug console. */ BOARD_InitDebugConsole(); #endif PRINTF(" SAI wav module test!\n\r"); /* Use default setting to init codec */ if (CODEC_Init(&codecHandle, &boardCodecConfig) != kStatus_Success) { assert(false); } /* delay for codec output stable */ DelayMS(DEMO_CODEC_INIT_DELAY_MS); CODEC_SetVolume(&codecHandle,2U,50); // set 50% volume EnableIRQ(DEMO_SAI_IRQ); SAI_TxEnableInterrupts(DEMO_SAI, kSAI_FIFOErrorInterruptEnable); PRINTF(" MUSIC PLAY Start!\n\r"); while (1) { PRINTF(" MUSIC PLAY Again\n\r"); isFinished = false; while (!isFinished) { if ((emptyBlock > 0U) && (cpy_index < MUSIC_LEN / BUFFER_SIZE)) { /* Fill in the buffers. */ memcpy((uint8_t *)&buffer[BUFFER_SIZE * (cpy_index % BUFFER_NUM)], (uint8_t *)&music[cpy_index * BUFFER_SIZE], sizeof(uint8_t) * BUFFER_SIZE); emptyBlock--; cpy_index++; } if (emptyBlock < BUFFER_NUM) { /* xfer structure */ xfer.data = (uint8_t *)&buffer[BUFFER_SIZE * (tx_index % BUFFER_NUM)]; xfer.dataSize = BUFFER_SIZE; /* Wait for available queue. */ if (kStatus_Success == SAI_TransferSendEDMA(DEMO_SAI, &SAI1_SAI_Tx_eDMA_Handle, &xfer)) { tx_index++; } } } } }   4. SAI test result     To check the real L/R data sendout situation, we modify the music array first 16 bytes data as: 0x55,0xaa,0x01,0x00,0x02,0x00,0x03,0x00,0x04,0x00,0x05,0x00,0x06,0x00,0x07,0x00 Then test SAI_MCLK,SAI_TX_BCLK,SAI_TX_SYNC,SAI_TXD pin wave, and compare with the defined data, because the polarity is configured as active low, it is falling edge output, sample at rising edge. The test point on the MIMXRT1060-EVK board is using the codec pin position: Pic 17 4.1 Logic Analyzer tool wave Pic 18 MCLK clock frequency is 6.144375Mhz, BCLK is 512KHz, SYNC is 16KHz. Pic 19 The first frame data is:1010101001010101 0000000000000001 0XAA55  0X0001 It is the same as the array defined L/R data. SYNC low is Left 16 bit, High is right 16 bit. 4.2 Oscilloscope test wave Just like the logic analyzer, the oscilloscope wave is the same: Pic 20 Add the music.h to the project, and let the main code play the music array data in loop, we will hear the music clear when insert the headphone to on board J12 or add a speaker. 5. SAI SDcard wave music play This part will add the sd card, fatfs system, to read out the 16bit 16K 2ch wave file in the sd card, and play it in loop. 5.1 driver add     Code is based on SDK_2.9.2_EVK-MIMXRT1060, just on the previous project, add the sdcard, sd fatfs driver, now the bare-metal driver situation is: Drivers check: cache, clock, common, dmamux, edma,gpio,i2c,iomuxc,lpuart,sai,sai_edma,sdhc, xip_device Utilities check:       Debug_console,lpuart_adapter,serial_manager,serial_manager_uart Middleware check:       File System->FAT File System->fatfs+sd, Memories Board components check:       Xip_board Abstraction Layer check:       Codec, codec_wm8960_adapter,lpi2c_adapter Software Components check:       Codec_i2c,lists,wm8960 5.2 WAVE header analyzer with code    From previous content, we can know the wav header structure, we need to play the wave file from the sd card, then we need to analyze the wave header to get the audio format, audio data-related information. The header analysis code is: uint8_t Fun_Wave_Header_Analyzer(void) { char * datap; uint8_t ErrFlag = 0; datap = strstr((char*)Wav_HDBuffer,"RIFF"); if(datap != NULL) { wav_header.chunk_size = ((uint32_t)*(Wav_HDBuffer+4)) + (((uint32_t)*(Wav_HDBuffer + 5)) << + (((uint32_t)*(Wav_HDBuffer + 6)) << 16) +(((uint32_t)*(Wav_HDBuffer + 7)) << 24); movecnt += 8; } else { ErrFlag = 1; return ErrFlag; } datap = strstr((char*)(Wav_HDBuffer+movecnt),"WAVEfmt"); if(datap != NULL) { movecnt += 8; wav_header.fmtchunk_size = ((uint32_t)*(Wav_HDBuffer+movecnt+0)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 1)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 2)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 3)) << 24); wav_header.audio_format = ((uint16_t)*(Wav_HDBuffer+movecnt+4) + (uint16_t)*(Wav_HDBuffer+movecnt+5)); wav_header.num_channels = ((uint16_t)*(Wav_HDBuffer+movecnt+6) + (uint16_t)*(Wav_HDBuffer+movecnt+7)); wav_header.sample_rate = ((uint32_t)*(Wav_HDBuffer+movecnt+8)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 9)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 10)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 11)) << 24); wav_header.byte_rate = ((uint32_t)*(Wav_HDBuffer+movecnt+12)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 13)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 14)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 15)) << 24); wav_header.block_align = ((uint16_t)*(Wav_HDBuffer+movecnt+16) + (uint16_t)*(Wav_HDBuffer+movecnt+17)); wav_header.bps = ((uint16_t)*(Wav_HDBuffer+movecnt+18) + (uint16_t)*(Wav_HDBuffer+movecnt+19)); movecnt +=(4+wav_header.fmtchunk_size); } else { ErrFlag = 1; return ErrFlag; } datap = strstr((char*)(Wav_HDBuffer+movecnt),"LIST"); if(datap != NULL) { movecnt += 4; wav_header.list_size = ((uint32_t)*(Wav_HDBuffer+movecnt+0)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 1)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 2)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 3)) << 24); movecnt +=(4+wav_header.list_size); } //LIST not Must datap = strstr((char*)(Wav_HDBuffer+movecnt),"data"); if(datap != NULL) { movecnt += 4; wav_header.datachunk_size = ((uint32_t)*(Wav_HDBuffer+movecnt+0)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 1)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 2)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 3)) << 24); movecnt += 4; ErrFlag = 0; } else { ErrFlag = 1; return ErrFlag; } PRINTF("Wave audio format is %d\r\n",wav_header.audio_format); PRINTF("Wave audio channel number is %d\r\n",wav_header.num_channels); PRINTF("Wave audio sample rate is %d\r\n",wav_header.sample_rate); PRINTF("Wave audio byte rate is %d\r\n",wav_header.byte_rate); PRINTF("Wave audio block align is %d\r\n",wav_header.block_align); PRINTF("Wave audio bit per sample is %d\r\n",wav_header.bps); PRINTF("Wave audio data size is %d\r\n",wav_header.datachunk_size); return ErrFlag; } Mainly divide RIFF to 4 parts: “RIFF”,“fmt”,“LIST”,“data”. The 4 bytes data follows the “data” is the whole audio data size, it can be used to the fatfs to read the audio data. The above code also recodes the data position, then when using the fatfs read the wave, we can jump to the data area directly. 5.3 SD card wave data play     Define the array audioBuff[4* 512], used to read out the sd card wave file, and use these data send to the SAI EDMA and transfer it to the I2S interface until all the data is transmitted to the I2S interface.     Callback record each 512 bytes data send out finished, and judge the transmit data size is reached the whole wave audio data size. 5.4 sd card wave play result    Prepare one wave file, 16bit 16k sample rate, 2 channel file, named as music.wav, put in the sd card which already does the fat32 format, insert it to the MIMXRT1060-EVK J39, run the code, will get the printf information: Please insert a card into the board. Card inserted. Make file system......The time may be long if the card capacity is big. SAI wav module test! MUSIC PLAY Start! Wave audio format is 1 Wave audio channel number is 2 Wave audio sample rate is 16000 Wave audio byte rate is 64000 Wave audio block align is 4 Wave audio bit per sample is 16 Wave audio data size is 2728440 Playback is begin! Playback is finished! At the same time, after inserting the headphone or the speaker into the J12, we can hear the music. Attachment is the mcuxpresso10.3.0 and the wave samples.  
View full article
Source code: https://github.com/JayHeng/NXP-MCUBootUtility 【v1.3.0】 Features: > 1. Can generate .sb file by actions in efuse operation utility window >    支持生成仅含自定义efuse烧写操作(在efuse operation windows里指定)的.sb格式文件 Improvements: > 1. HAB signed mode should not appliable for FlexSPI/SEMC NOR device Non-XIP boot with RT1020/1015 ROM >    HAB签名模式在i.MXRT1020/1015下应不支持从FlexSPI NOR/SEMC NOR启动设备中Non-XIP启动 > 2. HAB encrypted mode should not appliable for FlexSPI/SEMC NOR device boot with RT1020/1015 ROM >    HAB加密模式在i.MXRT1020/1015下应不支持从FlexSPI NOR/SEMC NOR启动设备中启动 > 3. Multiple .sb files(all, flash, efuse) should be generated if there is efuse operation in all-in-one action >    当All-In-One操作中包含efuse烧写操作时,会生成3个.sb文件(全部操作、仅flash操作、仅efuse操作) > 4. Can generate .sb file without board connection when boot device type is NOR >    当启动设备是NOR型Flash时,可以不用连接板子直接生成.sb文件 > 5. Automatic image readback can be disabled to save operation time >    一键操作下的自动程序回读可以被禁掉,用以节省操作时间 > 6. The text of language option in menu bar should be static and easy understanding >    菜单栏里的语言选项标签应该是静态且易于理解的(中英双语同时显示) Bugfixes: > 1. Cannot generate bootable image when original image (hex/bin) size is larger than 64KB >    当输入的源image文件格式为hex或者bin且其大小超过64KB时,生成可启动程序会失败 > 2. Cannot download large image file (eg 6.8MB) in some case >    当输入的源image文件非常大时(比如6.8MB),下载可能会超时失败 > 3. There is language switch issue with some dynamic labels >    当切换显示语言时,有一些控件标签(如Connect按钮)不能实时更新 > 4. Some led demos of RT1050 EVKB board are invalid >    /apps目录下RT1050 EVKB板子的一些LED demo是无效的 【v1.4.0】 Features: > 1. Support for loading bootable image into uSDHC SD/eMMC boot device >    支持下载Bootable image进主动启动设备 - uSDHC接口SD/eMMC卡 > 2. Provide friendly way to view and set mixed eFuse fields >    支持更直观友好的方式去查看/设置某些混合功能的eFuse区域 Improvements: > 1. Set default FlexSPI NOR device to align with NXP EVK boards >    默认FlexSPI NOR device应与恩智浦官方EVK板卡相匹配 > 2. Enable real-time gauge for Flash Programmer actions >    为通用Flash编程器里的操作添加实时进度条显示
View full article
RT106L_S voice control system based on the Baidu cloud 1 Introduction     The NXP RT106L and RT106S are voice recognition chip which is used for offline local voice control, SLN-LOCAL-IOT is based on RT106L, SLN-LOCAL2-IOT is a new local speech recognition board based on RT106S. The board includes the murata 1DX wifi/BLE module, the AFE voice analog front end, the ASR recognition system, the external flash, 2 microphones, and the analog voice amplifier and speakers. The voice recognition process for SLN-LOCAL-IOT and SLN-LOCAL2-IOT is different and the new SLN-LOCAL2-IOT is recommended.     This article is based on the voice control board SLN-LOCAL/2-IOT to implement the following block diagram functions: Pic 1 Use the PC-side speed model tool (Cyberon DSMT) to generate WW(wake word) and VC(voice command) Command related voice engine binary files , which will be used by the demo code. This system is mainly used for the Chinese word recognition, when the user says Chinese word: "小恩小恩", it wakes up SLN-LOCAL/2-IOT, and the board gives feedback "小恩来了,请吩咐". Then system enter the voice recognition stage, the user can say the voice recognition command: “开红灯”,“关红灯”,“开绿灯”,“关绿灯”,“灯闪烁”,“开远程灯”,“关远程灯”, after recognition, the board gives feedback "好的". Among them, “开红灯”,“关红灯”,“开绿灯”,“关绿灯”,“灯闪烁”,the five commands are used for the local light switch, while the 开远程灯”,“关远程灯“two commands can through network communication Baidu cloud control the additional MIMXRT1060-EVK development board light switch. SLN-LOCAL/2-IOT through the WIFI module access to the Internet with MQTT protocol to achieve communication with Baidu cloud, when dectect the remote control command, publish the json packets to Baidu cloud, while MIMRT1060-EVK subscribe Baidu cloud data, will receive data from the IOT board and analyze the EVK board led control. PC side can use MQTT.fx software to subscribe the Baidu cloud data, it also can send data to the device to achieve remote control function directly.  Now, will give the detail content about how to use the SLN-LOCAL/2-IOT SDK demo realize the customized Chinese wake command and voice command, and remote control the MIMXRT1060-EVK through the Baidu Cloud.     2 Platform establish 2.1 Used platform SLN-LOCAL-IOT/SLN-LOCAL2-IOT MIMXRT1060-EVK MQTT.fx SDK_2_8_0_SLN-LOCAL2-IOT MCUXPresso IDE Segger JLINK Baidu Smart Cloud: Baidu cloud control+ TTS Audacity:audio file format convert tool WAVToCode:wav convert to the c array code, which used for the demo tilte play MCUBootUtility: used to burn the feedback audio file to the filesystem Cyberon DSMT: wake word and voice detect command generation tool DSMT is the very important tool to realize the wake word and voice dection, the apply follow is: Pic 2 2.2 Baidu Smart cloud 2.2.1 Baidu cloud IOT control system Enter the IoT Hub: https://cloud.baidu.com/product/iot.html     Click used now. 2.2.1.1 Create device project Create a project, select the device type, and enter the project name. Device types can use shadows as images of devices in the cloud to see directly how data is changing. Once created, an endpoint is generated, along with the corresponding address: Pic 3 2.2.1.2 Create Thing model The Thing model is mainly to establish various properties needed in the shadow, such as temperature, humidity, other variables, and the type of value given, in fact, it is also the json item in the actual MQTT communication.    Click the newly created device-type project where you can create a new thing model or shadow: Pic 4    Here create 3 attributes:LEDstatus,humid,temp It is used to represent the led status, humidity, temperature and so on, which is convenient for communication and control between the cloud and RT board. Once created, you get the following picture:   Pic 5   2.2.1.3 Create Thing shadow In the device-type project, you can select the shadow, build your own shadow platform, enter the name, and select the object model as the newly created Thing model containing three properties, after the create, we can get the details of the shadow:   Pic 6 At the same time will also generate the shadow-related address, names and keys, my test platform situation is as follows: TCP Address: tcp://rndrjc9.mqtt.iot.gz.baidubce.com:1883 SSL Address: ssl://rndrjc9.mqtt.iot.gz.baidubce.com:1884 WSS Address: wss://rndrjc9.mqtt.iot.gz.baidubce.com:443 name: rndrjc9/RT1060BTCDShadow key: y92ewvgjz23nzhgn Port 1883, does not support transmission data encryption Port 1884, supports SSL/TLS encrypted transmission Port 8884, which supports wesockets-style connections, also contains SSL encryption. This article uses a 1883 port with no transmission data encryption for easy testing. So far, Baidu cloud device-type cloud shadow has been completed, the following can use MQTTfx tools to connect and test. In practice, it is recommended that customers build their own Baidu cloud connection, the above user key is for reference only.   2.2.2 Online TTS    SLN-LOCAL/2-IOT board recognizes wake-up words, recognition words, or when powering on, you need to add corresponding demo audio, such as: "百度云端语音测试demo ", "小恩来啦!请吩咐“,"好的". These words need to do a text-to-wav audio file synthesis, here is Baidu Smart Cloud's online TTS function, the specific operation can refer to the following documents: https://ai.baidu.com/ai-doc/SPEECH/jk38y8gno   Once the base audio library is opened, use the main.py provided in the link above and modify it to add the Chinese field you want to convert to the file "TEXT" and add the audio file to be converted in "save_file" such as xxx .wav, using the command: python main.py to complete the conversion, and generate the audio format corresponding to the text, such as .mp3, .wav. Pic 7   After getting the wav file, it can’t be used directly, we need to note that for SLN-LOCAL/2-IOT board, you need to identify the audio source of the 48K sample rate with 16bit, so we need to use the Audioacity Audio tool to convert the audio file format to 48K16bit wav. Import 16K16bit wav files generated by Baidu TTS into the Audioacity tool, select project rate of 48Khz, file->export->export as WAV, select encoding as signed 16bit PCM, and regenerate 48Khz16bit wav for use. Pic 8 “百度云端语音测试demo“:Used for power-on broadcasting, demo name broadcasting, it is stored in RT demo code, so you need to convert it to a 16bit C code array and add it to the project. "小恩来啦!请吩咐",“好的“:voice detect feedback, it is saved in the filesystem ZH01,ZH02 area. 2.3 playback audio data prepare and burn   There are two playback audio file, it is "小恩来啦!请吩咐",“好的“,it is saved in the filesystem ZH01,ZH02 area. Filesystem memory map like this: Pic 9 So, we need to convert the 48K16bit wav file to the filesystem needed format, we need to use the official tool::Ivaldi_sln_local2_iot Reference document:SLN-LOCAL2-IOT-DG chapter 10.1 Generating filesystem-compatible files Use bash input the commands like the following picture: Pic10 Use the convert command to get the playback bin file: python file_format.py -if xiaoencoming_48k16bit.wav -of xiaoencoming_48k16bit.bin -ft H At last, it will generate the file: "小恩来啦!请吩咐"->xiaoencoming_48k16bit.bin,burn to flash address 0x6184_0000 “好的”->OK_48k16bit.bin, burn to flash address 0x6180_0000 Then, use MCUBootUtility tool burn the above two file to the related images. Here, take OK_48k16bit.bin as an example, demo enter the serial download mode(J27-0), power off and power on. Flash chip need to select hyper flash IS26KSXXS, use the boot device memory windows, write button to burn the .bin file to the related address, length is 0X40000 Pic11 Pic12 xiaoencoming_48k16bit.bin can use the same method to download to 0x6184_0000,Length is 0X40000.   2.4 Demo audio prepare and add The prepared baiduclouddemo_48K16bit.wav(“百度云端语音测试demo “) need to convert to the 16bit C array code, and put to the project code, calls by the code, this is used for the demo mode play. The convert need to use the WAVToCode, the operation like this: Pic 13 The generated baiducloulddemo_48K16bit.c,add it to the demo project C files: sln_local_iot_local_demo->audio->demos->smart_home.c。 2.5 WW and VC prepare Wake-up word are generated through the cyberon DSMT tool, which supports a wide range of language, customers can request the tool through Figure 2. The Chinese wake-up words and voice command words in this article are also generated through DSMT. DSMT can have multiple groups, group1 as a wake-up word configuration, CmdMapID s 1. Other groups act as voice command words, such as CMD-IOT in this article, cmdMapID=2. Pic 14   Pic 15 Wake word continuously detects the input audio stream, uses group1, and if successfully wakes up, will do the voice command detection uses group2, or other identifying groups as well as custom groups. The wake-up words using the DSMT tool, the configuration are as follows: Pic 16 The WW can support more words, customer can add the needed one in the group 1. Use the DSMT configure VC like this: Pic 17 Then, save the file, code used file are: _witMapID.bin, CMD_IOT.xml,WW.xml. In the generated files, CYBase.mod is the base model, WW.mod is the WW model, CMD_IOT.mod is the VC model. After Pic 16,17, it finishes the WW and VC command prepare, we can put the DSMT project to the RT106S demo project folder: sln_local2_iot_local_demo\local_voice\oob_demo_zh 3 Code prepare Based on the official SLN-LOCAL2-IOT SDK local_demo, the code in this article modifies the Chinese wake-up words and recognition words (or you can build a new customer custom group directly), add local voice detect the led status operations, Then feedback Chinese audio, demo Chinese audio, Wifi network communication MQTT protocol code, and Baidu cloud shadow connection publish. Source reference code SDK path: SDK_2_8_0_SLN-LOCAL2-IOT\boards\sln_local2_iot\sln_voice_examples\local_demo   SDK_2_8_0_SLN-LOCAL2-IOT\boards\sln_local2_iot\sln_boot_apps SLN-LOCAL2-IOT and SLN-LOCAL-IOT code are nearly the same, the only difference is that the ASR library file is different, for RT106S (SLN-LOCAL2-IOT) using SDK it’s own libsln_asr.a library, for RT106L (SLN-LOCAL-IOT) need to use the corresponding libsln_asr_eval.a library.    Importing code requires three projects: local_demo, bootloader, bootstrap. The three projects store in different spaces. See SLN-LOCAL2-IOT-DG .pdf, chapter 3.3 Device memory map    This is the 3 chip project boot process: Pic 18 This document is for demo testing and requires debug, so this article turns off the encryption mechanism, configures bootloader, bootstrap engineering macro definition: DISABLE_IMAGE_VERIFICATION = 1, and uses JLINK to connect SLN-LOCAL/2-IOT's SWD interface to burn code. The following is to add modification code for app local_demo projects. 3.1 sln-local/2-iot code Sln-local-iot, sln-local2-iot platform, the following modification are the same for the two platform. 3.1.1 Voice recognition related code 1)Demo audio play Play content:“百度云端语音测试demo“ sln_local2_iot_local_demo_xe_ledwifi\audio\demos\ smart_home.c content is replaced by the previously generated baiducloulddemo_48K16bit.C. audio_samples.h,modify: #define SMART_HOME_DEMO_CLIP_SIZE 110733 This code is used for the main.c announce_demo API play:         case ASR_CMD_IOT:             ret = demo_play_clip((uint8_t *)smart_home_demo_clip, sizeof(smart_home_demo_clip));   2)command print information #define NUMBER_OF_IOT_CMDS      7 IndexCommands.h static char *cmd_iot_en[] = {"Red led on", "Red led off", "Green led on", "Green led off",                              "cycle led",        "remote led on",         "remote led off"}; static char *cmd_iot_zh[] = {"开红灯", "关红灯", "开绿灯", "关绿灯", "灯闪烁", "开远程灯", "关远程灯"}; Here is the source code modification using IOT, you can actually add your own speech recognition group directly, and add the relevant command identification.   3)sln_local_voice.c Line757 , add led-related notification information in ASR_CMD_IOT mode. oob_demo_control.ledCmd = g_asrControl.result.keywordID[1];     The code is used to obtain the recognized VC command data, and the value of keywordID[1] represents the number. This number can let the code know which detail voice is detected. so that you can do specific things in the app based on the value of ledcmd. The value of keywordID[1] corresponds to Command List in Figure 17. For example, “开远程灯“, if woke up, and recognized "开远程灯", then keywordID[1] is 5, and will transfer to oob_demo_control.ledCmd, which will be used in the appTask API to realize the detail control. 4) main.c void appTask(void *arg) Under case kCommandGeneric: if the language is Chinese, then add the recognition related control code, at first, it will play the feedback as “好的”. Then, it will check the voice detect value, give the related local led control. else if (oob_demo_control.language == ASR_CHINESE) { // play audio "OK" in Chinese #if defined(SLN_LOCAL2_RD) ret = audio_play_clip((uint8_t *)AUDIO_ZH_01_FILE_ADDR, AUDIO_ZH_01_FILE_SIZE); #elif defined(SLN_LOCAL2_IOT) ret = audio_play_clip(AUDIO_ZH_01_FILE); #endif //kerry add operation code==================================================begin RGB_LED_SetColor(LED_COLOR_OFF); if (oob_demo_control.ledCmd == LED_RED_ON) { RGB_LED_SetColor(LED_COLOR_RED); vTaskDelay(5000); } else if (oob_demo_control.ledCmd == LED_RED_OFF) { RGB_LED_SetColor(LED_COLOR_OFF); vTaskDelay(5000); } else if (oob_demo_control.ledCmd == LED_BLUE_ON) { RGB_LED_SetColor(LED_COLOR_BLUE); vTaskDelay(5000); } else if (oob_demo_control.ledCmd == LED_BLUE_OFF) { RGB_LED_SetColor(LED_COLOR_OFF); vTaskDelay(5000); } else if (oob_demo_control.ledCmd == CYCLE_SLOW) { for (int i = 0; i < 3; i++) { RGB_LED_SetColor(LED_COLOR_RED); vTaskDelay(400); RGB_LED_SetColor(LED_COLOR_OFF); RGB_LED_SetColor(LED_COLOR_GREEN); vTaskDelay(400); RGB_LED_SetColor(LED_COLOR_OFF); RGB_LED_SetColor(LED_COLOR_BLUE); vTaskDelay(400); } } … } In addition to local voice recognition control, this article also add remote control functions, mainly through wifi connection, use the mqtt protocol to connect Baidu cloud server, when local speech recognition get the remote control command, it publish the corresponding control message to Baidu cloud, and then the cloud send the message to the client which subscribe this message,  after the client get the message, it will refer to the message content do the related control.   3.1.3 Network connection code 1)sln_local2_iot_local_demo_xe_ledwifi\lwip\src\apps\mqtt     Add mqtt.c 2)sln_local2_iot_local_demo_xe_ledwifi\lwip\src\include\lwip\apps Add mqtt.h, mqtt_opts.h,mqtt_prv.h The related mqtt driver is from the RT1060 sdk, which already added in the attachment project. 3)sln_tcp_server.c   Add MQTT application layer API function code, client ID, server host, MQTT server port number, user name, password, subscription topic, publishing topic and data, etc., more details, check the attachment code.    The MQTT application code is ported from the mqtt project of the RT1060 SDK and added to the sln_tcp_server.c. TCP_OTA_Server function is used to initialize the wifi network, realize wifi connection, connect to the network, resolve Baidu cloud server URL to get IP, and then connect Baidu cloud server through mqtt, after the successful connection, publish the message at first, so that after power-up through mqttfx to see whether the power on network publishing message is successful. TCP_OTA_Server function code is as follows: static void TCP_OTA_Server(void *param) //kerry consider add mqtt related code { err_t err = ERR_OK; uint8_t status = kCommon_Failed; #if USE_WIFI_CONNECTION /* Start the WiFi and connect to the network */ APP_NETWORK_Init(); while (status != kCommon_Success) { status_t statusConnect; statusConnect = APP_NETWORK_Wifi_Connect(true, true); if (WIFI_CONNECT_SUCCESS == statusConnect) { status = kCommon_Success; } else if (WIFI_CONNECT_NO_CRED == statusConnect) { APP_NETWORK_Uninit(); /* If there are no credential in flash delete the TPC server task */ vTaskDelete(NULL); } else { status = kCommon_Failed; } } #endif #if USE_ETHERNET_CONNECTION APP_NETWORK_Init(true); #endif /* Wait for wifi/eth to connect */ while (0 == get_connect_state()) { /* Give time to the network task to connect */ vTaskDelay(1000); } configPRINTF(("TCP server start\r\n")); configPRINTF(("MQTT connection start\r\n")); mqtt_client = mqtt_client_new(); if (mqtt_client == NULL) { configPRINTF(("mqtt_client_new() failed.\r\n");) while (1) { } } if (ipaddr_aton(EXAMPLE_MQTT_SERVER_HOST, &mqtt_addr) && IP_IS_V4(&mqtt_addr)) { /* Already an IP address */ err = ERR_OK; } else { /* Resolve MQTT broker's host name to an IP address */ configPRINTF(("Resolving \"%s\"...\r\n", EXAMPLE_MQTT_SERVER_HOST)); err = netconn_gethostbyname(EXAMPLE_MQTT_SERVER_HOST, &mqtt_addr); configPRINTF(("Resolving status: %d.\r\n", err)); } if (err == ERR_OK) { configPRINTF(("connect to mqtt\r\n")); /* Start connecting to MQTT broker from tcpip_thread */ err = tcpip_callback(connect_to_mqtt, NULL); configPRINTF(("connect status: %d.\r\n", err)); if (err != ERR_OK) { configPRINTF(("Failed to invoke broker connection on the tcpip_thread: %d.\r\n", err)); } } else { configPRINTF(("Failed to obtain IP address: %d.\r\n", err)); } int i=0; /* Publish some messages */ for (i = 0; i < 5;) { configPRINTF(("connect status enter: %d.\r\n", connected)); if (connected) { err = tcpip_callback(publish_message_start, NULL); if (err != ERR_OK) { configPRINTF(("Failed to invoke publishing of a message on the tcpip_thread: %d.\r\n", err)); } i++; } sys_msleep(1000U); } vTaskDelete(NULL); } Please note the following published json data, it can’t be publish directly in the code. {   "reported": {     "LEDstatus": false,     "humid": 88,     "temp": 22   } } Which need to use this web https://www.bejson.com/ realize the json data compression and convert: {\"reported\" : {     \"LEDstatus\" : true,     \"humid\" : 88,     \"temp\" : 11    } }   4)main appTask Under case kCommandGeneric: , if the language is Chinese, then add the corresponding voice recognition control code. "开远程灯": turn on the local yellow light, publish the “remote led on” mqtt message to Baidu cloud, control remote 1060EVK board lights on. "关远程灯": turn on the local white light, publish the “remote led off” mqtt message to Baidu cloud, control the remote 1060EVK board light off. Related operation code: else if (oob_demo_control.ledCmd == LED_REMOTE_ON) { RGB_LED_SetColor(LED_COLOR_YELLOW); vTaskDelay(5000); err_t err = ERR_OK; err = tcpip_callback(publish_message_on, NULL); if (err != ERR_OK) { configPRINTF(("Failed to invoke publishing of a message on the tcpip_thread: %d.\r\n", err)); } } else if (oob_demo_control.ledCmd == LED_REMOTE_OFF) { RGB_LED_SetColor(LED_COLOR_WHITE); vTaskDelay(5000); err_t err = ERR_OK; err = tcpip_callback(publish_message_off, NULL); if (err != ERR_OK) { configPRINTF(("Failed to invoke publishing of a message on the tcpip_thread: %d.\r\n", err)); } } 3.2 MIMXRT1060-EVK code The main function of the MIMXRT1060-EVK code is to configure another client in the cloud, subscribe to the message published by SLN-LOCAL/2-IOT which detect the remote command, and then the LED on the control board is used to test the voice recognition remote control function, this code is based on Ethernet, through the Ethernet port on the board, to achieve network communication, and then use mqtt to connect baidu cloud, and subscribe the message from local2, This enables the reception and execution of the Local2 command. the network code part is similar to SLN-LOCAL2-IOT board network code, the servers, cloud account passwords, etc. are all the same, the main function is to subscribe messages. See the code from attachment RT1060, lwip_mqtt_freertos.c file. When receives data published by the server, it needs to do a data analysis to get the status of the led light and then control it. Normal data from Baidu cloud shadow sent as follows Received 253 bytes from the topic "$baidu/iot/shadow/RT1060BTCDShadow/update/accepted": "{"requestId":"2fc0ca29-63c0-4200-843f-e279e0f019d3","reported":{"LEDstatus":false,"humid":44,"temp":33},"desired":{},"lastUpdatedTime":{"reported":{"LEDstatus":1635240225296,"humid":1635240225296,"temp":1635240225296},"desired":{}},"profileVersion":159}" Then you need to parse the data of LEDstatus from the received data, whether it is false or true. Because the amount of data is small, there is no json-driven parsing here, just pure data parsing, adding the following parsing code to the mqtt_incoming_data_cb function: mqtt_rec_data.mqttindex = mqtt_rec_data.mqttindex + len; if(mqtt_rec_data.mqttindex >= 250) { PRINTF("kerry test \r\n"); PRINTF("idex= %d", mqtt_rec_data.mqttindex); datap = strstr((char*)mqtt_rec_data.mqttrecdata,"LEDstatus"); if(datap != NULL) { if(!strncmp(datap+11,strtrue,4))//char strtrue[]="true"; { GPIO_PinWrite(GPIO1, 3, 1U); //pull high PRINTF("\r\ntrue"); } else if(!strncmp(datap+11,strfalse,5))//char strfalse[]="false"; { GPIO_PinWrite(GPIO1, 3, 0U); //pull low PRINTF("\r\nfalse"); } } mqtt_rec_data.mqttindex =0; It use the strstr search the “LEDstatus“ in the received data, and get the pointer position, then add the fixed length to get the LED status is true or flash. If it is true, turn on the led, if it is false, turn off the led. 4 Test Result    This section gives the test results and video of the system. Before testing the voice function, first use MQTTfx to test baidu cloud connection, release, subscription is no problem, and then test sln-local2-iot combined with mimxrt1060-evk voice wake-up recognition and remote control functions.    For SLN-LOCAL2-IOT wifi hotspot join, enter the command in the print terminal: setup AWS kerry123456   4.1 MQTT.fx test baidu cloud connection MQTT.fx is an EclipsePaho-based MQTT client tool written in the Java language that supports subscription and publishing of messages through Topic.    4.1.1 MQTT fx configuration     Download and install the tool, then open it, at first, need to do the configuration, click edit connection: Pic19 Profile name:connect name Profile type: MQTT broker Broker address: It is the baidu could generated broker address, with 1883 no encryption transfer. Broker port:1883 No encryption Client ID: RT1060BTCDShadow, here need to note, this name should be the same as the could shadow name, otherwise, on the baidu webpage, the connection is not be detected. If this Client ID name is the same as the shadow name, then when the MQTT fx connect, the online side also can see the connection is OK. User credentials: add the thing User name and password from the baidu cloud. After the configuration, click connect, and refresh the website. Before conection: Pic 20 After connection: Pic 21 4.1.2 MQTT fx subscribe When it comes to subscription publishing, what is the topic of publishing subscriptions?  Here you can open your thing shadow, select the interaction, and see that the page has given the corresponding topic situation: Pic 22 Subscribe topic is: $baidu/iot/shadow/RT1060BTCDShadow/update/accepted  Publish topic is: $baidu/iot/shadow/RT1060BTCDShadow/update Pic 23 Click subscribe, we can see it already can used to receive the data.   4.1.3 MQTT fx publish Publish need to input the topic: $baidu/iot/shadow/RT1060BTCDShadow/update It also need to input the content, it will use the json content data. Pic 24 Here, we can use this json data: {   "reported" : {     "LEDstatus" : true,     "humid" : 88,     "temp" : 11    } } The json data also can use the website to check the data: https://www.bejson.com/jsonviewernew/ Pic 25 Input the publish data, and click pubish button: Pic 26 4.1.4 Publish data test result   Before publish, clean the website thing data: Pic 27 MQTT fx publish data, then check the subscribe data and the website situation: Pic 28 We can see, the published data also can be see in the website and the mqttfx subscribe area. Until now, the connection, data transfer test is OK.   4.2 Voice recognition and remote control test This is the device connection picture: Pic 29 4.2.1 voice recognition local control Pic 30 This is the SLN-LOCAL2-IOT print information after recognize the voice WW and VC. Red led on: led cycle: 4.2.2 voice recognition remote control   Following test, wakeup + remote on, wakeup+remote off, and also give the print result and the video. Pic 31 remote control:  
View full article
Introduction NXP i.MX RT1xxx series provide the High Assurance Boot (HAB) feature which makes the hardware to have a mechanism to ensure that the software can be trusted, as the HAB feature enables the ROM to authenticate the program image by using digital signatures, which can assure the application image's integrity, authenticated and undeniable. So the OEM can utilize it to make their product reject any system image which is not authorized to run. However, what's the trust chain of HAB for implementing the purpose? How the key and certificate generate In the installation directory of MCUXpresso Secure Provisioning:  ~\nxp\MCUX_Provi_v3.1\bin\tools_scripts\keys , there are scripts for generating keys: hab4_pki_tree.sh and hab4_pki_tree.bat (both are applicable to Linux and Windows systems respectively), running any of the above scripts will generate 13 pairs of public and private keys in sequence through OpenSSL, which constitute the below tree structure. Fig1 Key Tree structure The public key and private key generated by OpenSSL are paired one by one, saving the private key and publishing the corresponding public key to the outside world can easily implement asymmetric encryption applications. But how to ensure that the obtained public key is correct and has not been tampered with? At this time, the intervention of authoritative departments is required. Just like everyone can print their resume and say who they are, but if they have the seal of the Public Security Bureau, only the household registration book can prove you are you. This issued by the authority is called a certificate. What's in the certificate? Of course, it should contain a public key, which is the most important; there is also the owner of the certificate, just like the household registration book with your name and ID number, indicating that the book is yours; in addition, there is the issuer of the certificate and the validity period of the ID card is a bit like the issuer institution on the ID card, and how many years of the validity period. If someone fakes a certificate issued by an authority, it's like having fake ID cards and fake household registration books. To generate a certificate, you need to initiate a certificate request, and then send the request to an authority for certification, which is called a CA(Certificate Authority). After sending this request to the authority, the authority will give the certificate a signature. Another question arises, how can the signature be guaranteed to be signed by a genuine authority? Of course, it can only be signed with something that is only in the hands of the authority, which is the CA's private key. The signature algorithm probably works like this: a Hash calculation is performed on the target information to obtain a Hash value. And this process is irreversible, that is to say, the original information content cannot be obtained through the Hash value. When the information is sent out, the hash value is encrypted and sent together with the information as a signature. The process is as follows. Fig2 Signature and verification process Looking at the content of the certificate (as shown below), we will find that there is an Issuer, that is, who issued the certificate; The subject is to who the certificate is issued; Validity is the certificate period; Public-key is the content of the public key, and related signature algorithm. You will find that in order to verify the certificate, the public key of the CA is required. Then a new question arises. How can we be sure that the public key of the CA is correct? This requires a superior CA to sign the CA's public key, and then form the CA's certificate. If you want to know whether a CA's certificate is reliable, you need to see if the public key of the CA's superior certificate can unlock the CA's signature. Just likes if you don’t trust the District Public Security Bureau, you can call the Municipal Public Security Bureau and ask the Municipal Public Security Bureau to confirm its legitimacy of the District Public Security Bureau. This goes up layer by layer until the root CA makes the final endorsement. Through this layer-by-layer credit endorsement method, the normal operation of the asymmetric encryption mode is guaranteed. How does the Root CA prove itself? At this time, Root CA will issue another certificate (as shown below), called the Self-Signed Certificate, which is to sign itself with its own private key, giving people a feeling of "I am me, whether you believe it or not", Therefore, its format content is slightly different from the above CA certificate. Its Issuer and Subject are the same, and its own public key can be used for authentication. So the certificate authentication process will also end here. In this way, in addition to generating the public key and private key through running the script, the OpenSSL will also generate the certificate chain shown below.  Fig3 certificates Boot flow of the HAB mode Figure 4 shows the boot flow of the HAB mode. And steps 1, 2, and 3 are essentially the signature verification process. Fig4 Boot flow of the HAB mode The verification process (as shown in Figure 2) can be used to detect data integrity, identity authentication, and non-repudiation when the public key is trusted, so hab4_pki_tree.sh and hab4_pki_tree.bat scripts can ensure the generated public key and private key pair and the certificate are trusted, it's the "perfectly closed loop". However, the Application image in Figure 4 is plaintext, and the confidentiality of the data is not implemented, so the encrypted boot is always a combination of the HAB boot and the encrypted boot is an advanced usage of an authenticated boot. Reference AN4581: i.MX Secure Boot on HABv4 Supported Devices AN12681: How to use HAB secure boot in i.MX RT10xx  
View full article
1 Introduction    With the quick development of science and technology, the Internet of Things(IoT) is widely used in various areas, such as industry, agriculture, environment, transportation, logistics, security, and other infrastructure. IoT usage makes our lives more colorful and intelligent. The explosive development of the IoT cannot be separated from the cloud platform. At present, there are many types of cloud services on the market, such as Amazon's AWS, Microsoft's Azure, google cloud, China's Alibaba Cloud, Baidu Cloud, OneNet, etc.    Amazon AWS Cloud is a professional cloud computing service that is provided by Amazon. It provides a complete set of infrastructure and cloud solutions for customers in various countries and regions around the world. It is currently a cloud computing with a large number of users. AWS IoT is a managed cloud platform that allows connected devices to easily and securely interact with cloud applications and other devices.    NXP crossover MCU RT product has launched a series of AWS sample codes. This article mainly explains the remote_control_wifi_nxp code in the official MIMXRT1060-EVK SDK as an example to realize the data interaction with AWS IoT cloud, Android mobile APP, and MQTTfx client. The cloud topology of this article is as follows: Fig.1-1 2 AWS cloud operation 2.1 Create an AWS account Prepare a credit card, and then go to the below amazon link to create an AWS account:    https://console.aws.amazon.com/console/home   2.2 Create a Thing    Open the AWS IOT link: https://console.aws.amazon.com/iot    Choose the Things item under manage, if it is the first time usage, customer can choose “register a thing” to create the thing. If it is used in the previous time, customers can click the “create” button in the right corner to create the thing. Choose “create a single thing” to create the new thing, more details check the following picture. Fig. 2-1 Fig.2-2 Fig.2-3 2.3 Create certificate    Create a certificate for the newly created thing, click the “create certificate” button under the following picture: Fig.2-4    After the certificate is built, it will have the information about the certificate created, it means the certificate is generated and can be used. Fig. 2-5 Please note, download files: certificate for this thing, public key, private key. It will be used in the mqttfx tool configuration. Click “A root CA for AWS for Download”, download the root CA for AWS IoT, the mqttfx tool setting will also use it. Open the root CA download link, can download the CA certificate. RSA 2048 bit key: VeriSign Class 3 Public Primary G5 root CA certificate Fig. 2-6 At last, we can get these files: 7abfd7a350-certificate.pem.crt 7abfd7a350-private.pem.key 7abfd7a350-public.pem.key AmazonRootCA1.pem Save it, it will be used later. Click “active” button to active the certificate, and click “Done” button. The policy will be attached later.   2.4 Create Policies     Back to the iot view page: https://console.aws.amazon.com/iot/     Select the policies under Secure item, to create the new policies.  Fig. 2-7 Input the policy name, in the action area, fill: iot:*, Resource ARN area fill: * Check Allow item, click the create button to finish the new policy creation. Fig. 2-8 2.5 Things attach relationship     After the thing, certificate, policies creation, then will attach the policy to the certificate, and attach the certificate to the Things. Fig. 2-9 Choose the certificates under Secure item, in the related certificate item, choose “…”, you will find the down list, click “attach policy”, and choose the newly created policy. Then click attach thing, choose the newly created thing. Fig. 2-10 Fig. 2-11 Fig. 2-12 Now, open the Things under Mange item, check the detail things related information. Fig.2-13 Double click the thing, in the Interact item, we can find the Rest API Endpoint, the RT code and the mqttfx tool will use this endpoint to realize the cloud connection. Fig. 2-14 Check the security, you will find the previously created certificate, it means this thing already attach the new certificates: Fig. 2-15 Until now, we already finish the Things related configuration, then it will be used for the MQTT fx, Android app, RT EVK board connections, and testing, we also can check the communication information through the AWS shadow in the webpage directly.       3 Android related configuration 3.1 AWS cognito configuration    If use the Android app to communicate with the AWS IoT clould, the AWS side still needs to use the cognito service to authorize the AWS IoT, then access the device shadows. Create a new identity pools at first from the following link: https://console.aws.amazon.com/cognitohttps://console.aws.amazon.com/cognito Fig. 3-1 Click “manage Identity pools”, after enter it, then click “create new identity pool” Fig. 3-2 Fig. 3-3 Fig. 3-4 Here, it will generate two Roles: Cognito_PoolNameAuth_Role Cognito_PoolNameUnauth_Role Click Allow, to finish the identity pool creation. Fig. 3-5 Please record the related Identity pool ID, it will be used in the Android app .properties configuration files. 3.2 Create plicies in IAM for cognito   Open https://console.aws.amazon.com/iam   Click the “policies” item under “access management” Fig. 3-6 Choose “create policy”, create a IAM policies, in the Policy JSON area, write the following content: Fig. 3-7 { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "iot:Connect" ], "Resource": [ "*" ] }, { "Effect": "Allow", "Action": [ "iot:Publish" ], "Resource": [ "arn:aws:iot:us-east-1:965396684474:topic/$aws/things/RTAWSThing/shadow/update", "arn:aws:iot:us-east-1:965396684474:topic/$aws/things/RTAWSThing/shadow/get" ] }, { "Effect": "Allow", "Action": [ "iot:Subscribe", "iot:Receive" ], "Resource": [ "*" ] } ] }‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍ Please note, in the JSON content: "arn:aws:iot:<REGION>:<ACCOUNT ID>:topic/$aws/things/<THING NAME>/shadow/update", "arn:aws:iot:<REGION>:<ACCOUNT ID>:topic/$aws/things/<THING NAME>/shadow/get" Region:the us-east-1 inFig. 3-5 ACCOUNT ID, it can be found in the upper right corner my account side. Fig 3-8 Fig 3-9 After finished the IAM policy creation, then back to IAM policies page, choose Filter policies as customer managed, we can find the new created customer’s policy. Fig. 3-10 3.3 Attach policy for the cognito role in IAM   In IAM, choose roles item: Fig. 3-11 Double click the cognito_PoolNameUnauth_Role which is generated when creating the pool in cognito, click attach policies, select the new created policy. Fig. 3-12 Fig. 3-13 Until now, we already finish the AWS cognito configuration.   3.4  Android properties file configuration Create a file with .properties, the content is:     customer_specific_endpoint=<REST API ENDPOINT>     cognito_pool_id=<COGNITO POOL ID>     thing_name=<THING NAME>     region=<REGION> Please fill the correct content: REST API ENDPOINT:Fig 2-14 COGNITO POOL ID:fig 3-5 THING NAME:fig 2-14,upper left corner REGION:Fig 3-5, the region data in COGNITO POOL ID Take an example, my properties file content is:  customer_specific_endpoint=a215vehc5uw107-ats.iot.us-east-1.amazonaws.com  cognito_pool_id=us-east-1:c5ca6d11-f069-416c-81f9-fc1ec8fd8de5  thing_name=RTAWSThing  region=us-east-1 In the real usage, please use your own configured data, otherwise, it will connect to my cloud endpoint. 4. MQTTfx configuration and testing MQTT.fx is an MQTT client tool which is based on EclipsePaho and written in Java language. It supports subscribe and publish of messages through Topic. You can download this tool from the following link:   http://mqttfx.jensd.de/index.php/download    The new version is:1.7.1.   4.1 MQTT.fx configuration     Choose connect configuration button, then enter the connection configuration page: Fig. 4-1 Profile Name: Enter the configuration name Broker Address: it is REST API ENDPOINT。 Broker Port:8883 Client ID: generate it freely CA file: it is the downloaded CA certificate file Client Certificate File: related certificate file Client key File: private key file Check PEM formatted。 Click apply and OK to finish the configuration. 4.2 Use the AWS cloud to test connection   In order to test whether it can be connected to the event cloud, a preliminary connection test can be performed. Open the aws page: https://console.aws.amazon.com/iot here is a Test button under this interface, which can be tested by other clients or by itself.Both AWS cloud and MQTTfx subscribe topic: $aws/things/RTAWSThing/shadow/update MQTTfx publishes data to the topic: $aws/things/RTAWSThing/shadow/update It can be found that both the cloud test port and the MQTTfx subscribe can receive data: Fig. 4-2 Below, the Publish data is tested by the cloud, and then you can see that both the MQTTFX subscribe and the cloud subscribe can receive data: Fig. 4-3 Until now, the AWS cloud can transfer the data between the AWS iot cloud and the client. 5 RT1060 and wifi module configuration   We mainly use the RT1060 SDK2.8.0 remote_control_wifi_nxp as the RT test code: SDK_2.8.0_EVK-MIMXRT1060\boards\evkmimxrt1060\aws_examples\remote_control_wifi_nxp Test platform is:MIMXRT1060-EVK Panasonic PAN9026 SDIO ADAPTER + SD to uSD adapter The project is using Panasonic PAN9026 SDIO ADAPTER in default. 5.1 WIFI and the AWS code configuration    The project need the working WIFI SSID and the password, so prepare a working WIFI for it. Then add the SSID and the password in the aws_clientcredential.h #define clientcredentialWIFI_SSID       "Paste WiFi SSID here." #define clientcredentialWIFI_PASSWORD   "Paste WiFi password here." The connection for AWS also in file: aws_clientcredential.h #define clientcredentialMQTT_BROKER_ENDPOINT "a215vehc5uw107-ats.iot.us-east-1.amazonaws.com" #define clientcredentialIOT_THING_NAME       "RTAWSThing" #define clientcredentialMQTT_BROKER_PORT      8883   5.2 certificate and the key configuration Open the SDK following link: SDK_2.8.0_EVK-MIMXRT1060\rtos\freertos\tools\certificate_configuration\CertificateConfigurator.html Fig. 5-1 Generate the new aws_clientcredential_keys.h, and replace the old one. Take the MCUXPresso IDE project as an example, the file location is: Fig. 5-2 Build the project and download it to the MIMXRT1060-EVK board. 6 Test result Androd mobile phone download and install the APK under this folder: SDK_2.8.0_EVK-MIMXRT1060\boards\evkmimxrt1060\aws_examples\remote_control_android\AwsRemoteControl.apk SDK can be downloaded from this link: Welcome | MCUXpresso SDK Builder  Then, we can use the Android app to remote control the RT EVK on board LED, the test result is 6.1 APP and EVK test result MIMXRT1060-EVK printf information: Fig. 6-1 Turn on and turn off the led:   Fig. 6-2                                        Fig. 6-3 6.2 MQTTfx subscribe result MQTTfx subscribe data Turn on the led, we can subscribe two messages: Fig. 6-4 Fig. 6-5   Turn off the led, we also can subscribe two messages: Fig. 6-6 Fig. 6-7 In the two message, the first one is used to set the led status. The second one is the EVK used to report the EVK led information. MQTTfx also can use the publish page, publish this data: {"state":{"desired":{"LEDstate":1}}} or {"state":{"desired":{"LEDstate":0}}} To topic: $aws/things/RTAWSThing/shadow/update It also can realize the on board LED turn on or off. 6.3 AWS cloud shadows display result Turn on the led: Fig. 6-8 Turn off the led: Fig. 6-9 In conclusion, after the above configuration and testing, it can finish the Android mobile phone to remote control the RT EVK on board LED and get the information. Also can use the MQTTFX client tool and the AWS shadow page to check the communication data.
View full article
In the tutorial, I'd like to show the steps of deploying an image classification model on i.MX RT1060 to enabling you to classify fashion images and categories. In the first part of this tutorial, we will review the Fashion MNIST dataset, including how to download it to your system. From there we’ll define a simple CNN network using the TensorFlow platform. Next, we’ll train our CNN model on the Fashion MNIST dataset, train it, and review the results. Finally, we'll optimize the model, after that, the model will be smaller and increase inferencing speed, which is valuable for source-limited devices such as MCU. Let’s go ahead and get started! Fashion MNIST dataset The Fashion MNIST dataset was created by the e-commerce company, Zalando. Fig 1 Fashion MNIST dataset As they note on their official GitHub repo for the Fashion MNIST dataset, there are a few problems with the standard MNIST digit recognition dataset: It’s far too easy for standard machine learning algorithms to obtain 97%+ accuracy. It’s even easier for deep learning models to achieve 99%+ accuracy. The dataset is overused. MNIST cannot represent modern computer vision tasks. Zalando, therefore, created the Fashion MNIST dataset as a drop-in replacement for MNIST. 60,000 training examples 10,000 testing examples 10 classes: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, Ankle boot 28×28 grayscale images The code below loads the Fashion-MNIST dataset using the TensorFlow and creates a plot of the first 25 images in the training dataset. import tensorflow as tf import numpy as np # For easy reset of notebook state. tf.keras.backend.clear_session() # load dataset fashion_mnist = tf.keras.datasets.fashion_mnist (train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data() lass_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'] plt.figure(figsize=(8,8)) for i in range(25): plt.subplot(5,5,i+1,) plt.tight_layout() plt.imshow(train_images[i]) plt.xlabel(lass_names[train_labels[i]]) plt.xticks([]) plt.yticks([]) plt.grid(False) plt.show() Fig 2 Running the code loads the Fashion-MNIST train and test dataset and prints their shape. Fig 3 We can see that there are 60,000 examples in the training dataset and 10,000 in the test dataset and that images are indeed square with 28×28 pixels. Creating model We need to define a neural network model for the image classify purpose, and the model should have two main parts: the feature extraction and the classifier that makes a prediction. Defining a simple Convolutional Neural Network (CNN) For the convolutional front-end, we build 3 layers of convolution layer with a small filter size (3,3) and a modest number of filters followed by a max-pooling layer. The last filter map is flattened to provide features to the classifier. As we know, it's a multi-class classification task, so we will require an output layer with 10 nodes in order to predict the probability distribution of an image belonging to each of the 10 classes. In this case, we will require the use of a softmax activation function. And between the feature extractor and the output layer, we can add a dense layer to interpret the features. All layers will use the ReLU activation function and the He weight initialization scheme, both best practices. We will use the Adam optimizer to optimize the sparse_categorical_crossentropy loss function, suitable for multi-class classification, and we will monitor the classification accuracy metric, which is appropriate given we have the same number of examples in each of the 10 classes. The below code will define and run it will show the struct of the model. # Define a Model model = tf.keras.models.Sequential() # First Convolution ,Kernel:16*3*3 model.add( tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_uniform',input_shape=(28, 28, 1))) model.add( tf.keras.layers.MaxPooling2D((2, 2))) # Second Convolution ,Kernel:32*3*3 model.add( tf.keras.layers.Conv2D(32, (3, 3), activation='relu',kernel_initializer='he_uniform')) model.add( tf.keras.layers.MaxPooling2D((2, 2))) # Third Convolution ,Kernel:32*3*3 model.add( tf.keras.layers.Conv2D(32, (3, 3), activation='relu',kernel_initializer='he_uniform')) model.add( tf.keras.layers.Flatten()) model.add( tf.keras.layers.Dense(32, activation='relu',kernel_initializer='he_uniform')) model.add( tf.keras.layers.Dense(10, activation='softmax')) Fig 4 Training Model After the model is defined, we need to train it. The model will be trained using 5-fold cross-validation. The value of k=5 was chosen to provide a baseline for both repeated evaluation and to not be too large as to require a long running time. Each validation set will be 20% of the training dataset or about 12,000 examples. The training dataset is shuffled prior to being split and the sample shuffling is performed each time so that any model we train will have the same train and validation datasets in each fold, providing an apples-to-apples comparison. We will train the baseline model for a modest 20 training epochs with a default batch size of 32 examples. The validation set for each fold will be used to validate the model during each epoch of the training run, so we can later create learning curves, and at the end of the run, we use the test dataset to estimate the performance of the model. As such, we will keep track of the resulting history from each run, as well as the classification accuracy of the fold. The train_model() function below implements these behaviors, taking the training dataset and test dataset as arguments, and returning a list of accuracy scores and training histories that can be later summarized. from sklearn.model_selection import KFold # train a model using k-fold cross-validation def train_model(dataX, dataY, n_folds=5): scores, histories = list(), list() # prepare cross validation kfold = KFold(n_folds, shuffle=True, random_state=1) for train_ix, validate_ix in kfold.split(dataX): # select rows for train and test trainX, trainY, validate_X, validate_Y = dataX[train_ix], dataY[train_ix], dataX[validate_ix], dataY[validate_ix] # fit model history = model.fit(trainX, trainY, epochs=20, batch_size=32, validation_data=(validate_X, validate_Y), verbose=0) # evaluate model _, acc = model.evaluate(validate_X, validate_Y, verbose=0) print("Accurary: {:.4f},Total number of figures is {:0>2d}".format(acc * 100.0, len(testY))) # append scores scores.append(acc) histories.append(history) return scores, histories Module Summary After the model has been trained, we can present the results. There are two key aspects to present: the diagnostics of the learning behavior of the model during training and the estimation of the model performance. These can be implemented using separate functions. First, the diagnostics involve creating a line plot showing model performance on the train and validate set during each fold of the k-fold cross-validation. These plots are valuable for getting an idea of whether a model is overfitting, underfitting, or has a good fit for the dataset. We will create a single figure with two subplots, one for loss and one for accuracy. Blue lines will indicate model performance on the training dataset and orange lines will indicate performance on the hold-out validate dataset. The summarize_diagnostics() function below creates and shows this plot given the collected training histories. # plot diagnostic learning curves def summarize_diagnostics(histories): for i in range(len(histories)): # plot loss plt.subplot(2,1,1) plt.title('Cross Entropy Loss') plt.plot(histories[i].history['loss'], color='blue', label='train') plt.plot(histories[i].history['val_loss'], color='orange', label='test') # plot accuracy plt.subplot(2,1,2) plt.title('Classification Accuracy') plt.plot(histories[i].history['accuracy'], color='blue', label='train') plt.plot(histories[i].history['val_accuracy'], color='orange', label='test') plt.show() Fig 5 Next, the classification accuracy scores collected during each fold can be summarized by calculating the mean and standard deviation. This provides an estimate of the average expected performance of the model trained on the test dataset, with an estimate of the average variance in the mean. We will also summarize the distribution of scores by creating and showing a box and whisker plot. The summarize_performance() function below implements this for a given list of scores collected during model training. # summarize model performance def summarize_performance(scores): # print summary print('Accuracy: mean={:.4f} std={:.4f}, n={:0>2d}'.format(np.mean(trained_scores)*100, np.std(trained_scores)*100, len(scores))) # box and whisker plots of results plt.boxplot(scores) plt.show()   Fig 6 Verifying predictions According to the above figure, we see that the final trained model can get up to around 87.6% accuracy when predicting the test dataset. And with the trained model, running the below code will demonstrate the result of predictions about some images. def plot_image(i, predictions_array, true_label, img): true_label, img = true_label[i], img[i] plt.grid(False) plt.xticks([]) plt.yticks([]) plt.imshow(img.reshape(28, 28), cmap=plt.cm.binary) predicted_label = np.argmax(predictions_array) if predicted_label == true_label: color = 'blue' else: color = 'red' plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label], 100*np.max(predictions_array), class_names[true_label]), color=color) def plot_value_array(i, predictions_array, true_label): true_label = true_label[i] plt.grid(False) plt.xticks(range(10)) plt.yticks([]) thisplot = plt.bar(range(10), predictions_array, color="#777777") plt.ylim([0, 1]) predicted_label = np.argmax(predictions_array) thisplot[predicted_label].set_color('red') thisplot[true_label].set_color('blue') predictions = model.predict(test_images) # Plot the first X test images, their predicted labels, and the true labels. # Color correct predictions in blue and incorrect predictions in red. num_rows = 5 num_cols = 3 num_images = num_rows*num_cols plt.figure(figsize=(2*2*num_cols, 2*num_rows)) for i in range(num_images): plt.subplot(num_rows, 2*num_cols, 2*i+1) plot_image(i, predictions[i], test_labels, test_images) plt.subplot(num_rows, 2*num_cols, 2*i+2) plot_value_array(i, predictions[i], test_labels) plt.tight_layout() plt.show()   Fig 7 Model quantization Post-training quantization is a conversion technique that can reduce model size while also improving CPU and hardware accelerator latency, with little degradation in model accuracy, especially it's crucial to embedded platforms, as it lacks the compute-intensive performance, the Flash and RAM memory is also very limited. TensorFlow Lite is able to be used to convert an already-trained float TensorFlow model to the TensorFlow Lite format. In addition, the TensorFlow Lite provides several approaches to optimize the mode, among these ways, Integer quantization is an optimization strategy that converts 32-bit floating-point numbers (such as weights and activation outputs) to the nearest 8-bit fixed-point numbers. This results in a smaller model and increased inferencing speed, which is very valuable for low-power devices such as microcontrollers. The below codes show how to implement the Integer quantization of the trained model, and after running these codes, we can find that the size of Tensorflow Lite mode reduces almost 64.9 KB versus the original model, becomes about 32% of the original size(Fig 8). import os # Convert using integer-only quantization def representative_data_gen(): for input_value in tf.data.Dataset.from_tensor_slices(tf.cast(train_images,tf.float32)).shuffle(500).batch(1).take(150): yield [input_value] # Convert using dynamic range quantization converter = tf.lite.TFLiteConverter.from_keras_model(model) converter.optimizations = [tf.lite.Optimize.DEFAULT] tflite_model_quant = converter.convert() # Save the model to disk open("model_dynamic_range_quantization.tflite", "wb").write(tflite_model_quant) ## Size difference Dynamic_range_quantization_model_size = os.path.getsize("model_dynamic_range_quantization.tflite") print("Dynamic range quantization model is %d bytes" % Dynamic_range_quantization_model_size) converter = tf.lite.TFLiteConverter.from_keras_model(model) converter.optimizations = [tf.lite.Optimize.DEFAULT] converter.representative_dataset = representative_data_gen # Ensure that if any ops can't be quantized, the converter throws an error converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8] # Set the input and output tensors to uint8 (APIs added in r2.3) converter.inference_input_type = tf.uint8 converter.inference_output_type = tf.uint8 tflite_model_advanced_quant = converter.convert() # Save the model to disk open("model_integer_only_quantization.tflite", "wb").write(tflite_model_advanced_quant) Integer_only_quantization_model_size = os.path.getsize("model_integer_only_quantization.tflite") print("Integer_only_quantization_model is %d bytes" % Integer_only_quantization_model_size) difference = Dynamic_range_quantization_model_size - Integer_only_quantization_model_size print("Difference is %d bytes" % difference) Fig 8 Evaluating the TensorFlow Lite model Now we'll run inferences using the TensorFlow Lite Interpreter to compare the model accuracies. First, we need a function that runs inference with a given model and images, and then returns the predictions: # Helper function to run inference on a TFLite model def run_tflite_model(tflite_file, test_image_indices): # Initialize the interpreter interpreter = tf.lite.Interpreter(model_path=str(tflite_file)) interpreter.allocate_tensors() input_details = interpreter.get_input_details()[0] output_details = interpreter.get_output_details()[0] predictions = np.zeros((len(test_image_indices),), dtype=int) for i, test_image_index in enumerate(test_image_indices): test_image = test_images[test_image_index] test_label = test_labels[test_image_index] # Check if the input type is quantized, then rescale input data to uint8 if input_details['dtype'] == np.uint8: input_scale, input_zero_point = input_details["quantization"] test_image = test_image / input_scale + input_zero_point test_image = np.expand_dims(test_image, axis=0).astype(input_details["dtype"]) interpreter.set_tensor(input_details["index"], test_image) interpreter.invoke() output = interpreter.get_tensor(output_details["index"])[0] predictions[i] = output.argmax() return predictions Next, we'll compare the performance of the original model and the quantized model on one image. model_basic_quantization.tflite is the original TensorFlow Lite model with floating-point data. model_integer_only_quantization.tflite is the last model we converted using integer-only quantization (it uses uint8 data for input and output). Let's create another function to print our predictions and run it for testing. import matplotlib.pylab as plt # Change this to test a different image test_image_index = 1 ## Helper function to test the models on one image def test_model(tflite_file, test_image_index, model_type): global test_labels predictions = run_tflite_model(tflite_file, [test_image_index]) plt.imshow(test_images[test_image_index].reshape(28,28)) template = model_type + " Model \n True:{true}, Predicted:{predict}" _ = plt.title(template.format(true= str(test_labels[test_image_index]), predict=str(predictions[0]))) plt.grid(False) Fig 9 Fig 10 Then evaluate the quantized model by using all the test images we loaded at the beginning of this tutorial. After summarizing the prediction result of the test dataset, we can see that the prediction accuracy of the quantized model decrease 7% less than the original model, it's not bad. # Helper function to evaluate a TFLite model on all images def evaluate_model(tflite_file, model_type): test_image_indices = range(test_images.shape[0]) predictions = run_tflite_model(tflite_file, test_image_indices) accuracy = (np.sum(test_labels== predictions) * 100) / len(test_images) print('%s model accuracy is %.4f%% (Number of test samples=%d)' % ( model_type, accuracy, len(test_images))) Deploying model Converting TensorFlow Lite model to C file The following code runs xxd on the quantized model, writes the output to a file called model_quantized.cc, in the file, the model is defined as an array of bytes, and prints it to the screen. The output is very long, so we won’t reproduce it all here, but here’s a snippet that includes just the beginning and end. # Save the file as a C source file xxd -i model_integer_only_quantization.tflite > model_quantized.cc # Print the source file cat model_quantized.cc Fig 11 Deploying the C file to project We use the tensorflow_lite_cifar10 demo as a prototype, then replace the original model and do some code modification, below is the code in the modified main file. #include "board.h" #include "fsl_debug_console.h" #include "pin_mux.h" #include "timer.h" #include <iomanip> #include <iostream> #include <string> #include <vector> #include "tensorflow/lite/kernels/register.h" #include "tensorflow/lite/model.h" #include "tensorflow/lite/optional_debug_tools.h" #include "tensorflow/lite/string_util.h" #include "get_top_n.h" #include "model.h" #define LOG(x) std::cout // ---------------------------- Application ----------------------------- // Lenet Mnist model input data size (bytes). #define LENET_MNIST_INPUT_SIZE 28*28*sizeof(char) // Lenet Mnist model number of output classes. #define LENET_MNIST_OUTPUT_CLASS 10 // Allocate buffer for input data. This buffer contains the input image // pre-processed and serialized as text to include here. uint8_t imageData[LENET_MNIST_INPUT_SIZE] = { #include "clothes_select.inc" }; /* Tresholds */ #define DETECTION_TRESHOLD 60 /*! * @brief Initialize parameters for inference * * @param reference to flat buffer * @param reference to interpreter * @param pointer to storing input tensor address * @param verbose mode flag. Set true for verbose mode */ void InferenceInit(std::unique_ptr<tflite::FlatBufferModel> &model, std::unique_ptr<tflite::Interpreter> &interpreter, TfLiteTensor** input_tensor, bool isVerbose) { model = tflite::FlatBufferModel::BuildFromBuffer(Fashion_MNIST_model, Fashion_MNIST_model_len); if (!model) { LOG(FATAL) << "Failed to load model\r\n"; return; } tflite::ops::builtin::BuiltinOpResolver resolver; tflite::InterpreterBuilder(*model, resolver)(&interpreter); if (!interpreter) { LOG(FATAL) << "Failed to construct interpreter\r\n"; return; } int input = interpreter->inputs()[0]; const std::vector<int> inputs = interpreter->inputs(); const std::vector<int> outputs = interpreter->outputs(); if (interpreter->AllocateTensors() != kTfLiteOk) { LOG(FATAL) << "Failed to allocate tensors!"; return; } /* Get input dimension from the input tensor metadata assuming one input only */ *input_tensor = interpreter->tensor(input); auto data_type = (*input_tensor)->type; if (isVerbose) { const std::vector<int> inputs = interpreter->inputs(); const std::vector<int> outputs = interpreter->outputs(); LOG(INFO) << "input: " << inputs[0] << "\r\n"; LOG(INFO) << "number of inputs: " << inputs.size() << "\r\n"; LOG(INFO) << "number of outputs: " << outputs.size() << "\r\n"; LOG(INFO) << "tensors size: " << interpreter->tensors_size() << "\r\n"; LOG(INFO) << "nodes size: " << interpreter->nodes_size() << "\r\n"; LOG(INFO) << "inputs: " << interpreter->inputs().size() << "\r\n"; LOG(INFO) << "input(0) name: " << interpreter->GetInputName(0) << "\r\n"; int t_size = interpreter->tensors_size(); for (int i = 0; i < t_size; i++) { if (interpreter->tensor(i)->name) { LOG(INFO) << i << ": " << interpreter->tensor(i)->name << ", " << interpreter->tensor(i)->bytes << ", " << interpreter->tensor(i)->type << ", " << interpreter->tensor(i)->params.scale << ", " << interpreter->tensor(i)->params.zero_point << "\r\n"; } } LOG(INFO) << "\r\n"; } } /*! * @brief Runs inference input buffer and print result to console * * @param pointer to image data * @param image data length * @param pointer to labels string array * @param reference to flat buffer model * @param reference to interpreter * @param pointer to input tensor */ void RunInference(const uint8_t* image, size_t image_len, const std::string* labels, std::unique_ptr<tflite::FlatBufferModel> &model, std::unique_ptr<tflite::Interpreter> &interpreter, TfLiteTensor* input_tensor) { /* Copy image to tensor. */ memcpy(input_tensor->data.uint8, image, image_len); /* Do inference on static image in first loop. */ auto start = GetTimeInUS(); if (interpreter->Invoke() != kTfLiteOk) { LOG(FATAL) << "Failed to invoke tflite!\r\n"; return; } auto end = GetTimeInUS(); const float threshold = (float)DETECTION_TRESHOLD /100; std::vector<std::pair<float, int>> top_results; int output = interpreter->outputs()[0]; TfLiteTensor *output_tensor = interpreter->tensor(output); TfLiteIntArray* output_dims = output_tensor->dims; // assume output dims to be something like (1, 1, ... , size) auto output_size = output_dims->data[output_dims->size - 1]; /* Find best image candidates. */ GetTopN<uint8_t>(interpreter->typed_output_tensor<uint8_t>(0), output_size, 1, threshold, &top_results, false); if (!top_results.empty()) { auto result = top_results.front(); const float confidence = result.first; const int index = result.second; if (confidence * 100 > DETECTION_TRESHOLD) { LOG(INFO) << "----------------------------------------\r\n"; LOG(INFO) << " Inference time: " << (end - start) / 1000 << " ms\r\n"; LOG(INFO) << " Detected: " << std::setw(10) << labels[index] << " (" << (int)(confidence * 100) << "%)\r\n"; LOG(INFO) << "----------------------------------------\r\n\r\n"; } } } /*! * @brief Main function */ int main(void) { const std::string labels[] = {"T-shirt/top", "Trouser","Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"}; /* Init board hardware. */ BOARD_ConfigMPU(); BOARD_InitPins(); BOARD_BootClockRUN(); BOARD_InitDebugConsole(); InitTimer(); std::unique_ptr<tflite::FlatBufferModel> model; std::unique_ptr<tflite::Interpreter> interpreter; TfLiteTensor* input_tensor = 0; InferenceInit(model, interpreter, &input_tensor, false); LOG(INFO) << "Fashion MNIST object recognition example using a TensorFlow Lite model.\r\n"; LOG(INFO) << "Detection threshold: " << DETECTION_TRESHOLD << "%\r\n"; /* Run inference on static ship image. */ LOG(INFO) << "\r\nStatic data processing:\r\n"; RunInference((uint8_t*)imageData, (size_t)LENET_MNIST_INPUT_SIZE, labels, model, interpreter, input_tensor); while(1) {} } Testing result After deploying the model in the demo project, then we'll run this demo on the MIMXRT1060 (Fig 12) board for testing. Fig 12 Run the below code to covert the Fashion MNIST image to text The process_image() function can convert a Fashion MNIST image to an include file as static data, then include this file in the demo project. def process_image(image, output_path, num_batch=1): img_data = np.transpose(image, (2, 0, 1)) # Repeat image for batch processing (resulting tensor is NCHW or NHWC) img_data = np.reshape(img_data, (num_batch, img_data.shape[0], img_data.shape[1], img_data.shape[2])) img_data = np.repeat(img_data, num_batch, axis=0) img_data = np.reshape(img_data, (num_batch, img_data.shape[1], img_data.shape[2], img_data.shape[3])) # Serialize image batch img_data_bytes = bytearray(img_data.tobytes(order='C')) image_bytes_per_line = 20 with open(output_path, 'wt') as f: idx = 0 for byte in img_data_bytes: f.write('0X%02X, ' % byte) if idx % image_bytes_per_line == (image_bytes_per_line - 1): f.write('\n') idx = idx + 1 # Return serialized image size return len(img_data_bytes)      2. Run the demo project on board.
View full article
1. Abstract     When the customer uses the NXP RT board to debug the code, sometimes, the board   suddenly meets debug connect issues when using the IDE and the debugger to download the code. Especially to the customer who is using the NXP RT EVK board with the onboard default CMSIS-DAP debugger. They even suspect the board is broken after trying a lot of times.     The debugger connects issues that normally happen when using the wrong FDCB, the download process is ended abnormally, the wrong flash loader, or the abnormal app code in the flash, .etc.     The reported issue result will be similar to the following pictures: Fig.1 Fig.2     The connect log maybe:       No connection to chip’s debug port       Error: Wire Ack Wait Fault     This document will give some methods to recover the board when meeting the debugger issues with the typical MIMXRT1060-EVK board +MCUXpresso IDE as the test platform. Other platform is also similar, the recovery method also can be used. 2. RT board recovery method     The main method is to change the RT board to the serial download mode, bring the core to a known state, then do the mass erase in IDE or with the MCUXpresso Secure Provisioning Tool(SPT).     First, let the board enter the serial download mode: Fig.3     Enter serial download mode:     1) SW7: 1-OFF,2-OFF,3-OFF,4-ON     2) Power off and power on again, or Press Reset button 2.1 IDE Mass Erase     MCUXpresso IDE, choose the related debugger interface, then choose “erase flash action”.     Here take the MIMXRT1060-EVK on board default debugger CMSIS DAP as an example: Fig.4 Fig.5    After this operation, when the Board is back to the internal boot again, the debugger can download to the flash again. 2.2 SPT Mass Erase     Customers also can use the NXP MCUXpresso Secure Provisioning Tool to do the code download or the mass erase in the serial download mode, this method also can recover the board, in fact, just let the core in the known status.     SPT tool download link: https://www.nxp.com/design/software/development-software/mcuxpresso-secure-provisioning-tool:MCUXPRESSO-SECURE-PROVISIONING     After installing the SPT tool, open it.     1) Create one RT1060 workspace Fig.6     2) Connect the board with USB or the UART     Here, take USB interface as an example. Fig.7 Fig. 8 Fig.9 Fig.10 Fig.11       Until now, the external flash is erased!     In the SPT, the customer also can double check the memory, especially whether the FDCB area is erased or not, just like in the following picture: Fig.12     The customer can go back to the internal boot mode and use the debugger to download the code again.       Internal boot mode:           SW7:1-OFF,2-OFF,3-ON,4-OFF      Press reset or power off and power on again to enter the internal boot mode, then use the debugger to test it again, this is the result: Fig.13     We can see, the MIMXRT1060-EVK debugger interface is recovered! 3. Conclusion     When the flash contains an app that is abnormal(access memory does not exist, memory is corrupted, misconfiguration of the clocks, etc.), it will cause the board to end up in an unknown state, then the debugger can’t take control over the core. But, when put the core in serial downloader mode, then it will put the core in a known state, this way, the debugger will be able to take control of the core.     So, when meeting the debugger issues in the RT board, try to mass erase the external flash in serial download mode, then it will recover the board debugger to a normal situation.
View full article
LittleFS is a file system used for microcontroller internal flash and external NOR flash. Since it is more suitable for small embedded systems than traditional FAT file systems, more and more people are using it in their projects. So in addition to NOR/NAND flash type storage devices, can LittleFS be used in SD cards? It seems that it is okay too. This article will use the littlefs_shell and sdcard_fatfs demo project in the i.mxRT1050 SDK to make a new littefs_shell project for reading and writing SD cards. This experiment uses MCUXpresso IDE v11.7, and the SDK uses version 2.13. The littleFS file system has only 4 files, of which the current version shown in lfs.h is littleFS 2.5. The first step, of course, is to add SD-related code to the littlefs_shell project. The easiest way is to import another sdcard_fatfs project and copy all of the sdmmc directories into our project. Then copy sdmmc_config.c and sdmmc_config.h in the /board directory, and fsl_usdhc.c and fsl_usdhc.h in the /drivers directory. The second step is to modify the program to include SD card detection and initialization, adding a bridge from LittleFS to SD drivers. Add the following code to littlefs_shell.c. extern sd_card_t m_sdCard; status_t sdcardWaitCardInsert(void) { BOARD_SD_Config(&m_sdCard, NULL, BOARD_SDMMC_SD_HOST_IRQ_PRIORITY, NULL); /* SD host init function */ if (SD_HostInit(&m_sdCard) != kStatus_Success) { PRINTF("\r\nSD host init fail\r\n"); return kStatus_Fail; } /* wait card insert */ if (SD_PollingCardInsert(&m_sdCard, kSD_Inserted) == kStatus_Success) { PRINTF("\r\nCard inserted.\r\n"); /* power off card */ SD_SetCardPower(&m_sdCard, false); /* power on the card */ SD_SetCardPower(&m_sdCard, true); // SdMmc_Init(); } else { PRINTF("\r\nCard detect fail.\r\n"); return kStatus_Fail; } return kStatus_Success; } status_t sd_disk_initialize() { static bool isCardInitialized = false; /* demostrate the normal flow of card re-initialization. If re-initialization is not neccessary, return RES_OK directly will be fine */ if(isCardInitialized) { SD_Deinit(&m_sdCard); } if (kStatus_Success != SD_Init(&m_sdCard)) { SD_Deinit(&m_sdCard); memset(&m_sdCard, 0U, sizeof(m_sdCard)); return kStatus_Fail; } isCardInitialized = true; return kStatus_Success; } In main(), add these code if (sdcardWaitCardInsert() != kStatus_Success) { return -1; } status = sd_disk_initialize(); Next, create two new c files, lfs_sdmmc.c and lfs_sdmmc_bridge.c. The call order is littlefs->lfs_sdmmc.c->lfs_sdmmc_bridge.c->fsl_sd.c. lfs_sdmmc.c and lfs_sdmmc_bridge.c acting as intermediate layers that can connect the LITTLEFS and SD upper layer drivers. One of the things that must be noted is the mapping of addresses. The address given by littleFS is the block address + offset address. See figure below. This is a read command issued by the ‘mount’ command. The block address refers to the address of the erased sector address in SD. The read and write operation uses the smallest read-write block address (BLOCK) of SD, as described below. Therefore, in lfs_sdmmc.c, the address given by littleFS is first converted to the byte address. Then change the SD card read-write address to the BLOCK address in lfs_sdmmc_bridge.c. Since most SD cards today exceed 4GB, the byte address requires a 64-bit variable. Finally, the most important step is littleFS parameter configuration. There is a structure LittlsFS_config in peripherals.c, which contains not only the operation functions of the SD card, but also the read and write sectors and cache size. The setup of this structure is critical. If the setting is not good, it will not only affect the performance, but also cause errors in operation. Before setting it up, let's introduce some of the general ideal of SD card and littleFS. The storage unit of the SD card is BLOCK, and both reading and writing can be carried out according to BLOCK. The size of each block can be different for different cards. For standard SD cards, the length of the block command can be set with CMD16, and the block command length is fixed at 512 bytes for SDHC cards. The SD card is erased sector by sector. The size of each sector needs to be checked in the CSD register of the SD card. If the CSD register ERASE_BLK_EN = 0, Sector is the smallest erase unit, and its unit is "block". The value of sector size is equal to the value of the SECTOR_SIZE field in the CSD register plus 1. For example, if SECTOR_SIZE is 127, then the minimum erase unit is 512*(127+1)=65536 bytes. In addition, sometimes there are doubts, many of the current SD cards actually have wear functions to reduce the loss caused by frequent erasing and writing, and extend the service life. So in fact, delete operations or read and write operations are not necessarily real physical addresses. Instead, it is mapped by the SD controller. But for the user, this mapping is transparent. So don't worry about this affecting normal operation. LittleFS is a lightweight file system that has power loss recovery and dynamic wear leveling compared to FAT systems. Once mounted, littleFS provides a complete set of POSIX-like file and directory functions, so it can be operated like a common file system. LittleFS has only 4 files in total, and it basically does not need to be modified when used. Since the NOR/NAND flash to be operated by LittleFS is essentially a block device, in order to facilitate use, LittleFS is read and written in blocks, and the underlying NOR/NAND Flash interface drivers are carried out in blocks. Let's take a look at the specific content of LittleFS configuration parameters. const struct lfs_config LittleFS_config = { .context = (void*)0, .read = lfs_sdmmc_read, .prog = lfs_sdmmc_prog, .erase = lfs_sdmmc_erase, .sync = lfs_sdmmc_sync, .read_size = 512, .prog_size = 512, .block_size = 65536, .block_count = 128, .block_cycles = 100, .cache_size = 512, .lookahead_size = LITTLEFS_LOOKAHEAD_SIZE }; Among them, the first item (.context) is not used in this project, and is used in the original project to save the offset of the file system stored in Flash. Items two (.read) through five (.sync) point to the handlers for each operation. The sixth item (.read_size) is the smallest unit of read operation. This value is roughly equal to the BLOCK size of the SD card. In the SD card driver, this size has been fixed to 512. So for convenience, it is also set to 512. The seventh item (.prog_size) is the number of bytes written each time, which is 512 bytes like .read_size. The eighth item is .block_size. This can be considered to be the smallest erase block supported by the SD card when performing an erase operation. Here the default value is not important, you need to set it in the program according to the actual value after the SD card is initialized. The card used in this experiment is 64k bytes as an erase block, so 65536 is used directly here. Item 9 (.block_count) is used to indicate how many erasable blocks there are. Multiply the .block_size to get the size of the card. If the card is replaceable, it needs to be determined according to the parameters after the SD card is initialized. The tenth item (.block_cycles) is the number of erase cycles per block. Item 11 (.cache_size) is about the cache buffer. It feels like bigger is better, but actually modifies this value won't work. So still 512. Item 12 (lookahead_size), littleFS uses a lookahead buffer to manage and allocate blocks. A lookahead buffer is a fixed-size bitmap that records information about block allocations within an area. The lookahead buffer only records the information of block allocations in one area, and when you need to know the allocation of other regions, you need to scan the file system to find allocated blocks. If there are no free blocks in the lookahead buffer, you need to move the lookahead buffer to find other free blocks in the file system. The lookahead buffer position shifts one lookahead_size at a time. Use the original value here.  That’s all for the porting work. We can test the project now. You can see it works fine. The littleFS-SD project can read/write/create folder and erase. And it also support append to an exist file. But after more testing, a problem was found, if you repeatedly add->-close->-add-> close a file, the file will open more and more slowly, even taking a few seconds. This is what should be added and is not written directly in the last block of the file, but will apply for a new block, regardless of whether the previous block is full or not. See figure below. The figure above prints out all the read, write, and erase operations used in each write command. You can see that each time in the lfs_file_open there is one more read than the last write operation. In this way, after dozens or hundreds of cycles, a file will involve many blocks. It is very time-consuming to read these blocks in turn. Tests found that more than 100 read took longer than seconds. To speed things up, it is recommended to copy the contents of one file to another file after adding it dozens of times. In this way, the scattered content will be consolidated to write a small number of blocks. This can greatly speed up reading and writing.
View full article