i.MX RT Knowledge Base

RT106L_S voice control system based on the Baidu cloud

1 Introduction
The NXP RT106L and RT106S are voice recognition chips used for offline, local voice control. SLN-LOCAL-IOT is based on the RT106L, while SLN-LOCAL2-IOT is a newer local speech recognition board based on the RT106S. The board includes the Murata 1DX Wi-Fi/BLE module, the AFE (voice analog front end), the ASR recognition engine, external flash, two microphones, and an analog audio amplifier with speakers. The voice recognition flow differs between SLN-LOCAL-IOT and SLN-LOCAL2-IOT, and the newer SLN-LOCAL2-IOT is recommended.
This article uses the SLN-LOCAL/2-IOT voice control board to implement the functions in the following block diagram: Pic 1
The PC-side speech model tool (Cyberon DSMT) generates the WW (wake word) and VC (voice command) voice engine binary files, which are then used by the demo code. The system targets Chinese word recognition: when the user says the Chinese wake word "小恩小恩", the SLN-LOCAL/2-IOT wakes up and replies "小恩来了,请吩咐" ("Xiao En is here, please give your command"). The system then enters the voice recognition stage, where the user can say one of the voice commands: "开红灯" (red LED on), "关红灯" (red LED off), "开绿灯" (green LED on), "关绿灯" (green LED off), "灯闪烁" (LED blink), "开远程灯" (remote LED on), "关远程灯" (remote LED off). After recognition, the board replies "好的" ("OK"). The first five commands switch the local LEDs, while the two remote commands control the LED of an additional MIMXRT1060-EVK development board through the Baidu cloud. The SLN-LOCAL/2-IOT accesses the Internet through its Wi-Fi module and communicates with the Baidu cloud over the MQTT protocol: when a remote control command is detected, it publishes a JSON packet to the Baidu cloud, while the MIMXRT1060-EVK subscribes to the Baidu cloud data, receives the data published by the IoT board, and parses it to control the LED on the EVK board. On the PC side, the MQTT.fx tool can subscribe to the Baidu cloud data and can also send data directly to the device for remote control.
The rest of this article describes in detail how to use the SLN-LOCAL/2-IOT SDK demo to realize customized Chinese wake words and voice commands, and how to remotely control the MIMXRT1060-EVK through the Baidu cloud.

2 Platform setup
2.1 Used platform
SLN-LOCAL-IOT/SLN-LOCAL2-IOT
MIMXRT1060-EVK
MQTT.fx
SDK_2_8_0_SLN-LOCAL2-IOT
MCUXpresso IDE
Segger JLINK
Baidu Smart Cloud: Baidu cloud control + TTS
Audacity: audio file format conversion tool
WAVToCode: converts WAV files to C arrays, used for the demo title playback
MCUBootUtility: used to burn the feedback audio files to the filesystem
Cyberon DSMT: wake word and voice command generation tool
DSMT is the key tool for wake word and voice command detection; the process to apply for the tool is shown in: Pic 2

2.2 Baidu Smart Cloud
2.2.1 Baidu cloud IoT control system
Enter the IoT Hub: https://cloud.baidu.com/product/iot.html and click "Use now".
2.2.1.1 Create device project
Create a project, select the device type, and enter the project name. Device-type projects can use shadows as cloud-side images of the devices, so you can see directly how the data changes. Once created, an endpoint is generated along with the corresponding address: Pic 3
2.2.1.2 Create Thing model
The Thing model mainly establishes the properties needed in the shadow, such as temperature, humidity, and other variables, together with their value types; these are in fact the JSON items exchanged in the actual MQTT communication.
Click the newly created device-type project, where you can create a new Thing model or shadow: Pic 4
Here, create three attributes: LEDstatus, humid, temp. They represent the LED status, humidity, and temperature, and make communication and control between the cloud and the RT board convenient. Once created, you get the following picture: Pic 5
2.2.1.3 Create Thing shadow
In the device-type project, select the shadow, build your own shadow platform, enter the name, and select the newly created Thing model containing the three properties as the object model. After creation, you can see the details of the shadow: Pic 6
At the same time, the shadow-related address, name, and key are generated. My test platform is as follows:
TCP Address: tcp://rndrjc9.mqtt.iot.gz.baidubce.com:1883
SSL Address: ssl://rndrjc9.mqtt.iot.gz.baidubce.com:1884
WSS Address: wss://rndrjc9.mqtt.iot.gz.baidubce.com:443
name: rndrjc9/RT1060BTCDShadow
key: y92ewvgjz23nzhgn
Port 1883 does not support transmission data encryption.
Port 1884 supports SSL/TLS encrypted transmission.
Port 8884 supports websocket-style connections and also includes SSL encryption.
This article uses port 1883 with no transmission encryption for easy testing. At this point the Baidu cloud device-type project and shadow are complete, and the MQTT.fx tool can be used to connect and test. In practice, it is recommended that customers build their own Baidu cloud connection; the user key above is for reference only.

2.2.2 Online TTS
When the SLN-LOCAL/2-IOT board recognizes the wake word or a voice command, or when it powers on, it needs to play the corresponding audio, such as "百度云端语音测试demo" ("Baidu cloud voice test demo"), "小恩来啦!请吩咐" ("Xiao En is here! Please give your command"), and "好的" ("OK"). These phrases need to be synthesized from text to WAV audio files; here, Baidu Smart Cloud's online TTS function is used. For the specific operation, refer to the following document: https://ai.baidu.com/ai-doc/SPEECH/jk38y8gno
Once the basic audio library is enabled, use the main.py provided in the link above, modify the "TEXT" field to the Chinese string you want to convert, and set "save_file" to the audio file name to be generated, such as xxx.wav. Run the command python main.py to complete the conversion and generate the audio file corresponding to the text, such as .mp3 or .wav. Pic 7
The generated WAV file cannot be used directly. Note that the SLN-LOCAL/2-IOT board requires an audio source with a 48 kHz sample rate and 16-bit samples, so the Audacity audio tool is needed to convert the file format to 48 kHz/16-bit WAV. Import the 16 kHz/16-bit WAV files generated by Baidu TTS into Audacity, set the project rate to 48 kHz, choose File->Export->Export as WAV, select "signed 16-bit PCM" encoding, and regenerate a 48 kHz/16-bit WAV for use. Pic 8
"百度云端语音测试demo": used for the power-on announcement and the demo name announcement; it is stored in the RT demo code, so it needs to be converted to a 16-bit C array and added to the project.
"小恩来啦!请吩咐", "好的": voice detection feedback, saved in the filesystem ZH01 and ZH02 areas.

2.3 Playback audio data preparation and burning
There are two playback audio files, "小恩来啦!请吩咐" and "好的"; they are saved in the filesystem ZH01 and ZH02 areas.
The filesystem memory map is as follows: Pic 9
So the 48 kHz/16-bit WAV files need to be converted to the format required by the filesystem, using the official tool Ivaldi_sln_local2_iot. Reference document: SLN-LOCAL2-IOT-DG, chapter 10.1 "Generating filesystem-compatible files". In a bash shell, enter the commands as in the following picture: Pic 10
Use the conversion command to get the playback bin file:
python file_format.py -if xiaoencoming_48k16bit.wav -of xiaoencoming_48k16bit.bin -ft H
This generates the files:
"小恩来啦!请吩咐" -> xiaoencoming_48k16bit.bin, burned to flash address 0x6184_0000
"好的" -> OK_48k16bit.bin, burned to flash address 0x6180_0000
Then use the MCUBootUtility tool to burn the two files to the related images. Taking OK_48k16bit.bin as an example, put the board into serial download mode (J27-0) and power it off and on. Select the hyper flash IS26KSXXS as the flash chip, then use the boot device memory window and the write button to burn the .bin file to the related address; the length is 0x40000. Pic 11 Pic 12
xiaoencoming_48k16bit.bin can be downloaded to 0x6184_0000 with the same method; the length is also 0x40000.

2.4 Demo audio preparation and integration
The prepared baiduclouddemo_48K16bit.wav ("百度云端语音测试demo") needs to be converted to a 16-bit C array and added to the project code so that it can be called from the code; it is used for the demo mode playback. The conversion uses WAVToCode, as shown here: Pic 13
Add the generated baiducloulddemo_48K16bit.c to the demo project C files: sln_local_iot_local_demo->audio->demos->smart_home.c.

2.5 WW and VC preparation
The wake word is generated with the Cyberon DSMT tool, which supports a wide range of languages; customers can request the tool through the flow in Pic 2. The Chinese wake word and voice command words in this article are also generated with DSMT. A DSMT project can have multiple groups: group 1 is the wake word configuration with CmdMapID = 1, and the other groups hold voice command words, such as CMD_IOT in this article with CmdMapID = 2. Pic 14 Pic 15
Wake word detection continuously scans the input audio stream using group 1; after a successful wake-up, voice command detection uses group 2 (or other recognition groups, including custom groups). The wake word configuration in the DSMT tool is as follows: Pic 16
The WW group can support more words; customers can add the needed ones to group 1. The VC configuration in DSMT looks like this: Pic 17
Then save the project. The files used by the code are _witMapID.bin, CMD_IOT.xml, and WW.xml. Among the generated files, CYBase.mod is the base model, WW.mod is the WW model, and CMD_IOT.mod is the VC model. After the steps shown in Pic 16 and Pic 17, the WW and VC command preparation is finished, and the DSMT project can be put into the RT106S demo project folder: sln_local2_iot_local_demo\local_voice\oob_demo_zh

3 Code preparation
Based on the official SLN-LOCAL2-IOT SDK local_demo, the code in this article modifies the Chinese wake words and recognition words (or you can directly build a new customer-defined group), and adds LED operations driven by the local voice detection, Chinese feedback audio, the Chinese demo audio, the Wi-Fi/MQTT network communication code, and publishing to the Baidu cloud shadow connection.
Source reference code SDK paths:
SDK_2_8_0_SLN-LOCAL2-IOT\boards\sln_local2_iot\sln_voice_examples\local_demo
SDK_2_8_0_SLN-LOCAL2-IOT\boards\sln_local2_iot\sln_boot_apps
The SLN-LOCAL2-IOT and SLN-LOCAL-IOT code are nearly the same; the only difference is the ASR library: the RT106S (SLN-LOCAL2-IOT) uses the SDK's own libsln_asr.a library, while the RT106L (SLN-LOCAL-IOT) needs the corresponding libsln_asr_eval.a library.
Importing the code requires three projects: local_demo, bootloader, and bootstrap. The three projects are stored in different memory spaces; see SLN-LOCAL2-IOT-DG.pdf, chapter 3.3 "Device memory map".
This is the boot process of the three projects on the chip: Pic 18
This document is for demo testing and requires debugging, so the encryption mechanism is turned off: configure the bootloader and bootstrap projects with the macro definition DISABLE_IMAGE_VERIFICATION = 1, and use a JLINK connected to the SLN-LOCAL/2-IOT's SWD interface to burn the code. The following modifications are added to the app local_demo project.

3.1 sln-local/2-iot code
The following modifications are the same for the SLN-LOCAL-IOT and SLN-LOCAL2-IOT platforms.
3.1.1 Voice recognition related code
1) Demo audio play
Playback content: "百度云端语音测试demo"
The content of sln_local2_iot_local_demo_xe_ledwifi\audio\demos\smart_home.c is replaced by the previously generated baiducloulddemo_48K16bit.c. In audio_samples.h, modify:
#define SMART_HOME_DEMO_CLIP_SIZE 110733
This clip is played by the announce_demo API in main.c:
        case ASR_CMD_IOT:
            ret = demo_play_clip((uint8_t *)smart_home_demo_clip, sizeof(smart_home_demo_clip));
2) Command print information
#define NUMBER_OF_IOT_CMDS      7
In IndexCommands.h:
static char *cmd_iot_en[] = {"Red led on", "Red led off", "Green led on", "Green led off",
                             "cycle led", "remote led on", "remote led off"};
static char *cmd_iot_zh[] = {"开红灯", "关红灯", "开绿灯", "关绿灯", "灯闪烁", "开远程灯", "关远程灯"};
This article modifies the existing IOT group in the source code; you can also add your own speech recognition group directly and add the related command identifiers.
3) sln_local_voice.c
At line 757, add the LED-related notification in ASR_CMD_IOT mode:
oob_demo_control.ledCmd = g_asrControl.result.keywordID[1];
This code obtains the recognized VC command data: the value of keywordID[1] is the command index, which tells the code exactly which voice command was detected, so that the app can act on the value of ledCmd. The value of keywordID[1] corresponds to the Command List in Pic 17. For example, for "开远程灯": if the board is woken up and recognizes "开远程灯", keywordID[1] is 5, and this value is transferred to oob_demo_control.ledCmd, which is then used in the appTask API to perform the actual control.
4) main.c
void appTask(void *arg)
Under case kCommandGeneric:, if the language is Chinese, add the recognition-related control code. It first plays the feedback "好的", then checks the detected command value and performs the related local LED control.
else if (oob_demo_control.language == ASR_CHINESE)
{
    // play audio "OK" in Chinese
#if defined(SLN_LOCAL2_RD)
    ret = audio_play_clip((uint8_t *)AUDIO_ZH_01_FILE_ADDR, AUDIO_ZH_01_FILE_SIZE);
#elif defined(SLN_LOCAL2_IOT)
    ret = audio_play_clip(AUDIO_ZH_01_FILE);
#endif
    //kerry add operation code==================================================begin
    RGB_LED_SetColor(LED_COLOR_OFF);
    if (oob_demo_control.ledCmd == LED_RED_ON)
    {
        RGB_LED_SetColor(LED_COLOR_RED);
        vTaskDelay(5000);
    }
    else if (oob_demo_control.ledCmd == LED_RED_OFF)
    {
        RGB_LED_SetColor(LED_COLOR_OFF);
        vTaskDelay(5000);
    }
    else if (oob_demo_control.ledCmd == LED_BLUE_ON)
    {
        RGB_LED_SetColor(LED_COLOR_BLUE);
        vTaskDelay(5000);
    }
    else if (oob_demo_control.ledCmd == LED_BLUE_OFF)
    {
        RGB_LED_SetColor(LED_COLOR_OFF);
        vTaskDelay(5000);
    }
    else if (oob_demo_control.ledCmd == CYCLE_SLOW)
    {
        for (int i = 0; i < 3; i++)
        {
            RGB_LED_SetColor(LED_COLOR_RED);
            vTaskDelay(400);
            RGB_LED_SetColor(LED_COLOR_OFF);
            RGB_LED_SetColor(LED_COLOR_GREEN);
            vTaskDelay(400);
            RGB_LED_SetColor(LED_COLOR_OFF);
            RGB_LED_SetColor(LED_COLOR_BLUE);
            vTaskDelay(400);
        }
    }
    …
}
In addition to the local voice recognition control, this article also adds a remote control function. The board connects over Wi-Fi and uses the MQTT protocol to reach the Baidu cloud server: when the local speech recognition detects a remote control command, it publishes the corresponding control message to the Baidu cloud, and the cloud forwards the message to the client that subscribes to it; after receiving the message, that client performs the related control according to the message content.

3.1.3 Network connection code
1) sln_local2_iot_local_demo_xe_ledwifi\lwip\src\apps\mqtt
Add mqtt.c
2) sln_local2_iot_local_demo_xe_ledwifi\lwip\src\include\lwip\apps
Add mqtt.h, mqtt_opts.h, mqtt_prv.h
The MQTT driver is taken from the RT1060 SDK and is already added in the attached project.
3) sln_tcp_server.c
Add the MQTT application layer API code: client ID, server host, MQTT server port number, user name, password, subscription topic, publishing topic and data, etc.; for more details, check the attached code.
The MQTT application code is ported from the mqtt example of the RT1060 SDK and added to sln_tcp_server.c. The TCP_OTA_Server function initializes the Wi-Fi network, establishes the Wi-Fi connection, resolves the Baidu cloud server URL to an IP address, and then connects to the Baidu cloud server over MQTT. After a successful connection, it first publishes a message, so that after power-up you can check with MQTT.fx whether the power-on publishing succeeded.
The TCP_OTA_Server function code is as follows:
static void TCP_OTA_Server(void *param) //kerry consider add mqtt related code
{
    err_t err      = ERR_OK;
    uint8_t status = kCommon_Failed;
#if USE_WIFI_CONNECTION
    /* Start the WiFi and connect to the network */
    APP_NETWORK_Init();
    while (status != kCommon_Success)
    {
        status_t statusConnect;
        statusConnect = APP_NETWORK_Wifi_Connect(true, true);
        if (WIFI_CONNECT_SUCCESS == statusConnect)
        {
            status = kCommon_Success;
        }
        else if (WIFI_CONNECT_NO_CRED == statusConnect)
        {
            APP_NETWORK_Uninit();
            /* If there are no credentials in flash, delete the TCP server task */
            vTaskDelete(NULL);
        }
        else
        {
            status = kCommon_Failed;
        }
    }
#endif
#if USE_ETHERNET_CONNECTION
    APP_NETWORK_Init(true);
#endif
    /* Wait for wifi/eth to connect */
    while (0 == get_connect_state())
    {
        /* Give time to the network task to connect */
        vTaskDelay(1000);
    }
    configPRINTF(("TCP server start\r\n"));
    configPRINTF(("MQTT connection start\r\n"));
    mqtt_client = mqtt_client_new();
    if (mqtt_client == NULL)
    {
        configPRINTF(("mqtt_client_new() failed.\r\n"));
        while (1)
        {
        }
    }
    if (ipaddr_aton(EXAMPLE_MQTT_SERVER_HOST, &mqtt_addr) && IP_IS_V4(&mqtt_addr))
    {
        /* Already an IP address */
        err = ERR_OK;
    }
    else
    {
        /* Resolve MQTT broker's host name to an IP address */
        configPRINTF(("Resolving \"%s\"...\r\n", EXAMPLE_MQTT_SERVER_HOST));
        err = netconn_gethostbyname(EXAMPLE_MQTT_SERVER_HOST, &mqtt_addr);
        configPRINTF(("Resolving status: %d.\r\n", err));
    }
    if (err == ERR_OK)
    {
        configPRINTF(("connect to mqtt\r\n"));
        /* Start connecting to MQTT broker from tcpip_thread */
        err = tcpip_callback(connect_to_mqtt, NULL);
        configPRINTF(("connect status: %d.\r\n", err));
        if (err != ERR_OK)
        {
            configPRINTF(("Failed to invoke broker connection on the tcpip_thread: %d.\r\n", err));
        }
    }
    else
    {
        configPRINTF(("Failed to obtain IP address: %d.\r\n", err));
    }
    int i = 0;
    /* Publish some messages */
    for (i = 0; i < 5;)
    {
        configPRINTF(("connect status enter: %d.\r\n", connected));
        if (connected)
        {
            err = tcpip_callback(publish_message_start, NULL);
            if (err != ERR_OK)
            {
                configPRINTF(("Failed to invoke publishing of a message on the tcpip_thread: %d.\r\n", err));
            }
            i++;
        }
        sys_msleep(1000U);
    }
    vTaskDelete(NULL);
}
Please note that the following JSON data cannot be published directly from the code:
{
  "reported": {
    "LEDstatus": false,
    "humid": 88,
    "temp": 22
  }
}
It needs to be compressed and escaped, for example with https://www.bejson.com/:
{\"reported\" : {     \"LEDstatus\" : true,     \"humid\" : 88,     \"temp\" : 11    } }
4) main appTask
Under case kCommandGeneric:, if the language is Chinese, add the corresponding voice recognition control code.
"开远程灯": turn on the local yellow LED and publish the "remote led on" MQTT message to the Baidu cloud to switch on the LED of the remote 1060-EVK board.
"关远程灯": turn on the local white LED and publish the "remote led off" MQTT message to the Baidu cloud to switch off the LED of the remote 1060-EVK board.
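For reference, the escaped payload can simply be kept as a C string constant in the application. The following is a minimal sketch only: the names publish_payload_on and publish_led_on are assumptions for illustration and are not part of the original demo, while mqtt_publish() is the standard lwIP MQTT client API.
#include <string.h>
#include "lwip/apps/mqtt.h"

/* Escaped JSON payload; the string content matches the "reported" shadow document used in this article. */
static const char publish_payload_on[] =
    "{\"reported\":{\"LEDstatus\":true,\"humid\":88,\"temp\":11}}";

/* Hypothetical publish helper, to be called from tcpip_thread context (e.g. via tcpip_callback). */
static void publish_led_on(mqtt_client_t *client)
{
    /* Publish topic taken from the Baidu cloud shadow "interaction" page (see section 4.1.2). */
    mqtt_publish(client, "$baidu/iot/shadow/RT1060BTCDShadow/update",
                 publish_payload_on, (u16_t)strlen(publish_payload_on),
                 0 /* qos */, 0 /* retain */, NULL /* request callback */, NULL /* arg */);
}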
The related operation code is:
else if (oob_demo_control.ledCmd == LED_REMOTE_ON)
{
    RGB_LED_SetColor(LED_COLOR_YELLOW);
    vTaskDelay(5000);
    err_t err = ERR_OK;
    err = tcpip_callback(publish_message_on, NULL);
    if (err != ERR_OK)
    {
        configPRINTF(("Failed to invoke publishing of a message on the tcpip_thread: %d.\r\n", err));
    }
}
else if (oob_demo_control.ledCmd == LED_REMOTE_OFF)
{
    RGB_LED_SetColor(LED_COLOR_WHITE);
    vTaskDelay(5000);
    err_t err = ERR_OK;
    err = tcpip_callback(publish_message_off, NULL);
    if (err != ERR_OK)
    {
        configPRINTF(("Failed to invoke publishing of a message on the tcpip_thread: %d.\r\n", err));
    }
}

3.2 MIMXRT1060-EVK code
The MIMXRT1060-EVK code acts as another client of the cloud: it subscribes to the messages published by the SLN-LOCAL/2-IOT when a remote command is detected, and then drives the LED on the board, which is used to test the voice recognition remote control function. This code is based on Ethernet: through the Ethernet port on the board it joins the network, then uses MQTT to connect to the Baidu cloud and subscribes to the messages from the LOCAL2 board, which enables reception and execution of the LOCAL2 commands. The network code is similar to the SLN-LOCAL2-IOT board network code; the server, cloud account, passwords, etc. are all the same, and the main job is to subscribe to messages. See the lwip_mqtt_freertos.c file in the attached RT1060 code.
When data published by the server is received, it needs to be parsed to get the LED status, which is then used to control the LED. Normal data from the Baidu cloud shadow looks like this:
Received 253 bytes from the topic "$baidu/iot/shadow/RT1060BTCDShadow/update/accepted": "{"requestId":"2fc0ca29-63c0-4200-843f-e279e0f019d3","reported":{"LEDstatus":false,"humid":44,"temp":33},"desired":{},"lastUpdatedTime":{"reported":{"LEDstatus":1635240225296,"humid":1635240225296,"temp":1635240225296},"desired":{}},"profileVersion":159}"
The LEDstatus value (false or true) then needs to be parsed out of the received data. Because the amount of data is small, no JSON parser is used here, just plain string parsing; the following code is added to the mqtt_incoming_data_cb function:
mqtt_rec_data.mqttindex = mqtt_rec_data.mqttindex + len;
if (mqtt_rec_data.mqttindex >= 250)
{
    PRINTF("kerry test \r\n");
    PRINTF("idex= %d", mqtt_rec_data.mqttindex);
    datap = strstr((char*)mqtt_rec_data.mqttrecdata, "LEDstatus");
    if (datap != NULL)
    {
        if (!strncmp(datap + 11, strtrue, 4)) // char strtrue[] = "true";
        {
            GPIO_PinWrite(GPIO1, 3, 1U); // pull high
            PRINTF("\r\ntrue");
        }
        else if (!strncmp(datap + 11, strfalse, 5)) // char strfalse[] = "false";
        {
            GPIO_PinWrite(GPIO1, 3, 0U); // pull low
            PRINTF("\r\nfalse");
        }
    }
    mqtt_rec_data.mqttindex = 0;
}
The code uses strstr to search for "LEDstatus" in the received data to get the pointer position, then skips a fixed offset to check whether the LED status is true or false: if it is true, the LED is turned on; if it is false, the LED is turned off.

4 Test Result
This section gives the test results and video of the system. Before testing the voice function, MQTT.fx is first used to verify that the Baidu cloud connection, publishing, and subscription work; then the SLN-LOCAL2-IOT combined with the MIMXRT1060-EVK is tested for voice wake-up, recognition, and remote control.
To join a Wi-Fi hotspot with the SLN-LOCAL2-IOT, enter the command in the print terminal: setup AWS kerry123456

4.1 MQTT.fx test of the Baidu cloud connection
MQTT.fx is an Eclipse Paho-based MQTT client tool written in Java that supports subscribing and publishing messages through topics.
4.1.1 MQTT.fx configuration
Download and install the tool, then open it. First do the configuration by clicking "Edit connection": Pic 19
Profile name: connection name
Profile type: MQTT broker
Broker address: the broker address generated by the Baidu cloud, using port 1883 without transport encryption
Broker port: 1883, no encryption
Client ID: RT1060BTCDShadow. Note that this name must be the same as the cloud shadow name, otherwise the connection is not detected on the Baidu web page. If the Client ID matches the shadow name, the online page shows the connection as OK when MQTT.fx connects.
User credentials: the thing user name and password from the Baidu cloud.
After the configuration, click connect and refresh the website.
Before connection: Pic 20
After connection: Pic 21
4.1.2 MQTT.fx subscribe
What are the topics for publishing and subscribing? Open your thing shadow and select "interaction"; the page gives the corresponding topics: Pic 22
Subscribe topic: $baidu/iot/shadow/RT1060BTCDShadow/update/accepted
Publish topic: $baidu/iot/shadow/RT1060BTCDShadow/update
Pic 23
Click subscribe; it can now receive the data.
4.1.3 MQTT.fx publish
Publishing needs the topic: $baidu/iot/shadow/RT1060BTCDShadow/update
It also needs the content, which is the JSON data. Pic 24
Here we can use this JSON data:
{
  "reported" : {
    "LEDstatus" : true,
    "humid" : 88,
    "temp" : 11
  }
}
The JSON data can also be checked with this website: https://www.bejson.com/jsonviewernew/ Pic 25
Input the publish data and click the publish button: Pic 26
4.1.4 Publish data test result
Before publishing, clear the thing data on the website: Pic 27
Publish the data with MQTT.fx, then check the subscribed data and the website: Pic 28
The published data can be seen both on the website and in the MQTT.fx subscribe area. So far, the connection and data transfer tests pass.

4.2 Voice recognition and remote control test
This is the device connection picture: Pic 29
4.2.1 Voice recognition, local control
Pic 30
This is the SLN-LOCAL2-IOT print information after recognizing the voice WW and VC.
Red led on:
led cycle:
4.2.2 Voice recognition, remote control
The following tests wake-up + remote on and wake-up + remote off, and also gives the print result and the video. Pic 31
remote control:
View full article
The RT1176 has two cores. Normally, the CM7 core is the main core: at boot, CM7 boots first, then copies the CM4 image to RAM and kicks off the CM4 core. The details can be found in AN13264. In the RT1176, the CM4 has 128 KB ITCM and 128 KB DTCM. This space is not big; it is enough for the CM4 to do some auxiliary work, but sometimes customers need the CM4 to do more. For example, the CM7 may run an ML algorithm while the CM4 deals with USB/ENET and a camera, which needs more code space than 128 KB. In this case, the CM4 image should be moved to OCRAM1/2 or SDRAM. OCRAM and SDRAM are both connected to a NIC-301 AXI bus arbiter IP, so they have similar performance and characteristics. This article uses SDRAM because it is the more difficult case.
Before moving, you should know one important thing: the R/W speed of the CM4 accessing OCRAM/SDRAM is slow, because the CM4 requests data from SDRAM through the XB (LPSR domain, AHB protocol) and then through the NIC (WAKEUPMIX domain, AXI protocol), and the clock is limited to BUS / BUS_LPSR. If both code and data are placed in SDRAM, the performance is significantly reduced; SDRAM is accessible only via the system bus, so no Harvard-style parallel code/data access is possible. If other bus masters access the same memory, performance degrades even more due to arbitration (on the XB or NIC). So the whole memory space should be arranged carefully to eliminate access conflicts.

1 Prepare for the work
1.1 Test environment
• SDK: 2.9.1 for i.MX RT1170
• MCUXpresso: 11.4.0
• Example: SDK_root\boards\evkmimxrt1170\multicore_examples\hello_world
1.2 Set hardware and software
Set the board to XIP boot mode by setting SW1 to OFF OFF ON OFF. Import the hello_world example from the SDK and build the project.

2 Moving to SDRAM for the debug configuration
In the CM7 project, add the SDRAM space in properties->MCU settings->Memory detail.
Then in properties->settings->Multicore, change the CM4 master memory region to SDRAM.
Unlike other IDEs, MCUXpresso can wake up the M4 project thread and download the M4 image by itself; this is the so-called implicit approach. This field tells the IDE where to place the CM4 image.
In properties->settings->Preprocessor, add XIP_BOOT_HEADER_DCD_ENABLE. This adds a DCD to the image header; the DCD programs the SDRAM controller with optimal settings, improving the boot performance.
Next, switch to the CM4 project and add the SDRAM space to the memory table, just like in the CM7 project.
In properties->Managed Linker Script, it is better to place the heap and stack in DTCM. They can also be placed in SDRAM, but this should be done carefully.
After that, we can start debugging. You can see that the SDRAM has been filled with the CM4 image. Click the Resume button and the CM4 project will stop at the beginning of main().

3 Moving to SDRAM for the release configuration
To compile the project in release mode, CORE1_IMAGE_COPY_TO_RAM should be added to the defined symbols table. But that is not enough: the CM7 project of the SDK does not copy the CM4 image, so we must add this to the CM7 project.
3.1 Create a new file named incbin.S.
This is the code:
.section .core_m4slave , "ax" @progbits @preinit_array
.global dsp_text_image_start
.type dsp_text_image_start, %object
.align 2
dsp_text_image_start:
.incbin "evkmimxrt1170_hello_world_cm4.bin"
.global dsp_text_image_end
.type dsp_text_image_end, %object
dsp_text_image_end:
.global dsp_data_image_start
.type dsp_data_image_start, %object
.align 2
dsp_ncache_image_end:
.global dsp_text_image_size
.type dsp_text_image_size, %object
.align 2
dsp_text_image_size:
.int dsp_text_image_end - dsp_text_image_start

3.2 In hello_world_core0.c, add this code:
#ifdef CORE1_IMAGE_COPY_TO_RAM
extern const char dsp_text_image_start[];
extern int dsp_text_image_size;
#define CORE1_IMAGE_START ((uint32_t *)dsp_text_image_start)
#define CORE1_IMAGE_SIZE ((int32_t)dsp_text_image_size)
#endif

3.3 Right-click evkmimxrt1170_hello_world_cm4.axf in the Project Explorer window and select Binary Utilities->Create Binary. A binary file called evkmimxrt1170_hello_world_cm4.bin is created; copy it to the release folder of the CM7 project. If you want this to be done automatically, add the command to properties->settings->Build steps->Post-build steps.
Compile the project and download it. Press the reset button. After a while, you will see a small LED blinking; this LED is driven by the CM4.

4 Debug the CM4 project on SDRAM only
As we know, the CM4 can be debugged alone without starting the CM7, but normally that is in ITCM and DTCM. Can it also work in SDRAM? Yes, but the original debug script file does not support this function. I attached a new script file that initializes the SDRAM before downloading the CM4 code. Replace the old .scp file with this one; nothing else needs to be changed.
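Going back to step 3.2, the CM7 application also has to copy the embedded CM4 binary into SDRAM before the CM4 core is released. The following is only a minimal sketch of that copy, not the SDK's actual implementation: the destination macro CORE1_BOOT_ADDRESS and the helper name are assumptions for illustration, and the address must match the SDRAM region configured for the CM4 linker script.
#ifdef CORE1_IMAGE_COPY_TO_RAM
#include <string.h>   /* memcpy */

/* Hypothetical SDRAM destination for the CM4 image (RT1170 SDRAM starts at 0x80000000);
 * it must match the memory region selected for the CM4 project. */
#define CORE1_BOOT_ADDRESS 0x80000000U

static void copy_core1_image_to_ram(void)
{
    /* Copy the CM4 binary embedded by incbin.S from flash into SDRAM,
     * using the symbols declared in step 3.2. */
    memcpy((void *)CORE1_BOOT_ADDRESS, (const void *)CORE1_IMAGE_START, (size_t)CORE1_IMAGE_SIZE);
}
#endif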
View full article
When: TUESDAY, SEPTEMBER 14TH AT 11 AM EST
Click here to register today.

Topic of discussion
From consumer to industrial devices, a paradigm shift has already begun. Our everyday experiences with smartphones are driving the demand for higher performance, more connectivity, and an exceptional user experience as the cornerstones of the embedded products we use. But how can you make it easier to take your product to the next level? Join NXP and Crank Software to learn why the NXP i.MX RT1170 crossover MCU is the right embedded hardware to build on, how it can help lower development risks, and how developing engaging user experiences can easily become part of your development workflow.
During this session, you'll learn:
About optimizing power and performance with i.MX RT1170 crossover MCUs
Just how embedded GUI development can be a collaborative experience between development and design
How Storyboard's Rapid Design and Iteration technology embraces UI design changes during development
What integrated capabilities can help leverage the hardware's full potential
How easy it is to develop GUI apps via a live demo of Storyboard
View full article
RT600 MCUXpresso JLINK debug QSPI flash

1 Introduction
The MIMXRT600-EVK is the official NXP board; its onboard flash is an external octal flash connected to the RT685 FlexSPI port B. In practice, a customer board may use another flash type, e.g. QSPI flash, connected to the FlexSPI port A. NXP recently published an RT600 customer flash application note: https://www.nxp.com/docs/en/application-note/AN13386.pdf
That document mainly covers the CMSIS-DAP flash algorithm usage, modifying the option data to generate a new flash algorithm for the different flash types. Some customers use RT600 + QSPI flash + MCUXpresso + JLINK to debug their application code on their own boards. Recently, one customer found that on his own board, downloading code to the RT600 QSPI flash with a JLINK from MCUXpresso failed, while downloading with a CMSIS-DAP debugger and the related QSPI .cfx file worked. So this document shares how to use the RT600, MCUXpresso IDE, and JLINK to download and debug code located in external QSPI flash.

2 JLINK driver preparation and test
When MCUXpresso IDE downloads through a JLINK, it calls the JLINK driver's script and flash algorithm. For the RT600, the JLINK driver uses the RT600-EVK FlexSPI port B octal flash by default, so if the customer board uses a different FlexSPI port and a QSPI flash, the related QSPI flash algorithm and script file must be provided; otherwise, even if the ARM CM33 core can be found, the download will still fail. Customers who want to use MCUXpresso IDE with JLINK first need to make sure that the tools shipped with the JLINK driver can perform the external flash operations (erase, read, write) successfully. The following shows how to add the RT600 QSPI flash algorithm and script file to the JLINK driver tools.
2.1 JLINK driver install
Download the Segger JLINK driver from the following link: https://www.segger.com/downloads/jlink/JLink_Windows_V754b_x86_64.exe
This document uses JLINK v7.54b for testing; other versions are similar. Install the driver; the default install path is C:\Program Files\SEGGER.
2.2 Universal flashloader RT-UFL
RT-UFL v1.0 is a universal flashloader that uses one .FLM file for all i.MX RT chips and different external flashes; it is mainly used with the Segger JLINK debugger. RT-UFL v1.0 download link: https://github.com/JayHeng/RT-UFL/archive/refs/tags/v1.0.zip
Now apply the flash algorithm file patch for the RT600 QSPI:
Copy the files from \RT-UFL-1.0\algo\SEGGER\JLink_Vxxx to the JLINK install path \SEGGER\JLink.
Then copy the content of RT-UFL-master\test\SEGGER\JLink_Vxxx\Devices\NXP\iMXRT6xx\archive2\evkmimxrt685.JLinkScript and use it to replace the content of C:\Program Files\SEGGER\JLink\Devices\NXP\iMXRT_UFL\iMXRT6xx_CortexM33.JLinkScript
Otherwise, the MCUXpresso IDE debug reset button will not work; the JLinkScript code for ResetTarget, which resets the external flash, needs to be added. Pic 1
RT-UFL provides three download flash algorithms: MIMXRT600_UFL_L0, MIMXRT600_UFL_L1, and MIMXRT600_UFL_L2. Pic 2
_L0 is used for QSPI flash and octal flash (page size 256 bytes, sector size 4 KB); _L1/_L2 are used for hyper flash (page size 512 bytes, sector size 4 KB/64 KB). The JLinkDevices.xml content also gives the detailed information.
Different device names call different .FLM files. The .FLM is the flash algorithm file; its source code can be found in RT-UFL v1.0. It uses different option0/option1 values to configure the different external memories when the memory chip supports SFDP.
<Device>
  <ChipInfo Vendor="NXP" Name="MIMXRT600_UFL_L0" WorkRAMAddr="0x00000000" WorkRAMSize="0x00480000" Core="JLINK_CORE_CORTEX_M33" JLinkScriptFile="Devices/NXP/iMXRT_UFL/iMXRT6xx_CortexM33.JLinkScript" Aliases="MIMXRT633S; MIMXRT685S_M33"/>
  <FlashBankInfo Name="Octal Flash" BaseAddr="0x08000000" MaxSize="0x08000000" Loader="Devices/NXP/iMXRT_UFL/MIMXRT_FLEXSPI_UFL_256B_4KB.FLM" LoaderType="FLASH_ALGO_TYPE_OPEN" />
</Device>
<!------------------------>
<Device>
  <ChipInfo Vendor="NXP" Name="MIMXRT600_UFL_L1" WorkRAMAddr="0x00000000" WorkRAMSize="0x00480000" Core="JLINK_CORE_CORTEX_M33" JLinkScriptFile="Devices/NXP/iMXRT_UFL/iMXRT6xx_CortexM33.JLinkScript" Aliases="MIMXRT633S; MIMXRT685S_M33"/>
  <FlashBankInfo Name="Octal Flash" BaseAddr="0x08000000" MaxSize="0x08000000" Loader="Devices/NXP/iMXRT_UFL/MIMXRT_FLEXSPI_UFL_512B_4KB.FLM" LoaderType="FLASH_ALGO_TYPE_OPEN" />
</Device>
<!------------------------>
<Device>
  <ChipInfo Vendor="NXP" Name="MIMXRT600_UFL_L2" WorkRAMAddr="0x00000000" WorkRAMSize="0x00480000" Core="JLINK_CORE_CORTEX_M33" JLinkScriptFile="Devices/NXP/iMXRT_UFL/iMXRT6xx_CortexM33.JLinkScript" Aliases="MIMXRT633S; MIMXRT685S_M33"/>
  <FlashBankInfo Name="Octal Flash" BaseAddr="0x08000000" MaxSize="0x08000000" Loader="Devices/NXP/iMXRT_UFL/MIMXRT_FLEXSPI_UFL_512B_64KB.FLM" LoaderType="FLASH_ALGO_TYPE_OPEN" />
</Device>
2.3 JLINK Commander test
Please note that the device must be selected as MIMXRT600_UFL_L0 when using the QSPI flash.
Pic 3
Pic 4
Pic 5
We can see that JLINK Commander can read and erase the external QSPI flash.
2.4 JFlash test
Operation steps: Target->Connect->Production programming Pic 6
We can see that JFlash can also erase and program the RT600 external QSPI flash. Please note that not every JLINK supports JFlash; this document uses a Segger JLINK Plus.

3 MCUXpresso configuration and test
MCUXpresso: v11.4.0
SDK_2_10_0_EVK-MIMXRT685
Import an SDK project into MCUXpresso IDE, e.g. hello_world or led_output.
3.1 QSPI FCB configuration
The FCB is located at flash address 0x0800_0400 (offset 0x400 from the flash base) and is used for the FlexSPI NOR boot configuration; the detailed content of the FCB can be found in the RT600 user manual, Table 997 "FlexSPI flash configuration block". The configuration differs for different external flashes: to use a QSPI flash, the FCB should use the QSPI-related configuration and its own LUT table.
Modify flash_config.c and flash_config.h in the SDK project's flash_config folder. The LUT contains fast read, status read, write enable, sector erase, block erase, page program, and whole-chip erase. If the external QSPI flash commands are different, the LUT commands should be modified according to the related commands in the flash datasheet.
const flexspi_nor_config_t flexspi_config = {
    .memConfig =
        {
            .tag                  = FLASH_CONFIG_BLOCK_TAG,
            .version              = FLASH_CONFIG_BLOCK_VERSION,
            .readSampleClksrc     = kFlexSPIReadSampleClk_LoopbackInternally,
            .csHoldTime           = 3,
            .csSetupTime          = 3,
            .columnAddressWidth   = 0,
            .deviceModeCfgEnable  = 0,
            .deviceModeType       = 0,
            .waitTimeCfgCommands  = 0,
            .deviceModeSeq        = {.seqNum = 0, .seqId = 0,},
            .deviceModeArg        = 0,
            .configCmdEnable      = 0,
            .configModeType       = {0},
            .configCmdSeqs        = {0},
            .configCmdArgs        = {0},
            .controllerMiscOption = (0),
            .deviceType           = 1,
            .sflashPadType        = kSerialFlash_4Pads,
            .serialClkFreq        = kFlexSpiSerialClk_133MHz,
            .lutCustomSeqEnable   = 0,
            .sflashA1Size         = BOARD_FLASH_SIZE,
            .sflashA2Size         = 0,
            .sflashB1Size         = 0,
            .sflashB2Size         = 0,
            .csPadSettingOverride   = 0,
            .sclkPadSettingOverride = 0,
            .dataPadSettingOverride = 0,
            .dqsPadSettingOverride  = 0,
            .timeoutInMs          = 0,
            .commandInterval      = 0,
            .busyOffset           = 0,
            .busyBitPolarity      = 0,
            .lookupTable =
                {
#if 0
                    [0]  = 0x08180403,
                    [1]  = 0x00002404,
                    [4]  = 0x24040405,
                    [12] = 0x00000604,
                    [20] = 0x081804D8,
                    [36] = 0x08180402,
                    [37] = 0x00002080,
                    [44] = 0x00000460,
#endif
                    // Fast Read
                    [4*0+0] = FLEXSPI_LUT_SEQ(CMD_SDR , FLEXSPI_1PAD, 0xEB, RADDR_SDR, FLEXSPI_4PAD, 0x18),
                    [4*0+1] = FLEXSPI_LUT_SEQ(MODE4_SDR, FLEXSPI_4PAD, 0x00, DUMMY_SDR , FLEXSPI_4PAD, 0x09),
                    [4*0+2] = FLEXSPI_LUT_SEQ(READ_SDR , FLEXSPI_4PAD, 0x04, STOP_EXE , FLEXSPI_1PAD, 0x00),
                    // Read status
                    [4*1+0] = FLEXSPI_LUT_SEQ(CMD_SDR , FLEXSPI_1PAD, 0x05, READ_SDR, FLEXSPI_1PAD, 0x04),
                    // Write enable
                    [4*3+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x06, STOP_EXE, FLEXSPI_1PAD, 0),
                    // Sector erase
                    [4*5+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x20, RADDR_SDR, FLEXSPI_1PAD, 0x18),
                    // Block erase 64 Kbyte
                    [4*8+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0xD8, RADDR_SDR, FLEXSPI_1PAD, 0x18),
                    // Page program - single mode
                    [4*9+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x02, RADDR_SDR, FLEXSPI_1PAD, 0x18),
                    [4*9+1] = FLEXSPI_LUT_SEQ(WRITE_SDR, FLEXSPI_1PAD, 0x04, STOP_EXE, FLEXSPI_1PAD, 0x0),
                    // Erase whole chip
                    [4*11+0] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x60, STOP_EXE, FLEXSPI_1PAD, 0),
                },
        },
    .pageSize           = 0x100,
    .sectorSize         = 0x1000,
    .ipcmdSerialClkFreq = 1,
    .isUniformBlockSize = false,
    .blockSize          = 0x10000,
};
This code has been tested on the RT685 with the QSPI flash MT25QL128ABA1ESE; the code boots and works.
3.2 Debug configuration
Configure the JLINK options in the MCUXpresso IDE to use the JLINK driver's JLinkGDBServerCL.exe, under Windows->Preferences. Pic 7
Press debug to generate the .launch file. Pic 8
In Run->Debug Configurations, choose the device as MIMXRT600_UFL_L0; if the SWD wires are long and the connection is not stable, you can also define a fixed low speed. Pic 9
3.3 Download and debug test
Before downloading, check the RT685 ISP mode configuration. As this document uses 4-wire QSPI connected to the FlexSPI port A, the ISP boot mode should be FlexSPI boot from port A: ISP2 PIO1_17 low, ISP1 PIO1_16 high, ISP0 PIO1_15 high.
Click the debug button; the code enters debug mode and stops at the main function, and the code address is located in the FlexSPI remapped address range. Pic 10
Click run; the RT685 pin P0_26 toggles and the UART interface prints information. The application code is working.

4 External SPI flash operation checking
On a customer-designed board, we normally use the JLINK commands first to check whether the ARM core can be found and make sure the RT chip works, and then check whether the external flash can be operated.
4.1 SDK IAP flash code test
The SDK example can be used first to test whether the external flash can be operated; the SDK code path is:
SDK_2_10_0_EVK-MIMXRT685\boards\evkmimxrt685\driver_examples\iap\iap_flash
Then check the external flash and modify the code's option0 and option1 to match it. The option0 and option1 definitions can be found in the RT600 user manual, Table 1004 "Option0 definition" and Table 1005 "Option1 definition". Pic 11 Pic 12
For the external QSPI flash connected to the FlexSPI port A, the options can be modified as follows:
    option.option0.U = 0xC0000001; // EXAMPLE_NOR_FLASH
    option.option1.U = 0x00000000; // EXAMPLE_NOR_FLASH_OPTION1
Then burn the iap_flash project to the RT685 internal RAM and run it in the debugger. Pic 13
We can see that the external QSPI flash initialization, erase, read, and write all work, and the memory window also shows the correct data.
4.2 MCUBootUtility test
Put the chip into ISP mode, then use the MCUBootUtility tool to connect the RT685 and the QSPI flash and do an application code program and read test.
ISP mode: ISP2 high, ISP1 high, ISP0 low
Configure the FlexSPI NOR Device Configuration as QSPI; the template ISSI_IS25LPxxxA_IS25WPxxxA can be used. Pic 14
Click the "connect to ROM" button and check whether the external flash is recognized: Pic 15
After connection, the RT685 image attached to the tool can be downloaded:
NXP-MCUBootUtility-3.3.1\apps\NXP_MIMXRT685-EVK_Rev.E\led_blinky_0x08001000_fdcb.srec Pic 16
The connection, erase, program, and read all work, which also indicates that the RT685 with the external QSPI flash is working. Then you can go on to debug it with the IDE and debugger.
View full article
Introduction
The NXP i.MX RT106x has two USB 2.0 OTG instances, and the RT1060 EVK has both USB interfaces on the board. But the RT1060 SDK only has single-host USB examples. Although the RT1060 USB host stack supports multiple devices, a USB hub is still needed when the user wants to connect two devices. This article shows how to make both USB instances work as hosts.
The RT1060 SDK has single-host examples that support multiple devices, like host_hid_mouse_keyboard_bm, but this application does not use them. Instead, the MCUXpresso Config Tools are used to build the demo from the beginning. The config tool is a very powerful tool that can configure clocks, pins, and peripherals, especially USB; in this demo it saves about 95% of the coding work.
Hardware and software tools
RT1060 EVK
MCUXpresso 11.4.0
MIMXRT1060 SDK 2.9.1
Step 1
This project supports a USB HID mouse and USB CDC. First, create an empty project named MIMXRT1062_usb_host_dual_port. When selecting SDK components, select "USB host CDC" and "USB host HID" in the Middleware tab; the IDE selects the other necessary components automatically.
After creating the empty project, the clocks should be configured first. Both USB PHYs need a 480 MHz clock.
Step 2
The next step is to configure the USB host in the peripheral config tool. Due to a limitation of the config tool, only one host instance of the USB component is allowed. In this project, CDC VCOM is added first.
Step 3
After these settings, click "Update Code" in the control bar. This turns all the configurations into code and merges it into the project. Then click the "copy to clipboard" button; this copies the host task call. Paste it into the forever while loop in the project's main(). Besides that, a call to BOARD_InitBootPeripherals() needs to be added in main(). At this point, USB VCOM is ready. The tool not only copies the files and configures USB, but also creates the basic implementation framework. If you compile and download the project to the RT1060 EVK, it can enumerate a USB CDC VCOM device on USB1. If characters are sent from the CDC device, the project forwards them to the DAPLink UART port so that you can see them in a terminal on the computer.
Step 4
To get the USB HID mouse code, create another USB HID project. The workflow is similar to the first project. Here is the screenshot of the USB HID configuration.
Click "Update code" and the HID mouse code is generated. The config tool generates two files, usb_host_interface_0_hid_mouse.c and usb_host_interface_0_hid_mouse.h. Copy them to the "source" folder of the dual host project.
Step 5
The next step is to modify some USB macro definitions in usb_host_config.h:
#define USB_HOST_CONFIG_EHCI 2     /* there are two EHCI host instances */
#define USB_HOST_CONFIG_MAX_HOST 2 /* the USB driver can support two EHCI controllers */
#define USB_HOST_CONFIG_HID (1U)   /* for the mouse */
The next step is to merge usb_host_app.c. The project initializes the USB hardware and software in USB_HostApplicationInit().
usb_status_t USB_HostApplicationInit(void)
{
    usb_status_t status;

    USB_HostClockInit(kUSB_ControllerEhci0);
    USB_HostClockInit(kUSB_ControllerEhci1);

#if ((defined FSL_FEATURE_SOC_SYSMPU_COUNT) && (FSL_FEATURE_SOC_SYSMPU_COUNT))
    SYSMPU_Enable(SYSMPU, 0);
#endif /* FSL_FEATURE_SOC_SYSMPU_COUNT */

    status = USB_HostInit(kUSB_ControllerEhci0, &g_HostHandle[0], USB_HostEvent);
    status = USB_HostInit(kUSB_ControllerEhci1, &g_HostHandle[1], USB_HostEvent); /* each USB instance has its own g_HostHandle */
    if (status != kStatus_USB_Success)
    {
        return status;
    }
    else
    {
        USB_HostInterface0CicVcomInit();
        USB_HostInterface0HidMouseInit();
    }

    USB_HostIsrEnable();

    return status;
}
In USB_HostIsrEnable(), add code to enable the USB2 interrupt:
    irqNumber = usbHOSTEhciIrq[1];
    NVIC_SetPriority((IRQn_Type)irqNumber, USB_HOST_INTERRUPT_PRIORITY);
    EnableIRQ((IRQn_Type)irqNumber);
Then add and modify the USB interrupt handlers:
void USB_OTG1_IRQHandler(void)
{
    USB_HostEhciIsrFunction(g_HostHandle[0]);
}
void USB_OTG2_IRQHandler(void)
{
    USB_HostEhciIsrFunction(g_HostHandle[1]);
}
Since both USB instances share the USB stack, every USB event calls USB_HostEvent() in usb_host_app.c, so the HID code must also be merged into this function.
static usb_status_t USB_HostEvent(usb_device_handle deviceHandle,
                                  usb_host_configuration_handle configurationHandle,
                                  uint32_t eventCode)
{
    usb_status_t status1;
    usb_status_t status2;
    usb_status_t status = kStatus_USB_Success;
    /* Used to prevent from multiple processing of one interface;
     * e.g. when class/subclass/protocol is the same then one interface on a device is processed only by one interface on host */
    uint8_t processedInterfaces[USB_HOST_CONFIG_CONFIGURATION_MAX_INTERFACE] = {0};

    switch (eventCode & 0x0000FFFFU)
    {
        case kUSB_HostEventAttach:
            status1 = USB_HostInterface0CicVcomEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces);
            status2 = USB_HostInterface0HidMouseEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces);
            if ((status1 == kStatus_USB_NotSupported) && (status2 == kStatus_USB_NotSupported))
            {
                status = kStatus_USB_NotSupported;
            }
            break;
        case kUSB_HostEventNotSupported:
            usb_echo("Device not supported.\r\n");
            break;
        case kUSB_HostEventEnumerationDone:
            status1 = USB_HostInterface0CicVcomEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces);
            status2 = USB_HostInterface0HidMouseEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces);
            if ((status1 != kStatus_USB_Success) && (status2 != kStatus_USB_Success))
            {
                status = kStatus_USB_Error;
            }
            break;
        case kUSB_HostEventDetach:
            status1 = USB_HostInterface0CicVcomEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces);
            status2 = USB_HostInterface0HidMouseEvent(deviceHandle, configurationHandle, eventCode, processedInterfaces);
            if ((status1 != kStatus_USB_Success) && (status2 != kStatus_USB_Success))
            {
                status = kStatus_USB_Error;
            }
            break;
        case kUSB_HostEventEnumerationFail:
            usb_echo("Enumeration failed\r\n");
            break;
        default:
            break;
    }
    return status;
}
USB_HostTasks() deals with all the USB messages in the main loop, so the HID work is also added to this function at last:
void USB_HostTasks(void)
{
    USB_HostTaskFn(g_HostHandle[0]);
    USB_HostTaskFn(g_HostHandle[1]);
    USB_HostInterface0CicVcomTask();
    USB_HostInterface0HidMouseTask();
}
After all these steps, the dual USB host function is ready. The user can insert a USB mouse and a USB CDC device into the two USB ports simultaneously.
Conclusion
All RT/LPC/Kinetis devices with two OTG or host instances can support dual USB host. With the help of the MCUXpresso Config Tools, it is easy to implement this function.
View full article
Obtaining the footprint for Kinetis/LPC/i.MXRT part numbers is very straightforward using the Microcontroller Symbols, Footprints and Models Library homepage, at the following link: https://www.nxp.com/design/software/models/microcontroller-symbols-footprints-and-models:MCUCAD?tid=vanMCUCAD What some users may not be aware of is that the BXL file available for NXP Kinetis/LPC/i.MXRT part numbers also contains the 3D model of the package, which is often needed when working on the industrial design of your application. You may follow the steps below to export the 3D model of the package in STEP (Standard for the Exchange of Product Data) format using the Ultra Librarian software, which can be downloaded from the link on the models library homepage. A STEP (.step, .stp) file stores the model in ASCII format. This format can be imported into many CAD suites that can work with 3D solids. First, obtain the BXL file for the part number you are interested in, in this example MIMXRT1052CVL5B.bxl. Then, open the Ultra Librarian project, load this file using the "Load Data" button, and select the "3D Step Model" checkbox from the Select Tools options. Finally, select the Export to Select Tools option. Once the exporting process is finished, the STEP file will be available in the path UltraLibrarian/Library/Exported. The STEP (.stp) file can be opened in CAD suites that support solid 3D objects, like FreeCAD, which is open source.
View full article
The i.MX RT1060 provides tightly coupled GPIOs that can be accessed at high frequency. The RT1060 provides two sets of GPIO registers to control the pad output: GPIO1 to GPIO3 are the general GPIOs, and GPIO6 to GPIO8 are the tightly coupled (fast) GPIOs. They share the same pads, which means a GPIO pin can be routed either to GPIO1/2/3 or to GPIO6/7/8. The registers IOMUXC_GPR_GPR26, IOMUXC_GPR_GPR27, and IOMUXC_GPR_GPR28 select which GPIO module drives each pin. To select between GPIO1/2/3 and GPIO6/7/8 you can use the MCUXpresso Config Tools. For example, if you select pin G10 you can choose either GPIO1_IO11 for the normal GPIO or GPIO6_IO11 for the fast GPIO.
I made an example based on SDK v2.7.0 to compare the speed of the normal GPIO and the fast GPIO. For this, I used pin G10 (GPIO1_IO11 and GPIO6_IO11). First, I used the normal GPIO pin (GPIO1_IO11) and toggled the pin by writing directly to the GPIO_DR register. Notice that you can access this pin through J22 pin 3 on the evaluation board, so you can measure the performance of the pin. Here are the results:
With the normal GPIO pin, we reach a period of 160 ns when writing directly to the GPIO_DR register. Now, if we change to the fast GPIO and use the same instructions, we get the following results. As you can see, when using the fast GPIO pin the period of the signal is almost one-third of the period of the normal GPIO.
In addition, the A1 silicon of the RT1060 has a new GPIO toggle feature. If we toggle the pin with the new DR_TOGGLE register instead of GPIO_DR, we get better performance with both pins, normal and fast GPIO. Here are the results of the normal GPIO with the DR_TOGGLE register. As you can see, when using the DR_TOGGLE register with the normal GPIO pin we get a period of around 53 ns, while when writing to the GPIO_DR register we got 160 ns. Using the DR_TOGGLE register with the fast GPIO gives the best performance of the pin. Results are shown below.
Many thanks to @jorge_a_vazquez for his valuable help with this document.
Hope this helps!
Best regards, Victor.
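As a reference, this is a minimal sketch of what the pin selection and the two toggle methods can look like in code. It is not part of the original example: the register names (IOMUXC_GPR->GPR26, GPIO6->DR, GPIO6->DR_TOGGLE) are taken from the MIMXRT1062 device header and the bit position assumes pin index 11, so please check them against your SDK version.
/* Route the shared pad (pin index 11, used as G10 in this article) to the fast GPIO6 module.
 * Each bit of IOMUXC_GPR_GPR26 selects GPIO1 (0) or GPIO6 (1) for the corresponding pin. */
IOMUXC_GPR->GPR26 |= (1U << 11);

/* Slower toggle: read-modify-write of the data register. */
GPIO6->DR ^= (1U << 11);

/* Faster toggle on A1 silicon: a single write to the DR_TOGGLE register flips the pin. */
GPIO6->DR_TOGGLE = (1U << 11);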
View full article
As we know, the RT series MCUs support XIP (execute in place) mode, which benefits from a low pin count, and serial NOR flash is most commonly used for it, as the FlexSPI module can efficiently fetch code and data from the serial NOR flash for the Cortex-M7 to execute. The fetching is implemented with the Quad IO Fast Read command, and the serial NOR flash works in SDR (single data transfer rate) mode: it receives data on the SCLK rising edge and transmits data on the SCLK falling edge. Compared with SDR mode, DDR (dual data transfer rate) mode has a higher throughput capacity. Can it provide better XIP performance, and what do we have to do to make the serial NOR flash work in DDR mode?

SDR & DDR mode
SDR mode: In SDR (single data transfer rate) mode, data is clocked on only one edge of the clock (either the rising or the falling edge), so one data bit per line is transferred per clock cycle; transmitting data at X Mbps therefore needs 2X clock edges per second.
DDR mode: In DDR (dual data transfer rate) mode, also known as DTR (dual transfer rate) mode, data is transferred on both the rising and the falling edge of the clock, so transmitting data at X Mbps only needs X clock edges per second, which doubles the bandwidth at the same clock frequency (as Fig 1 shows). Fig 1

Enable DDR mode
The steps below illustrate how to make the i.MX RT1060 boot from the QSPI flash working in DDR mode. Note: the board is the MIMXRT1060-EVK and the IDE is MCUXpresso IDE.
Open a hello_world project as the template and modify the FDCB (flash device configuration block):
a) Set the controllerMiscOption parameter to support the DDR read command.
b) Set the serial flash frequency to 60 MHz.
c) Translate the DDR read command into a command sequence. The following table shows a template command sequence for the DDR Quad IO FAST READ instruction; it almost matches the FRQDTR (Fast Read Quad IO DTR) sequence of the IS25WP064 (as Fig 2 shows). Fig 2 FRQDTR Sequence
d) Adjust the dummy cycles. The dummy cycles must match the specific serial clock frequency; the default number of dummy cycles of the FRQDTR command is 6 (as the table below shows). However, when the serial clock frequency is 60 MHz, the dummy cycles should change to 4 (as the table below shows). So the [P6:P3] bits of the Read Register (as the table below shows) need to be configured by manually adding the SET READ PARAMETERS command sequence (as Fig 3 shows) to the FDCB. Fig 3 SET READ PARAMETERS command sequence
Furthermore, in DDR mode the SCLK cycle is double the serial root clock cycle. The operand value should be set to 2*N, 2*N-1, or 2*N+1, depending on how the dummy cycles are defined in the device datasheet. In the end, we get an adjusted FDCB as shown below.
// Set Dummy Cycles
#define FLASH_DUMMY_CYCLES 8
// Set Read register command sequence's Index in LUT table
#define CMD_LUT_SEQ_IDX_SET_READ_PARAM 7
// Read, Read Status, Write Enable command sequences' Index in LUT table
#define CMD_LUT_SEQ_IDX_READ 0
#define CMD_LUT_SEQ_IDX_READSTATUS 1
#define CMD_LUT_SEQ_IDX_WRITEENABLE 3

const flexspi_nor_config_t qspiflash_config = {
    .memConfig =
        {
            .tag               = FLEXSPI_CFG_BLK_TAG,
            .version           = FLEXSPI_CFG_BLK_VERSION,
            .readSampleClksrc  = kFlexSPIReadSampleClk_LoopbackFromDqsPad,
            .csHoldTime        = 3u,
            .csSetupTime       = 3u,
            // Enable DDR mode
            .controllerMiscOption = kFlexSpiMiscOffset_DdrModeEnable | kFlexSpiMiscOffset_SafeConfigFreqEnable,
            .sflashPadType     = kSerialFlash_4Pads,
            //.serialClkFreq   = kFlexSpiSerialClk_100MHz,
            .serialClkFreq     = kFlexSpiSerialClk_60MHz,
            .sflashA1Size      = 8u * 1024u * 1024u,
            // Enable Flash register configuration
            .configCmdEnable   = 1u,
            .configModeType[0] = kDeviceConfigCmdType_Generic,
            .configCmdSeqs[0] =
                {
                    .seqNum   = 1,
                    .seqId    = CMD_LUT_SEQ_IDX_SET_READ_PARAM,
                    .reserved = 0,
                },
            .lookupTable =
                {
                    // Read LUTs
                    [4*CMD_LUT_SEQ_IDX_READ] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0xED, RADDR_DDR, FLEXSPI_4PAD, 0x18),
                    // The MODE8_DDR subsequence costs 2 cycles that are part of the whole dummy cycles
                    [4*CMD_LUT_SEQ_IDX_READ + 1] = FLEXSPI_LUT_SEQ(MODE8_DDR, FLEXSPI_4PAD, 0x00, DUMMY_DDR, FLEXSPI_4PAD, FLASH_DUMMY_CYCLES-2),
                    [4*CMD_LUT_SEQ_IDX_READ + 2] = FLEXSPI_LUT_SEQ(READ_DDR, FLEXSPI_4PAD, 0x04, STOP, FLEXSPI_1PAD, 0x00),
                    // READ STATUS REGISTER
                    [4*CMD_LUT_SEQ_IDX_READSTATUS] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x05, READ_SDR, FLEXSPI_1PAD, 0x01),
                    [4*CMD_LUT_SEQ_IDX_READSTATUS + 1] = FLEXSPI_LUT_SEQ(STOP, FLEXSPI_1PAD, 0x00, 0, 0, 0),
                    // WRITE ENABLE
                    [4*CMD_LUT_SEQ_IDX_WRITEENABLE] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x06, STOP, FLEXSPI_1PAD, 0x00),
                    // Set Read register
                    [4*CMD_LUT_SEQ_IDX_SET_READ_PARAM] = FLEXSPI_LUT_SEQ(CMD_SDR, FLEXSPI_1PAD, 0x63, WRITE_SDR, FLEXSPI_1PAD, 0x01),
                    [4*CMD_LUT_SEQ_IDX_SET_READ_PARAM + 1] = FLEXSPI_LUT_SEQ(STOP, FLEXSPI_1PAD, 0x00, 0, 0, 0),
                },
        },
    .pageSize           = 256u,
    .sectorSize         = 4u * 1024u,
    .ipcmdSerialClkFreq = 1,
    .isUniformBlockSize = false,
    .blockSize          = 64u * 1024u,
};

Is DDR mode really better?
According to the RT1060 datasheet, the table below gives the maximum FlexSPI operating frequencies. As the MIMXRT1060-EVK onboard QSPI flash is an IS25WP064AJBLE, which does not provide a DQS (read strobe) pin, setting MCR0.RXCLKSRC = 1 (internal dummy read strobe, looped back from the DQS pad) is the most optimized usable option.

operation mode   RXCLKsrc=0   RXCLKsrc=1   RXCLKsrc=3
SDR              60 MHz       133 MHz      166 MHz
DDR              30 MHz       66 MHz       166 MHz

In other words, the QSPI flash can run at up to 133 MHz in SDR mode versus 66 MHz in DDR mode. From the perspective of throughput capacity, they are almost the same, so it seems DDR mode is not the better option for the IS25WP064AJBLE; the following experiment will validate this assumption.
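Before running the experiment, a quick back-of-the-envelope calculation makes the "almost the same" statement concrete. Using the quad (4-bit) data path and the maximum frequencies from the table above, and ignoring command/address/dummy overhead:
SDR at 133 MHz: 4 bits per clock cycle          -> 133 MHz x 4     = 532 Mbit/s (about 66.5 MB/s)
DDR at  66 MHz: 4 bits per edge, 2 edges/cycle  -> 66 MHz x 4 x 2  = 528 Mbit/s (about 66.0 MB/s)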
FlexSPI clock source is 3, FlexSPI Div is 6, PllPfd2Clk is 720000000 mbedTLS version 2.16.6 fsys=600000000 Using following implementations: SHA: DCP HW accelerated AES: DCP HW accelerated AES GCM: Software implementation DES: Software implementation Asymmetric cryptography: Software implementation MD5 : 18139.63 KB/s, 27.10 cycles/byte SHA-1 : 44495.64 KB/s, 12.52 cycles/byte SHA-256 : 47766.54 KB/s, 11.61 cycles/byte SHA-512 : 2190.11 KB/s, 267.88 cycles/byte 3DES : 1263.01 KB/s, 462.49 cycles/byte DES : 2962.18 KB/s, 196.33 cycles/byte AES-CBC-128 : 52883.94 KB/s, 10.45 cycles/byte AES-GCM-128 : 1755.38 KB/s, 329.33 cycles/byte AES-CCM-128 : 2081.99 KB/s, 279.72 cycles/byte CTR_DRBG (NOPR) : 5897.16 KB/s, 98.15 cycles/byte CTR_DRBG (PR) : 4489.58 KB/s, 129.72 cycles/byte HMAC_DRBG SHA-1 (NOPR) : 1297.53 KB/s, 448.03 cycles/byte HMAC_DRBG SHA-1 (PR) : 1205.51 KB/s, 486.04 cycles/byte HMAC_DRBG SHA-256 (NOPR) : 1786.18 KB/s, 327.70 cycles/byte HMAC_DRBG SHA-256 (PR) : 1779.52 KB/s, 328.93 cycles/byte RSA-1024 : 202.33 public/s RSA-1024 : 7.00 private/s DHE-2048 : 0.40 handshake/s DH-2048 : 0.40 handshake/s ECDSA-secp256r1 : 9.00 sign/s ECDSA-secp256r1 : 4.67 verify/s ECDHE-secp256r1 : 5.00 handshake/s ECDH-secp256r1 : 9.33 handshake/s   DDR Mode run at 66 MHz. FlexSPI clock source is 2, FlexSPI Div is 5, PllPfd2Clk is 396000000 mbedTLS version 2.16.6 fsys=600000000 Using following implementations: SHA: DCP HW accelerated AES: DCP HW accelerated AES GCM: Software implementation DES: Software implementation Asymmetric cryptography: Software implementation MD5 : 16047.13 KB/s, 27.12 cycles/byte SHA-1 : 44504.08 KB/s, 12.54 cycles/byte SHA-256 : 47742.88 KB/s, 11.62 cycles/byte SHA-512 : 2187.57 KB/s, 267.18 cycles/byte 3DES : 1262.66 KB/s, 462.59 cycles/byte DES : 2786.81 KB/s, 196.44 cycles/byte AES-CBC-128 : 52807.92 KB/s, 10.47 cycles/byte AES-GCM-128 : 1311.15 KB/s, 446.53 cycles/byte AES-CCM-128 : 2088.84 KB/s, 281.08 cycles/byte CTR_DRBG (NOPR) : 5966.92 KB/s, 97.55 cycles/byte CTR_DRBG (PR) : 4413.15 KB/s, 130.42 cycles/byte HMAC_DRBG SHA-1 (NOPR) : 1291.64 KB/s, 449.47 cycles/byte HMAC_DRBG SHA-1 (PR) : 1202.41 KB/s, 487.05 cycles/byte HMAC_DRBG SHA-256 (NOPR) : 1748.38 KB/s, 328.16 cycles/byte HMAC_DRBG SHA-256 (PR) : 1691.74 KB/s, 329.78 cycles/byte RSA-1024 : 201.67 public/s RSA-1024 : 7.00 private/s DHE-2048 : 0.40 handshake/s DH-2048 : 0.40 handshake/s ECDSA-secp256r1 : 8.67 sign/s ECDSA-secp256r1 : 4.67 verify/s ECDHE-secp256r1 : 4.67 handshake/s ECDH-secp256r1 : 9.00 handshake/s   Fig 4 Performance comparison We can find that most of the implementation items are achieve the worst performance when QSPI works in DDR mode with 66 MHz. Coremark demo The second demo is running the Coremark demo under the above three conditions and the result is illustrated below. SDR Mode run at 100 MHz. FlexSPI clock source is 3, FlexSPI Div is 6, PLL3 PFD0 is 720000000 2K performance run parameters for coremark. CoreMark Size : 666 Total ticks : 391889200 Total time (secs): 16.328717 Iterations/Sec : 2449.671999 Iterations : 40000 Compiler version : MCUXpresso IDE v11.3.1 Compiler flags : Optimization most (-O3) Memory location : STACK seedcrc : 0xe9f5 [0]crclist : 0xe714 [0]crcmatrix : 0x1fd7 [0]crcstate : 0x8e3a [0]crcfinal : 0x25b5 Correct operation validated. See readme.txt for run and reporting rules. CoreMark 1.0 : 2449.671999 / MCUXpresso IDE v11.3.1 Optimization most (-O3) / STACK   SDR Mode run at 133 MHz. 
FlexSPI clock source is 3, FlexSPI Div is 4, PLL3 PFD0 is 664615368 2K performance run parameters for coremark. CoreMark Size : 666 Total ticks : 391888682 Total time (secs): 16.328695 Iterations/Sec : 2449.675237 Iterations : 40000 Compiler version : MCUXpresso IDE v11.3.1 Compiler flags : Optimization most (-O3) Memory location : STACK seedcrc : 0xe9f5 [0]crclist : 0xe714 [0]crcmatrix : 0x1fd7 [0]crcstate : 0x8e3a [0]crcfinal : 0x25b5 Correct operation validated. See readme.txt for run and reporting rules. CoreMark 1.0 : 2449.675237 / MCUXpresso IDE v11.3.1 Optimization most (-O3) / STACK   DDR Mode run at 66 MHz. FlexSPI clock source is 2, FlexSPI Div is 5, PLL3 PFD0 is 396000000 2K performance run parameters for coremark. CoreMark Size : 666 Total ticks : 391890772 Total time (secs): 16.328782 Iterations/Sec : 2449.662173 Iterations : 40000 Compiler version : MCUXpresso IDE v11.3.1 Compiler flags : Optimization most (-O3) Memory location : STACK seedcrc : 0xe9f5 [0]crclist : 0xe714 [0]crcmatrix : 0x1fd7 [0]crcstate : 0x8e3a [0]crcfinal : 0x25b5 Correct operation validated. See readme.txt for run and reporting rules. CoreMark 1.0 : 2449.662173 / MCUXpresso IDE v11.3.1 Optimization most (-O3) / STACK   After comparing the CoreMark scores, it gets the lowest CoreMark score when QSPI works in DDR mode with 66 MHz. However, they're actually pretty close. Through the above two testings, we can get the DDR mode maybe not a better option, at least for the i.MX RT10xx series MCU.
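The benchmark results also match a quick back-of-the-envelope bandwidth estimate (it ignores command and dummy-cycle overhead), which explains why the two configurations are so close:
SDR @ 133 MHz: 133 MHz x 1 bit per clock x 4 data pads = 532 Mbit/s ≈ 66.5 MB/s
DDR @ 66 MHz: 66 MHz x 2 bits per clock x 4 data pads = 528 Mbit/s ≈ 66 MB/s
So for a Quad SPI flash that is limited to 66 MHz in DDR mode, DDR only restores roughly the same raw read bandwidth as SDR at 133 MHz; it does not add any headroom.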
View full article
Introduction
Typical Cortex-M core-based MCUs have built-in parallel NOR flash that hangs directly on the Cortex-M core's high-performance AHB bus. If a well-known IDE supports the MCU, it integrates the corresponding flash driver algorithm, which enables the developer to program and debug the MCU in the IDE. However, the i.MX RT series MCUs don't contain internal flash, so how do developers debug these MCUs with online XIP (eXecute-In-Place)? Don't worry: the i.MX RT supports XIP from both external parallel NOR and serial NOR flash. Because it saves pins, serial NOR flash is the most commonly used, and the FlexSPI XIP feature makes online debugging available. This article introduces the mechanism of debugging from external serial NOR flash with an RT MCU and illustrates the steps of modifying an MCUXpresso flash driver algorithm.
CoreSight technology
The i.MX RT series MCUs are based on the Cortex-M core. CoreSight is the debugging architecture launched by Arm in 2004; it is part of the core licensing and provides the debug and trace features for Cortex-M core-based MCUs. CoreSight is very powerful and contains many debugging components (i.e. various protocols). The following figure, taken from the CoreSight Technical Introduction Manual, shows the connections between the various debugging components of the CoreSight architecture.
Fig 1 CoreSight Technical
This article does not aim to introduce CoreSight in detail. For our purposes, we only need to know that CoreSight is in charge of the main debugging work and that it can access system memory and peripheral registers on the AMBA bus in real time through the DAP component, which of course includes the code in the external serial flash.
FlexSPI module
To debug code in serial flash, the code must run XIP from the serial flash; that is, the CPU must be able to fetch instructions and data from any address of the serial flash in real time. The serial flash mentioned in this article generally refers to 4-wire SPI interface NOR flash, and the SPI mode can be Single/Dual/Quad/Octal. Whichever SPI mode is used, the flash is essentially serial: the address and data lines are not only shared but also serial. According to conventional wisdom, to implement XIP the flash should have a parallel bus interface hung on AMBA; furthermore, this parallel bus should have independent address and data lines, with the width of the address bus corresponding to the size of the flash. So why can the i.MX RT run XIP from serial flash? The answer is the FlexSPI peripheral. Figure 2 is the FlexSPI module block diagram. The right side of the block diagram shows the signal connection between FlexSPI and the external serial flash; the left side shows the connection between FlexSPI and the internal buses of the i.MX RT system. There are two bus interfaces: a 32-bit IPS bus (software manipulates the FlexSPI registers to send flash read and write commands) and a 64-bit AHB bus (FlexSPI translates the AHB access address and automatically sends the corresponding flash read and write commands); the latter is the key feature that makes XIP possible.
Fig 2 FlexSPI module
The Reference Manual lists detailed information about the AHB bus:
- AHB RX Buffer implemented to reduce read latency.
Total AHB RX Buffer size: 128 * 64 Bits - 16 AHB masters supported with priority for reading access - 4 flexible and configurable buffers in AHB RX Buffer - AHB TX Buffer implemented to buffer all write data from one AHB burst. AHB TX Buffer size: 8 * 64 Bits - All AHB masters share this AHB TX Buffer. No AHB master number limitation for Write Access. In addition, the AHB bus includes the below-enhanced features to optimize the reading of Serial Flash memory. - Cachable and Non-Cachable access - Prefetch Enable/Disable - Burst size: 8/16/32/64 bits - All burst type: SINGLE/INCR/WRAP4/INCR4/WRAP8/INCR8/WRAP16/INCR16 Debugging process of serial Flash Fig 3 illustrates the debugging process of serial Flash with the RT series MCU and in basic, the overview of the debugging process is not complicated. When you click IDE debugging icon, the Flash driver algorithm (executable file) pre-installed in the IDE will be downloaded to the internal FlexRAM of i.MXRT via the debugger firstly. The Flash driver algorithm provides FlexSPI initialization, erase and programming APIs, etc. Next, the debugger caches the application code (binary machine code) in FlexRAM in segments prior to calling the Flash programming API to implement the program work. After completing programming application code (from FlexRAM to Flash), CoreSight will take over the debugging work. At this time, the CPU can access the serial Flash that connects the FlexSPI module through the AHB bus, in another word, CoreSight can control and track code in real-time, and single-step debugging is available too in the IDE. Fig 3 Flash Driver of MCUXpresso IDE The latest version (11.3.1) of MCUXpresso IDE supports all RT series MCU (as the following figure shows), the developer should select a suitable flash driver file to apply to his board (Fig 5). Fig 4 MCUXpresso IDE Fig 5 Flash driver files As mentioned above, the RT series MCUs don't have an internal flash, so they must use an either external parallel or serial NOR. For IDE providers, it's too hard to provide enough flash drivers to fit all external NOR flashes, the workload is huge, so IDEs general provide the flash driver files for mainstream Serial NOR, especially, 4-wire SPI Interface NOR Flash, it means we need to modify or tune the flash driver to fit our specific application. Add new flash driver of MCUXpresso IDE Before start, we should realize that MCUXpresso IDE is different from MDK/IAR. The flash driver algorithms of MDK and IAR are independent of the specific debug tools and they are able to use with all supported debug tools (JLink/DAPLink, etc). For MCUXpresso IDE, the flash driver algorithms are only able to use with the CMSIS-DAP type debug tool. For instance, when you use JLink with MCUXPresso IDE, it will use the flash driver algorithm of Jlink instead of its own. There's a real case from a customer: He currently designs his new card reader module based on RT1024 and he plans to make a board without external RAM and Flash. In other words, he only utilizes the internal 4MB flash and 256KB FlexRAM which consist of SRAM_DTC(64KB), SRAM_ITC(64KB), SRAM_OC(128KB). So he wants to configure the 256KB RAM area as normal 256KB RAM without being allocated to ITCM and DTCM. He follows the thread to reconfigure the FlexRAM, but he still encounters the below problem (as Fig 6 shows ) when entering debug mode. 
Fig 6 According to the debug failure log, we can come to a conclusion that the flash drive file: MIMXRT1020.cfx needs to be updated, and the following steps illustrate how to do it. a) Select a source project There are some flash driver projects in the Examples/Flashdrivers/NXP subdirectory within the MCUXpresso IDE installation directory (as Fig 7 shows) and iMXRT folder contains some flash driver projects for external flash parts that work with the RT series MCU (as Fig 8 shows). Fig 7 Fig 8 Select the flash driver project which is the closest to the target as a prototype, in this case, we select the iMXRT1020_QSPI project, extract the project file and import them in the MCUXpresso IDE (as Fig 9). Fig 9 b) Modify pin assignment The RT1024 integrates a 4 MB QSPI flash as an "internal flash", it is connected to different FlexSPI pins versus to the default pins of the iMXRT1020_QSPI project just as the below table shows. FlexSPI pin RT1020 RT1024 FLEXSPI_A_DQS GPIO_SD_B1_05 GPIO_SD_B1_05 FLEXSPI_A_SS0_B GPIO_SD_B1_11 GPIO_AD_B1_05 FLEXSPI_A_SCLK GPIO_SD_B1_07 GPIO_AD_B1_01 FLEXSPI_A_DATA0 GPIO_SD_B1_08 GPIO_AD_B1_02 FLEXSPI_A_DATA1 GPIO_SD_B1_10 GPIO_AD_B1_04 FLEXSPI_A_DATA2 GPIO_SD_B1_09 GPIO_AD_B1_03 FLEXSPI_A_DATA3 GPIO_SD_B1_06 GPIO_AD_B1_00 So it needs to adjust the pin initialization in the BOARD_InitPins() function in pin_mux.c. /* FUNCTION ************************************************************************************************************ * * Function Name : BOARD_InitPins * Description : Configures pin routing and optionally pin electrical features. * * END ****************************************************************************************************************/ void BOARD_InitPins(void) { CLOCK_EnableClock(kCLOCK_Iomuxc); /* iomuxc clock (iomuxc_clk_enable): 0x03u */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B0_06_LPUART1_TX, /* GPIO_AD_B0_06 is configured as LPUART1_TX */ 0U); /* Software Input On Field: Input Path is determined by functionality */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B0_07_LPUART1_RX, /* GPIO_AD_B0_07 is configured as LPUART1_RX */ 0U); /* Software Input On Field: Input Path is determined by functionality */ IOMUXC_SetPinMux( IOMUXC_GPIO_SD_B1_05_FLEXSPI_A_DQS, /* GPIO_SD_B1_05 is configured as FLEXSPI_A_DQS */ 1U); /* Software Input On Field: Force input path of pad GPIO_SD_B1_05 */ // IOMUXC_SetPinMux( // IOMUXC_GPIO_SD_B1_06_FLEXSPI_A_DATA03, /* GPIO_SD_B1_06 is configured as FLEXSPI_A_DATA03 */ // 1U); /* Software Input On Field: Force input path of pad GPIO_SD_B1_06 */ // IOMUXC_SetPinMux( // IOMUXC_GPIO_SD_B1_07_FLEXSPI_A_SCLK, /* GPIO_SD_B1_07 is configured as FLEXSPI_A_SCLK */ // 1U); /* Software Input On Field: Force input path of pad GPIO_SD_B1_07 */ // IOMUXC_SetPinMux( // IOMUXC_GPIO_SD_B1_08_FLEXSPI_A_DATA00, /* GPIO_SD_B1_08 is configured as FLEXSPI_A_DATA00 */ // 1U); /* Software Input On Field: Force input path of pad GPIO_SD_B1_08 */ // IOMUXC_SetPinMux( // IOMUXC_GPIO_SD_B1_09_FLEXSPI_A_DATA02, /* GPIO_SD_B1_09 is configured as FLEXSPI_A_DATA02 */ // 1U); /* Software Input On Field: Force input path of pad GPIO_SD_B1_09 */ // IOMUXC_SetPinMux( // IOMUXC_GPIO_SD_B1_10_FLEXSPI_A_DATA01, /* GPIO_SD_B1_10 is configured as FLEXSPI_A_DATA01 */ // 1U); /* Software Input On Field: Force input path of pad GPIO_SD_B1_10 */ // IOMUXC_SetPinMux( // IOMUXC_GPIO_SD_B1_11_FLEXSPI_A_SS0_B, /* GPIO_SD_B1_11 is configured as FLEXSPI_A_SS0_B */ // 1U); /* Software Input On Field: Force input path of pad GPIO_SD_B1_11 */ IOMUXC_SetPinMux( 
IOMUXC_GPIO_AD_B1_00_FLEXSPI_A_DATA03, /* GPIO_AD_B1_00 is configured as FLEXSPI_A_DATA03 */ 1U); /* Software Input On Field: Force input path of pad GPIO_AD_B1_00 */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B1_01_FLEXSPI_A_SCLK, /* GPIO_AD_B1_01 is configured as FLEXSPI_A_SCLK */ 1U); /* Software Input On Field: Force input path of pad GPIO_AD_B1_01 */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B1_02_FLEXSPI_A_DATA00, /* GPIO_AD_B1_02 is configured as FLEXSPI_A_DATA00 */ 1U); /* Software Input On Field: Force input path of pad GPIO_AD_B1_02 */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B1_03_FLEXSPI_A_DATA02, /* GPIO_AD_B1_03 is configured as FLEXSPI_A_DATA02 */ 1U); /* Software Input On Field: Force input path of pad GPIO_AD_B1_03 */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B1_04_FLEXSPI_A_DATA01, /* GPIO_AD_B1_04 is configured as FLEXSPI_A_DATA01 */ 1U); /* Software Input On Field: Force input path of pad GPIO_AD_B1_04 */ IOMUXC_SetPinMux( IOMUXC_GPIO_AD_B1_05_FLEXSPI_A_SS0_B, /* GPIO_AD_B1_05 is configured as FLEXSPI_A_SS0_B */ 1U); /* Software Input On Field: Force input path of pad GPIO_AD_B1_05 */ IOMUXC_SetPinConfig( IOMUXC_GPIO_AD_B0_06_LPUART1_TX, /* GPIO_AD_B0_06 PAD functional properties : */ 0x10B0u); /* Slew Rate Field: Slow Slew Rate Drive Strength Field: R0/6 Speed Field: medium(100MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_AD_B0_07_LPUART1_RX, /* GPIO_AD_B0_07 PAD functional properties : */ 0x10B0u); /* Slew Rate Field: Slow Slew Rate Drive Strength Field: R0/6 Speed Field: medium(100MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_SD_B1_05_FLEXSPI_A_DQS, /* GPIO_SD_B1_05 PAD functional properties : */ 0x10F1u); /* Slew Rate Field: Fast Slew Rate Drive Strength Field: R0/6 Speed Field: max(200MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_SD_B1_06_FLEXSPI_A_DATA03, /* GPIO_SD_B1_06 PAD functional properties : */ 0x10F1u); /* Slew Rate Field: Fast Slew Rate Drive Strength Field: R0/6 Speed Field: max(200MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_SD_B1_07_FLEXSPI_A_SCLK, /* GPIO_SD_B1_07 PAD functional properties : */ 0x10F1u); /* Slew Rate Field: Fast Slew Rate Drive Strength Field: R0/6 Speed Field: max(200MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_SD_B1_08_FLEXSPI_A_DATA00, /* GPIO_SD_B1_08 PAD functional properties : */ 0x10F1u); /* Slew Rate Field: Fast Slew Rate Drive Strength Field: R0/6 Speed Field: max(200MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. 
Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_SD_B1_09_FLEXSPI_A_DATA02, /* GPIO_SD_B1_09 PAD functional properties : */ 0x10F1u); /* Slew Rate Field: Fast Slew Rate Drive Strength Field: R0/6 Speed Field: max(200MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_SD_B1_10_FLEXSPI_A_DATA01, /* GPIO_SD_B1_10 PAD functional properties : */ 0x10F1u); /* Slew Rate Field: Fast Slew Rate Drive Strength Field: R0/6 Speed Field: max(200MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ IOMUXC_SetPinConfig( IOMUXC_GPIO_SD_B1_11_FLEXSPI_A_SS0_B, /* GPIO_SD_B1_11 PAD functional properties : */ 0x10F1u); /* Slew Rate Field: Fast Slew Rate Drive Strength Field: R0/6 Speed Field: max(200MHz) Open Drain Enable Field: Open Drain Disabled Pull / Keep Enable Field: Pull/Keeper Enabled Pull / Keep Select Field: Keeper Pull Up / Down Config. Field: 100K Ohm Pull Down Hyst. Enable Field: Hysteresis Disabled */ } c) Modify linker file According to Fig 3, a flash driver should be downloaded into FlexRAM on the target MCU during the debuggingprocess, for the iMXRT1020_QSPI project, the flash driver needs to be downloaded to DTCM (0x2000_0000~0x2001_0000), however, to meet the customer's demand, the whole of FlexRAM is reconfigured to SRAM_OC in the ResetISR() function. In another word, there's no DTCM area to load the flash driver and it causes the above debug failure. So we need to use the SRAM_OC instead of DTCM to load the flash driver just like the below shows. In the FlashDriver_32Kbuffer.ld of iMXRT1020_QSPI project: /* * Linker script for NXP LPC546xx SPIFI Flash Driver (Messaged) */ MEMORY { /*SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = (64 * 1024)*/ SRAM (rwx) : ORIGIN = 0x20200000, LENGTH = (64 * 1024) } /* stack size : multiple of 8*/ __stack_size = (4 * 1024); /* flash image buffer size : multiple of page size*/ __cache_size = (32 * 1024); /* Supported operations bit map * 0x40 = New device info available after Init() call * This setting must match the actual target flash driver build! */ __opmap_val = 0x1000; /* Actual placement of flash driver code/data controlled via standard file */ INCLUDE "../../LPCXFlashDriverLib/linker/placement.ld" d) Recompile In the LPCXFlashDriverLib project, select the Release_SectorHashing option prior to clicking the Build icon to generate libLPCXFlashDriverLib.a file (as Fig 10 shows). Fig 10 Next, in the iMXRT1020_QSPI project, select the MIMXRT1020-EVK_IS25LP064 option (as Fig 11 shows), then click the Build icon to generate a new flash driver file that resides in ~\Examples\Flashdrivers\NXP\iMXRT\iMXRT1020_QSPI\iMXRT1020_QSPI\builds directory. Fig 11 Note: I've attached a test project which is based on the hello_world demo that comes from the RT1024's SDK library, in addition, the attachment also contains the new flash driver and corresponding debug script files, so please give it a try.  
View full article
i.MX RT1170 crossover MCUs are a new generation of products in NXP's RT family. They run at 1 GHz and offer rich on-chip peripherals. Within the RT1170 sub-family, the RT1173/RT1175/RT1176 are dual-core devices: a Cortex-M7 core running at 1 GHz and a Cortex-M4 core running at 400 MHz. The two cores can be debugged through one SWD port. On the MIMXRT1170-EVK, the Freelink debug interface uses CMSIS-DAP as the debug probe by default. When debugging a dual-core project, for example the evkmimxrt1170_hello_world_cm7 and evkmimxrt1170_hello_world_cm4 projects, you just click the debug button in the CM7 project; once the CM7 project enters the debug state, the CM4 project starts debugging automatically. But developers who want to use J-Link as the debug probe will find that the CM4 project does not start automatically, and starting the CM4 debug session manually fails. Can J-Link debug both cores simultaneously? Yes, it can, but some additional settings need to be done.
IDE and SDK: MCUXpresso IDE 11.3, MIMXRT1170-EVK SDK 2.9.1, a J-Link probe version 9 or above (or the Freelink probe reflashed with the J-Link firmware), Segger J-Link software JLink_Windows_V698a.
Import the SDK example; here we select multicore_examples/evkmimxrt1170_hello_world_cm7. MCUXpresso IDE imports both the CM4 and CM7 projects automatically. Compile both projects.
Debug the CM7 project first, then switch to the CM4 project and also click the debug button. The CM4 project will not debug properly, so exit the debug session. With this step, the IDE has created two debug configurations in RUN->Debug Configurations.
Open the evkmimxrt1170_hello_world_cm4 JLink Debug configuration, click the JLink Debugger tab, and add evkmimxrt1170_connect_cm4_cm4side.jlinkscript. Then deselect the "Attach to a running target" checkbox.
Set a breakpoint at the start of the main() function of the CM4 project. This is because the IDE sometimes cannot suspend at the start of main() when debugging starts, so a second breakpoint can be helpful. Take care to set the breakpoint on BOARD_ConfigMPU() or the code below it; do not set the breakpoint on "gpio_pin_config_t led_config…", otherwise debugging will fail.
Now we can start to debug the CM7 project. Before launching the CM4 session, check RUN-> evkmimxrt1170_hello_world_cm4 JLink Debug again: the IDE re-enables "Attach to a running target" automatically, so we must disable it again. When the CM7 debug session is ready, switch to the CM4 project and click the debug button, then resume the CM7 project. The CM4 project will start debugging and suspend at the breakpoint.
Notes: If you follow this guide but still can't debug both cores, try erasing the whole chip and starting again. If the CM7 project fails in MCMGR_Init(), check the boot configuration pins; they should be set to Internal Boot mode.
View full article
Realize a panoramic video layer with OpenGL
View full article
RT10xx SAI basic and SDCard wave file play 1. Introduction NXP RT10xx's audio modules are SAI, SPDIF, and MQS. The SAI module is a synchronous serial interface for audio data transmission. SPDIF is a stereo transceiver that can receive and send digital audio, MQS is used to convert I2S audio data from SAI3 to PWM, and then can drive external speakers, but in practical usage, it still need to add the amplifier drive circuit. When we use the SAI module, it will be related to the audio file play and the data obtained. This article will be based on the MIMXRT1060-EVK board, give the RT10xx SAI module basic knowledge, PCM waveform format, the audio file cut, and conversion tool, use the MCUXpresso IDE CFG peripheral tool to create the SAI project, play the audio data, it will also provide the SDcard with fatfs system to read the wave file and play it. 2. Basic Knowledge and the tools Before entering the project details and testing, just provide some SAI module knowledge, wave file format information, audio convert tools. 2.1 SAI module basic RT10xx SAI module can support I2S, AC97, TDM, and codec/DSP interface. SAI module contains Transmitter and Receiver, the related signals:     SAI_MCLK: master clock, used to generate the bit clock, master output, slave input.     SAI_TX_BCLK: Transmit bit clock, master output, slave input     SAI_TX_SYNC: Transmit Frame sync, master output, slave input, L/R channel select     SAI_TX_DATA[4]:Transmit data line, 1-3 share with RX_DATA[1-3]     SAI_RX_BCLK: receiver bit clock     SAI_RX_SYNC: receiver frame sync     SAI_RX_DATA[4]: receiver data line SAI module clocks: audio master clock, bus clock, bit clock SAI module Frame sync has 3 modes:      1)Transmit and receive using its own BCLK and SYNC      2)Transmit async, receive sync: use transmit BCLK and SYNC, transmit enable at first, disable at last.      3)Transmit sync, receive async: use receive BCLK and SYNC, receiver enable at first, disable at last. Valid frame sync is also ignored (slave mode) or not generated (master mode) for the first four-bit clock cycles after enabling the transmitter or receiver. Pic 1 SAI module clock structure: Pic 2 SAI module 3 clock sources:  PLL3_PFD3, PLL5, PLL4 In the above picture, SAI1_CLK_ROOT, which can be used as the MCLK, the BCLK is: BCLK= master clock/(TCR2[DIV]+1)*2 Sample rate = Bitclockfreq /(bitwidth*channel) 2.2 waveform audio file format WAVE file is used to save the PCM encode data, WAVE is using the RIFF format, the smallest unit in the RIFF file is the CK struct, CKID is the data type, the value can be: “RIFF”,“LIST”,“fmt”, “data” etc. RIFF file is little-endian. RIFF structure: typedef unsigned long DWORD;//4B typedef unsigned char BYTE;//1B typedef DWORD FOURCC; // 4B typedef struct { FOURCC ckID; //4B DWORD ckSize; //4B union { FOURCC fccType; // RIFF form type 4B BYTE ckData[ckSize]; //ckSize*1B } ckData; } RIFFCK; Pic 3 Take a 16Khz 2 channel wave file as the example: Pic 4 Yellow: CKID  Green: data length   Purple: data The detailed analysis as follows: Pic 5 We can find, the real audio data, except the wave header, the data size is 1279860bytes. 2.3 Audio file convert In practical usage, the audio file may not the required channel and the sample rate configuration, or the format is not the wave, or the time is too long, then we can use some tool to convert it to your desired format. 
We can use the ffmpeg tool: https://ffmpeg.org/ About the details, check the ffmpeg document, normally we use these command: mp3 file converts to 16k, 16bit, 2 channel wave file: ffmpeg -i test.mp3 -acodec pcm_s16le -ar 16000 -ac 2 test.wav or: ffmpeg -i test.mp3 -aq 16 -ar 16000 -ac 2 test.wav test.wav, cut 35s from 00:00:00, and can convert save to test1.wav: ffmpeg -ss 00:00:00 -i test.wav -t 35.0 -c copy test1.wav Pic 6 Pic 7 2.4 Obtain wave L/R channel audio data Just like the SDK code, save the L/R audio data directly in the RT RAM array, so here, we need to obtain the audio data from the wav file. We can use the python readout the wav header, then get the audio data size, and save the audio data to one array in the .h files. The related Python code can be: import sys import wave def wav2hex(strWav, strHex): with wave.open(strWav, "rb") as fWav: wavChannels = fWav.getnchannels() wavSampleWidth = fWav.getsampwidth() wavFrameRate = fWav.getframerate() wavFrameNum = fWav.getnframes() wavFrames = fWav.readframes(wavFrameNum) wavDuration = wavFrameNum / wavFrameRate wafFramebytes = wavFrameNum * wavChannels * wavSampleWidth print("Channels: {}".format(wavChannels)) print("Sample width: {}bits".format(wavSampleWidth * 8)) print("Sample rate: {}kHz".format(wavFrameRate/1000)) print("Frames number: {}".format(wavFrameNum)) print("Duration: {}s".format(wavDuration)) print("Frames bytes: {}".format(wafFramebytes)) fWav.close() pass with open(strHex, "w") as fHex: # Print WAV parameters fHex.write("/*\n"); fHex.write(" Channels: {}\n".format(wavChannels)) fHex.write(" Sample width: {}bits\n".format(wavSampleWidth * 8)) fHex.write(" Sample rate: {}kHz\n".format(wavFrameRate/1000)) fHex.write(" Frames number: {}\n".format(wavFrameNum)) fHex.write(" Duration: {}s\n".format(wavDuration)) fHex.write(" Frames bytes: {}\n".format(wafFramebytes)) fHex.write("*/\n\n") # Print WAV frames fHex.write("uint8_t music[] = {\n") print("Transferring...") i = 0 while wafFramebytes > 0: if(wafFramebytes < 16): BytesToPrint = wafFramebytes else: BytesToPrint = 16 fHex.write(" ") for j in range(0, BytesToPrint): if j != 0: fHex.write(' ') fHex.write("0x{:0>2x},".format(wavFrames[i])) i+=1 j+=1 fHex.write("\n") wafFramebytes -= BytesToPrint fHex.write("};\n") fHex.close() print("Done!") wav2hex(sys.argv[1], sys.argv[2]) Take the music1.wave as an example: Pic 8 2.4 Audio data relationship with audio wave 16bit data range is: -32768 to 32767, the goldwave related value range is(-1~1).Use goldwave tool to open the example music1.wav, check the data in 1s position, the left channel relative data is -0.08227, right channel relative data is -0.2257. Pic 9                                                                          pic 10 Now, calculate the L/R real data, and find the position in the music1.h. Pic 11 From pic 8, we can know, the real wave R/L data from line 11, each line contains 16 bytes of data. So, from music1.wav related value, we can calculate the related data, and compare it with the real data in the array, we can find, it is totally the same. 3. SAI MCUXpresso project creation Based on SDK_2.9.2_EVK-MIMXRT1060, create one SAI DMA audio play project. The audio data can use the above music1.h. 
Create one bare-metal project: Drivers check: clock, common, dmamux, edma,gpio,i2c,iomuxc,lpuart,sai,sai_edma,xip_device Utilities check:       Debug_console,lpuart_adapter,serial_manager,serial_manager_uart Board components check:       Xip_board Abstraction Layer check:       Codec, codec_wm8960_adapter,lpi2c_adapter Software Components check:       Codec_i2c,lists,wm8960 After the creation of the project, open the clocks, configure the clock, core, flexSPI can use the default one, we mainly configure the SAI1 related clocks: Pic 12 Select the SAI1 clock source as PLL4, PLL4_MAIN_CLK configure as 786.48MHz. SAI1 clock configure as 6.144375MHz. After the configuration, update the code. Open Pins tool, configure the SAI1 related pins, as the codec also need the I2C, so it contains the I2C pin configuration. Pic 13 Update the code. Open peripherals, configure DMA, SAI, NVIC. Pic 14 Pic 15 DMA配置如下: pic16 After configuration, generate the code. In the above configuration, we have finished the SAI DMA transfer configuration, SAI master mode, 16bits, the sample rate is 16kHz, 2channel, DMA transfer, bit clock is 512Khz, the master clock is 6.1443Mhz. void callback(I2S_Type *base, sai_edma_handle_t *handle, status_t status, void *userData) { if (kStatus_SAI_RxError == status) { } else { finishIndex++; emptyBlock++; /* Judge whether the music array is completely transfered. */ if (MUSIC_LEN / BUFFER_SIZE == finishIndex) { isFinished = true; finishIndex = 0; emptyBlock = BUFFER_NUM; tx_index = 0; cpy_index = 0; } } } int main(void) { sai_transfer_t xfer; /* Init board hardware. */ BOARD_ConfigMPU(); BOARD_InitBootPins(); BOARD_InitBootClocks(); BOARD_InitBootPeripherals(); #ifndef BOARD_INIT_DEBUG_CONSOLE_PERIPHERAL /* Init FSL debug console. */ BOARD_InitDebugConsole(); #endif PRINTF(" SAI wav module test!\n\r"); /* Use default setting to init codec */ if (CODEC_Init(&codecHandle, &boardCodecConfig) != kStatus_Success) { assert(false); } /* delay for codec output stable */ DelayMS(DEMO_CODEC_INIT_DELAY_MS); CODEC_SetVolume(&codecHandle,2U,50); // set 50% volume EnableIRQ(DEMO_SAI_IRQ); SAI_TxEnableInterrupts(DEMO_SAI, kSAI_FIFOErrorInterruptEnable); PRINTF(" MUSIC PLAY Start!\n\r"); while (1) { PRINTF(" MUSIC PLAY Again\n\r"); isFinished = false; while (!isFinished) { if ((emptyBlock > 0U) && (cpy_index < MUSIC_LEN / BUFFER_SIZE)) { /* Fill in the buffers. */ memcpy((uint8_t *)&buffer[BUFFER_SIZE * (cpy_index % BUFFER_NUM)], (uint8_t *)&music[cpy_index * BUFFER_SIZE], sizeof(uint8_t) * BUFFER_SIZE); emptyBlock--; cpy_index++; } if (emptyBlock < BUFFER_NUM) { /* xfer structure */ xfer.data = (uint8_t *)&buffer[BUFFER_SIZE * (tx_index % BUFFER_NUM)]; xfer.dataSize = BUFFER_SIZE; /* Wait for available queue. */ if (kStatus_Success == SAI_TransferSendEDMA(DEMO_SAI, &SAI1_SAI_Tx_eDMA_Handle, &xfer)) { tx_index++; } } } } }   4. SAI test result     To check the real L/R data sendout situation, we modify the music array first 16 bytes data as: 0x55,0xaa,0x01,0x00,0x02,0x00,0x03,0x00,0x04,0x00,0x05,0x00,0x06,0x00,0x07,0x00 Then test SAI_MCLK,SAI_TX_BCLK,SAI_TX_SYNC,SAI_TXD pin wave, and compare with the defined data, because the polarity is configured as active low, it is falling edge output, sample at rising edge. The test point on the MIMXRT1060-EVK board is using the codec pin position: Pic 17 4.1 Logic Analyzer tool wave Pic 18 MCLK clock frequency is 6.144375Mhz, BCLK is 512KHz, SYNC is 16KHz. 
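As a quick cross-check, these measured frequencies line up with the clock formulas given in section 2.1 for a 16 kHz / 16-bit / 2-channel stream:
BCLK = sample rate x bit width x channels = 16 kHz x 16 x 2 = 512 kHz
BCLK = MCLK / ((TCR2[DIV] + 1) x 2) = 6.144 MHz / 12 = 512 kHz, which corresponds to TCR2[DIV] = 5 in this configuration
SYNC toggles at the sample rate (16 kHz), with each half-frame carrying one 16-bit channel.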
Pic 19 The first frame data is:1010101001010101 0000000000000001 0XAA55  0X0001 It is the same as the array defined L/R data. SYNC low is Left 16 bit, High is right 16 bit. 4.2 Oscilloscope test wave Just like the logic analyzer, the oscilloscope wave is the same: Pic 20 Add the music.h to the project, and let the main code play the music array data in loop, we will hear the music clear when insert the headphone to on board J12 or add a speaker. 5. SAI SDcard wave music play This part will add the sd card, fatfs system, to read out the 16bit 16K 2ch wave file in the sd card, and play it in loop. 5.1 driver add     Code is based on SDK_2.9.2_EVK-MIMXRT1060, just on the previous project, add the sdcard, sd fatfs driver, now the bare-metal driver situation is: Drivers check: cache, clock, common, dmamux, edma,gpio,i2c,iomuxc,lpuart,sai,sai_edma,sdhc, xip_device Utilities check:       Debug_console,lpuart_adapter,serial_manager,serial_manager_uart Middleware check:       File System->FAT File System->fatfs+sd, Memories Board components check:       Xip_board Abstraction Layer check:       Codec, codec_wm8960_adapter,lpi2c_adapter Software Components check:       Codec_i2c,lists,wm8960 5.2 WAVE header analyzer with code    From previous content, we can know the wav header structure, we need to play the wave file from the sd card, then we need to analyze the wave header to get the audio format, audio data-related information. The header analysis code is: uint8_t Fun_Wave_Header_Analyzer(void) { char * datap; uint8_t ErrFlag = 0; datap = strstr((char*)Wav_HDBuffer,"RIFF"); if(datap != NULL) { wav_header.chunk_size = ((uint32_t)*(Wav_HDBuffer+4)) + (((uint32_t)*(Wav_HDBuffer + 5)) << + (((uint32_t)*(Wav_HDBuffer + 6)) << 16) +(((uint32_t)*(Wav_HDBuffer + 7)) << 24); movecnt += 8; } else { ErrFlag = 1; return ErrFlag; } datap = strstr((char*)(Wav_HDBuffer+movecnt),"WAVEfmt"); if(datap != NULL) { movecnt += 8; wav_header.fmtchunk_size = ((uint32_t)*(Wav_HDBuffer+movecnt+0)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 1)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 2)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 3)) << 24); wav_header.audio_format = ((uint16_t)*(Wav_HDBuffer+movecnt+4) + (uint16_t)*(Wav_HDBuffer+movecnt+5)); wav_header.num_channels = ((uint16_t)*(Wav_HDBuffer+movecnt+6) + (uint16_t)*(Wav_HDBuffer+movecnt+7)); wav_header.sample_rate = ((uint32_t)*(Wav_HDBuffer+movecnt+8)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 9)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 10)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 11)) << 24); wav_header.byte_rate = ((uint32_t)*(Wav_HDBuffer+movecnt+12)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 13)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 14)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 15)) << 24); wav_header.block_align = ((uint16_t)*(Wav_HDBuffer+movecnt+16) + (uint16_t)*(Wav_HDBuffer+movecnt+17)); wav_header.bps = ((uint16_t)*(Wav_HDBuffer+movecnt+18) + (uint16_t)*(Wav_HDBuffer+movecnt+19)); movecnt +=(4+wav_header.fmtchunk_size); } else { ErrFlag = 1; return ErrFlag; } datap = strstr((char*)(Wav_HDBuffer+movecnt),"LIST"); if(datap != NULL) { movecnt += 4; wav_header.list_size = ((uint32_t)*(Wav_HDBuffer+movecnt+0)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 1)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 2)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 3)) << 24); movecnt +=(4+wav_header.list_size); } //LIST not Must datap = strstr((char*)(Wav_HDBuffer+movecnt),"data"); if(datap != NULL) { movecnt += 4; wav_header.datachunk_size = 
((uint32_t)*(Wav_HDBuffer+movecnt+0)) + (((uint32_t)*(Wav_HDBuffer +movecnt+ 1)) << + (((uint32_t)*(Wav_HDBuffer +movecnt+ 2)) << 16) +(((uint32_t)*(Wav_HDBuffer +movecnt+ 3)) << 24); movecnt += 4; ErrFlag = 0; } else { ErrFlag = 1; return ErrFlag; } PRINTF("Wave audio format is %d\r\n",wav_header.audio_format); PRINTF("Wave audio channel number is %d\r\n",wav_header.num_channels); PRINTF("Wave audio sample rate is %d\r\n",wav_header.sample_rate); PRINTF("Wave audio byte rate is %d\r\n",wav_header.byte_rate); PRINTF("Wave audio block align is %d\r\n",wav_header.block_align); PRINTF("Wave audio bit per sample is %d\r\n",wav_header.bps); PRINTF("Wave audio data size is %d\r\n",wav_header.datachunk_size); return ErrFlag; } Mainly divide RIFF to 4 parts: “RIFF”,“fmt”,“LIST”,“data”. The 4 bytes data follows the “data” is the whole audio data size, it can be used to the fatfs to read the audio data. The above code also recodes the data position, then when using the fatfs read the wave, we can jump to the data area directly. 5.3 SD card wave data play     Define the array audioBuff[4* 512], used to read out the sd card wave file, and use these data send to the SAI EDMA and transfer it to the I2S interface until all the data is transmitted to the I2S interface.     Callback record each 512 bytes data send out finished, and judge the transmit data size is reached the whole wave audio data size. 5.4 sd card wave play result    Prepare one wave file, 16bit 16k sample rate, 2 channel file, named as music.wav, put in the sd card which already does the fat32 format, insert it to the MIMXRT1060-EVK J39, run the code, will get the printf information: Please insert a card into the board. Card inserted. Make file system......The time may be long if the card capacity is big. SAI wav module test! MUSIC PLAY Start! Wave audio format is 1 Wave audio channel number is 2 Wave audio sample rate is 16000 Wave audio byte rate is 64000 Wave audio block align is 4 Wave audio bit per sample is 16 Wave audio data size is 2728440 Playback is begin! Playback is finished! At the same time, after inserting the headphone or the speaker into the J12, we can hear the music. Attachment is the mcuxpresso10.3.0 and the wave samples.  
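For reference, below is a minimal sketch of the streaming loop described in section 5.3. It is illustrative only and simplified compared with the callback-driven buffer management in the attached project: it assumes the wave file has already been opened with FatFs and the header already parsed (so the read pointer sits at the start of the audio data), and it reuses the DEMO_SAI / SAI1_SAI_Tx_eDMA_Handle definitions from section 3. The helper name and block size are made up for the example.
#include "ff.h"            /* FatFs */
#include "fsl_sai_edma.h"  /* SAI EDMA transfer API */

#define PLAY_BLOCK_SIZE 512U
#define PLAY_BLOCK_NUM  4U

static uint8_t audioBuff[PLAY_BLOCK_NUM * PLAY_BLOCK_SIZE];

/* Stream 'dataSize' bytes of audio data from an opened wave file to the SAI via EDMA. */
static void Play_Wave_File(FIL *file, uint32_t dataSize)
{
    sai_transfer_t xfer;
    uint32_t sentBytes  = 0U;
    uint32_t blockIndex = 0U;
    UINT bytesRead      = 0U;

    while (sentBytes < dataSize)
    {
        uint8_t *block = &audioBuff[(blockIndex % PLAY_BLOCK_NUM) * PLAY_BLOCK_SIZE];

        /* Refill one block from the SD card. */
        if ((f_read(file, block, PLAY_BLOCK_SIZE, &bytesRead) != FR_OK) || (bytesRead == 0U))
        {
            break;
        }

        xfer.data     = block;
        xfer.dataSize = bytesRead;

        /* Queue the block for transmission; retry while the SAI EDMA queue is full. */
        while (SAI_TransferSendEDMA(DEMO_SAI, &SAI1_SAI_Tx_eDMA_Handle, &xfer) == kStatus_SAI_QueueFull)
        {
        }

        sentBytes += bytesRead;
        blockIndex++;
    }
}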
View full article
There is an issue with the DCD file used in the SDK 2.9.0 release for the i.MX RT1170 processor. When the included DCD file is used in a project to configure the SDRAM memory on the EVK, the refresh for the memory is not enabled. This can lead to corruption/data loss over time.   To fix the problem, replace the dcd.c file in your project with the attached file instead.   We are working on a fix, and a new revision of the SDK will be released soon.
View full article
Symptoms
Some of us may have experienced this issue: when the heap is placed in DTCM, everything is OK (that is the default setting for MCUXpresso SDK demos), but when the heap is placed in cached memory such as OCRAM or SDRAM, much of the middleware does not function correctly. The issue shows up in the USB stack, lwIP and the SD card: USB enumeration fails, Ethernet drops packets, the application no longer writes to the SD card, or the system hangs indefinitely on uninitialized semaphores…
Diagnosis
To understand this issue, we need to understand the i.MX RT L1 cache. AN12042 describes the cache system of the i.MX RT.
The i.MX RT series implements the CPU core platform described in Figure 1. The L1 I/D-cache is embedded in the core platform. The data cache is 4-way set-associative and the instruction cache is 2-way set-associative, with a cache line size of 32 bytes. It connects to the SIM_M7 bus fabric master port through an AXI bus. The internal/external memory subsystems such as OCRAM (FlexRAM banks configured as OCRAM), FlexSPI (serial NOR/NAND flash, HyperFlash/HyperRAM, etc.) and SEMC (SDRAM, parallel NOR flash, NAND flash, etc.) are connected to the bus fabric slave ports. The CPU core accesses these subsystems through the bus fabric via the L1 cache, while ITCM/DTCM is accessed directly by the CPU core, bypassing the L1 cache.
OCRAM and SDRAM are cacheable by default. The cache brings a great performance boost, but the user must pay attention to cache maintenance to keep data coherent. The easiest way to avoid data coherency issues is to use non-cacheable buffers. DTCM/ITCM are Tightly-Coupled Memories that the core accesses directly (the cache is not involved), which explains why all SDK demos work correctly by default.
Solution
Put critical code and data into TCM; it is non-cacheable and is the fastest memory for the CPU to access. But forcing all global data into the 128 KB DTCM is constraining in many cases. Users can instead split a non-cacheable memory region out of OCRAM or SDRAM and place the buffers in that region through the toolchain's linker.
Next, I will take the evkmimxrt1060_host_msd_command_freertos demo as an example to illustrate how to make the USB host stack run from OCRAM. MCUXpresso IDE 11.2.1 is used for this demo.
1 Buffer definition
In the USB stack, some important data structures are defined with the macros USB_GLOBAL, USB_DMA_DATA_NONINIT_SUB, USB_DMA_DATA_INIT_SUB and USB_CONTROLLER_DATA. These structures are defined in the USB stack by default; we can see them in usb_device_ehci.c and usb_host_ehci.c (taking the USB host as an example).
In usb_device_ehci.c:
/* Apply for QH buffer, 2048-byte alignment */
USB_RAM_ADDRESS_ALIGNMENT(2048)
USB_CONTROLLER_DATA static uint8_t qh_buffer[(USB_DEVICE_CONFIG_EHCI - 1) * 2048 +
                                             2 * USB_DEVICE_CONFIG_ENDPOINTS * 2 * sizeof(usb_device_ehci_qh_struct_t)];
/* Apply for DTD buffer, 32-byte alignment */
USB_RAM_ADDRESS_ALIGNMENT(32)
USB_CONTROLLER_DATA static usb_device_ehci_dtd_struct_t s_UsbDeviceEhciDtd[USB_DEVICE_CONFIG_EHCI]
                                                                          [USB_DEVICE_CONFIG_EHCI_MAX_DTD];
In usb_host_ehci.c:
/* EHCI controller driver instances.
*/ #if (USB_HOST_CONFIG_EHCI == 1U) USB_RAM_ADDRESS_ALIGNMENT(4096) USB_CONTROLLER_DATA static uint8_t s_UsbHostEhciFrameList1[USB_HOST_CONFIG_EHCI_FRAME_LIST_SIZE * 4]; static uint8_t usbHostEhciFramListStatus[1] = {0};   USB_RAM_ADDRESS_ALIGNMENT(64) USB_CONTROLLER_DATA static usb_host_ehci_data_t s_UsbHostEhciData1; #elif (USB_HOST_CONFIG_EHCI == 2U) USB_RAM_ADDRESS_ALIGNMENT(4096) USB_CONTROLLER_DATA static uint8_t s_UsbHostEhciFrameList1[USB_HOST_CONFIG_EHCI_FRAME_LIST_SIZE * 4]; USB_RAM_ADDRESS_ALIGNMENT(4096) USB_CONTROLLER_DATA static uint8_t s_UsbHostEhciFrameList2[USB_HOST_CONFIG_EHCI_FRAME_LIST_SIZE * 4]; static uint8_t usbHostEhciFramListStatus[2] = {0, 0}; USB_RAM_ADDRESS_ALIGNMENT(64) USB_CONTROLLER_DATA static usb_host_ehci_data_t s_UsbHostEhciData1; USB_RAM_ADDRESS_ALIGNMENT(64) USB_CONTROLLER_DATA static usb_host_ehci_data_t s_UsbHostEhciData2; #else #error "Please increase the instance count." #endif     2    Linker file : partition a RAM block from OCRAM for non-cacheable buffers         Using managed linker script to configure memory RAM2 as a non-cacheable area.   3    MPU configuratins   ( board.c )  MPU divides the memory map into a few regions, and defines the memory attributes of each region. In this step, we need to configure the SRAM_OC_NCACHE_128(RAM2) as non-cacheable      /* Region 13 setting: Memory with  non-cacheable */     MPU->RBAR = ARM_MPU_RBAR(13, 0x202a0000);     MPU->RASR = ARM_MPU_RASR(0, ARM_MPU_AP_FULL, 1, 0, 0, 0, 0, ARM_MPU_REGION_SIZE_128KB);   Now, SRAM_OC_NCACHE_128 (RAM2) is a non-cacheable section. Variables in  *(NonCacheable.init) and  *( NonCacheable) will be put to SRAM_OC_NCACHE_128.   4   Put USB variables into SRAM_OC_NCACHE_128(RAM2)  This is done by the following macros.  #define USB_LINK_NONCACHE_NONINIT_DATA  _Pragma("location = \"NonCacheable\"")  Relative source code is in file usb_misc.h   #if (defined(DATA_SECTION_IS_CACHEABLE) && (DATA_SECTION_IS_CACHEABLE)) #define USB_GLOBAL USB_LINK_NONCACHE_NONINIT_DATA #define USB_BDT USB_LINK_NONCACHE_NONINIT_DATA #define USB_DMA_DATA_NONINIT_SUB USB_LINK_NONCACHE_NONINIT_DATA #define USB_DMA_DATA_INIT_SUB USB_LINK_DMA_INIT_DATA(NonCacheable.init) #define USB_CONTROLLER_DATA USB_LINK_NONCACHE_NONINIT_DATA #else #define USB_GLOBAL USB_LINK_USB_GLOBAL_BSS #define USB_BDT USB_LINK_USB_BDT_BSS #define USB_DMA_DATA_NONINIT_SUB #define USB_DMA_DATA_INIT_SUB #define USB_CONTROLLER_DATA #endif   Please put macro “DATA_SECTION_IS_CACHEABLE=1” in the preprocessor define.     5    build and run project  evkmimxrt1060_host_msd_command_freertos, success!  Reference: Using the i.MXRT L1 Cache https://www.nxp.com.cn/docs/en/application-note/AN12042.pdf             
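As a closing note, application-level DMA buffers can be placed into the same non-cacheable region without editing the linker script for every variable. Below is a minimal sketch using the SDK's AT_NONCACHEABLE_SECTION_ALIGN macro from fsl_common.h; it assumes the NonCacheable region and MPU setup from steps 2 and 3 are in place, and the buffer name, size and alignment are only example values.
#include "fsl_common.h"

/* Place an example DMA buffer in the NonCacheable section created in step 2.
 * Align it to whatever the peripheral or DMA engine actually requires. */
AT_NONCACHEABLE_SECTION_ALIGN(uint8_t s_dmaBuffer[1024], 64);
With the macro in place, the buffer can be handed to a DMA-capable peripheral without explicit cache clean/invalidate calls, because the MPU marks that region as non-cacheable.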
View full article
[Chinese translation available] See the attachment.   Original article link: https://community.nxp.com/t5/i-MX-RT-Knowledge-Base/Design-an-IoT-edge-node-for-CV-application-base-on-the-i/ta-p/1127423 
View full article
[Chinese translation available] See the attachment.   Original article link: https://community.nxp.com/t5/i-MX-Community-Articles/Effortless-GUI-Development-with-NXP-Microcontrollers/ba-p/1131179  
View full article
[Chinese translation available] See the attachment.   Original article link: https://community.nxp.com/docs/DOC-345190  
View full article
[Chinese translation available] See the attachment.   Original article link: https://community.nxp.com/t5/eIQ-Machine-Learning-Software/eIQ-on-i-MX-RT1064-EVK/ta-p/1123602 
View full article
[Chinese translation available] See the attachment.   Original article link: https://community.nxp.com/t5/i-MX-RT-Knowledge-Base/RT1050-HAB-Encrypted-Image-Generation-and-Analysis/ta-p/1124877  
View full article
In this tutorial, I'd like to show the steps of deploying an image classification model on the i.MX RT1060 to enable you to classify fashion images and categories. In the first part of this tutorial, we will review the Fashion MNIST dataset, including how to download it to your system. From there we'll define a simple CNN network using the TensorFlow platform. Next, we'll train the CNN model on the Fashion MNIST dataset and review the results. Finally, we'll optimize the model so that it becomes smaller and inferences faster, which is valuable for resource-limited devices such as MCUs. Let's go ahead and get started!
Fashion MNIST dataset
The Fashion MNIST dataset was created by the e-commerce company Zalando.
Fig 1 Fashion MNIST dataset
As they note on the official GitHub repo for the Fashion MNIST dataset, there are a few problems with the standard MNIST digit recognition dataset:
It's far too easy for standard machine learning algorithms to obtain 97%+ accuracy.
It's even easier for deep learning models to achieve 99%+ accuracy.
The dataset is overused.
MNIST cannot represent modern computer vision tasks.
Zalando therefore created the Fashion MNIST dataset as a drop-in replacement for MNIST:
60,000 training examples
10,000 testing examples
10 classes: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, Ankle boot
28×28 grayscale images
The code below loads the Fashion MNIST dataset using TensorFlow, prints the shapes of the train and test sets, and creates a plot of the first 25 images in the training dataset.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# For easy reset of notebook state.
tf.keras.backend.clear_session()

# Load the dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
print(train_images.shape, train_labels.shape)
print(test_images.shape, test_labels.shape)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

plt.figure(figsize=(8,8))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.tight_layout()
    plt.imshow(train_images[i])
    plt.xlabel(class_names[train_labels[i]])
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
plt.show()
Fig 2
Running the code loads the Fashion MNIST train and test datasets and prints their shapes.
Fig 3
We can see that there are 60,000 examples in the training dataset and 10,000 in the test dataset, and that the images are indeed square with 28×28 pixels.
Creating the model
We need to define a neural network model for the image classification task; the model should have two main parts: the feature extractor and the classifier that makes a prediction.
Defining a simple Convolutional Neural Network (CNN)
For the convolutional front-end, we build 3 convolution layers with a small filter size (3,3) and a modest number of filters, followed by max-pooling layers. The last feature map is flattened to provide features to the classifier. Since this is a multi-class classification task, we require an output layer with 10 nodes in order to predict the probability distribution of an image belonging to each of the 10 classes, which calls for a softmax activation function. Between the feature extractor and the output layer, we add a dense layer to interpret the features. All layers use the ReLU activation function and the He weight initialization scheme, both best practices.
We will use the Adam optimizer to optimize the sparse_categorical_crossentropy loss function, suitable for multi-class classification, and we will monitor the classification accuracy metric, which is appropriate given we have the same number of examples in each of the 10 classes. The below code will define and run it will show the struct of the model. # Define a Model model = tf.keras.models.Sequential() # First Convolution ,Kernel:16*3*3 model.add( tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_uniform',input_shape=(28, 28, 1))) model.add( tf.keras.layers.MaxPooling2D((2, 2))) # Second Convolution ,Kernel:32*3*3 model.add( tf.keras.layers.Conv2D(32, (3, 3), activation='relu',kernel_initializer='he_uniform')) model.add( tf.keras.layers.MaxPooling2D((2, 2))) # Third Convolution ,Kernel:32*3*3 model.add( tf.keras.layers.Conv2D(32, (3, 3), activation='relu',kernel_initializer='he_uniform')) model.add( tf.keras.layers.Flatten()) model.add( tf.keras.layers.Dense(32, activation='relu',kernel_initializer='he_uniform')) model.add( tf.keras.layers.Dense(10, activation='softmax')) Fig 4 Training Model After the model is defined, we need to train it. The model will be trained using 5-fold cross-validation. The value of k=5 was chosen to provide a baseline for both repeated evaluation and to not be too large as to require a long running time. Each validation set will be 20% of the training dataset or about 12,000 examples. The training dataset is shuffled prior to being split and the sample shuffling is performed each time so that any model we train will have the same train and validation datasets in each fold, providing an apples-to-apples comparison. We will train the baseline model for a modest 20 training epochs with a default batch size of 32 examples. The validation set for each fold will be used to validate the model during each epoch of the training run, so we can later create learning curves, and at the end of the run, we use the test dataset to estimate the performance of the model. As such, we will keep track of the resulting history from each run, as well as the classification accuracy of the fold. The train_model() function below implements these behaviors, taking the training dataset and test dataset as arguments, and returning a list of accuracy scores and training histories that can be later summarized. from sklearn.model_selection import KFold # train a model using k-fold cross-validation def train_model(dataX, dataY, n_folds=5): scores, histories = list(), list() # prepare cross validation kfold = KFold(n_folds, shuffle=True, random_state=1) for train_ix, validate_ix in kfold.split(dataX): # select rows for train and test trainX, trainY, validate_X, validate_Y = dataX[train_ix], dataY[train_ix], dataX[validate_ix], dataY[validate_ix] # fit model history = model.fit(trainX, trainY, epochs=20, batch_size=32, validation_data=(validate_X, validate_Y), verbose=0) # evaluate model _, acc = model.evaluate(validate_X, validate_Y, verbose=0) print("Accurary: {:.4f},Total number of figures is {:0>2d}".format(acc * 100.0, len(testY))) # append scores scores.append(acc) histories.append(history) return scores, histories Module Summary After the model has been trained, we can present the results. There are two key aspects to present: the diagnostics of the learning behavior of the model during training and the estimation of the model performance. These can be implemented using separate functions. 
First, the diagnostics involve creating a line plot showing model performance on the train and validate set during each fold of the k-fold cross-validation. These plots are valuable for getting an idea of whether a model is overfitting, underfitting, or has a good fit for the dataset. We will create a single figure with two subplots, one for loss and one for accuracy. Blue lines will indicate model performance on the training dataset and orange lines will indicate performance on the hold-out validate dataset. The summarize_diagnostics() function below creates and shows this plot given the collected training histories. # plot diagnostic learning curves def summarize_diagnostics(histories): for i in range(len(histories)): # plot loss plt.subplot(2,1,1) plt.title('Cross Entropy Loss') plt.plot(histories[i].history['loss'], color='blue', label='train') plt.plot(histories[i].history['val_loss'], color='orange', label='test') # plot accuracy plt.subplot(2,1,2) plt.title('Classification Accuracy') plt.plot(histories[i].history['accuracy'], color='blue', label='train') plt.plot(histories[i].history['val_accuracy'], color='orange', label='test') plt.show() Fig 5 Next, the classification accuracy scores collected during each fold can be summarized by calculating the mean and standard deviation. This provides an estimate of the average expected performance of the model trained on the test dataset, with an estimate of the average variance in the mean. We will also summarize the distribution of scores by creating and showing a box and whisker plot. The summarize_performance() function below implements this for a given list of scores collected during model training. # summarize model performance def summarize_performance(scores): # print summary print('Accuracy: mean={:.4f} std={:.4f}, n={:0>2d}'.format(np.mean(trained_scores)*100, np.std(trained_scores)*100, len(scores))) # box and whisker plots of results plt.boxplot(scores) plt.show()   Fig 6 Verifying predictions According to the above figure, we see that the final trained model can get up to around 87.6% accuracy when predicting the test dataset. And with the trained model, running the below code will demonstrate the result of predictions about some images. def plot_image(i, predictions_array, true_label, img): true_label, img = true_label[i], img[i] plt.grid(False) plt.xticks([]) plt.yticks([]) plt.imshow(img.reshape(28, 28), cmap=plt.cm.binary) predicted_label = np.argmax(predictions_array) if predicted_label == true_label: color = 'blue' else: color = 'red' plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label], 100*np.max(predictions_array), class_names[true_label]), color=color) def plot_value_array(i, predictions_array, true_label): true_label = true_label[i] plt.grid(False) plt.xticks(range(10)) plt.yticks([]) thisplot = plt.bar(range(10), predictions_array, color="#777777") plt.ylim([0, 1]) predicted_label = np.argmax(predictions_array) thisplot[predicted_label].set_color('red') thisplot[true_label].set_color('blue') predictions = model.predict(test_images) # Plot the first X test images, their predicted labels, and the true labels. # Color correct predictions in blue and incorrect predictions in red. 
num_rows = 5
num_cols = 3
num_images = num_rows * num_cols
plt.figure(figsize=(2 * 2 * num_cols, 2 * num_rows))
for i in range(num_images):
    plt.subplot(num_rows, 2 * num_cols, 2 * i + 1)
    plot_image(i, predictions[i], test_labels, test_images)
    plt.subplot(num_rows, 2 * num_cols, 2 * i + 2)
    plot_value_array(i, predictions[i], test_labels)
plt.tight_layout()
plt.show()

Fig 7

Model quantization
Post-training quantization is a conversion technique that can reduce model size while also improving CPU and hardware accelerator latency, with little degradation in model accuracy. This is especially important for embedded platforms, which lack compute-intensive performance and have very limited Flash and RAM. TensorFlow Lite can convert an already-trained float TensorFlow model to the TensorFlow Lite format. In addition, TensorFlow Lite provides several approaches to optimize the model; among them, integer quantization is an optimization strategy that converts 32-bit floating-point numbers (such as weights and activation outputs) to the nearest 8-bit fixed-point numbers. This results in a smaller model and increased inferencing speed, which is very valuable for low-power devices such as microcontrollers. The code below shows how to implement integer quantization of the trained model; after running it, we find that the size of the TensorFlow Lite model is reduced by almost 64.9 KB versus the original model, becoming about 32% of the original size (Fig 8).

import os

# Representative dataset used for integer-only quantization
def representative_data_gen():
    for input_value in tf.data.Dataset.from_tensor_slices(tf.cast(train_images, tf.float32)).shuffle(500).batch(1).take(150):
        yield [input_value]

# Convert using dynamic range quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model_quant = converter.convert()
# Save the model to disk
open("model_dynamic_range_quantization.tflite", "wb").write(tflite_model_quant)

## Size difference
Dynamic_range_quantization_model_size = os.path.getsize("model_dynamic_range_quantization.tflite")
print("Dynamic range quantization model is %d bytes" % Dynamic_range_quantization_model_size)

# Convert using integer-only quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# Ensure that if any ops can't be quantized, the converter throws an error
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set the input and output tensors to uint8 (APIs added in r2.3)
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model_advanced_quant = converter.convert()
# Save the model to disk
open("model_integer_only_quantization.tflite", "wb").write(tflite_model_advanced_quant)

Integer_only_quantization_model_size = os.path.getsize("model_integer_only_quantization.tflite")
print("Integer_only_quantization_model is %d bytes" % Integer_only_quantization_model_size)
difference = Dynamic_range_quantization_model_size - Integer_only_quantization_model_size
print("Difference is %d bytes" % difference)

Fig 8
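As a quick check, we can also load the integer-only model with the TensorFlow Lite Interpreter and confirm that its input and output tensors are indeed uint8. A minimal sketch, using the file name saved above:

# Sketch: verify the input/output tensor types of the integer-only model
interpreter = tf.lite.Interpreter(model_path="model_integer_only_quantization.tflite")
interpreter.allocate_tensors()
print("input dtype:", interpreter.get_input_details()[0]['dtype'])    # expected: uint8
print("output dtype:", interpreter.get_output_details()[0]['dtype'])  # expected: uint8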
Evaluating the TensorFlow Lite model
Now we'll run inferences using the TensorFlow Lite Interpreter to compare the model accuracies.
First, we need a function that runs inference with a given model and images, and then returns the predictions:

# Helper function to run inference on a TFLite model
def run_tflite_model(tflite_file, test_image_indices):
    # Initialize the interpreter
    interpreter = tf.lite.Interpreter(model_path=str(tflite_file))
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]
    predictions = np.zeros((len(test_image_indices),), dtype=int)
    for i, test_image_index in enumerate(test_image_indices):
        test_image = test_images[test_image_index]
        test_label = test_labels[test_image_index]
        # Check if the input type is quantized, then rescale input data to uint8
        if input_details['dtype'] == np.uint8:
            input_scale, input_zero_point = input_details["quantization"]
            test_image = test_image / input_scale + input_zero_point
        test_image = np.expand_dims(test_image, axis=0).astype(input_details["dtype"])
        interpreter.set_tensor(input_details["index"], test_image)
        interpreter.invoke()
        output = interpreter.get_tensor(output_details["index"])[0]
        predictions[i] = output.argmax()
    return predictions

Next, we'll compare the performance of the original model and the quantized model on one image. model_basic_quantization.tflite is the original TensorFlow Lite model with floating-point data. model_integer_only_quantization.tflite is the last model we converted using integer-only quantization (it uses uint8 data for input and output). Let's create another function to print our predictions and run it for testing.

import matplotlib.pylab as plt

# Change this to test a different image
test_image_index = 1

# Helper function to test the models on one image
def test_model(tflite_file, test_image_index, model_type):
    global test_labels
    predictions = run_tflite_model(tflite_file, [test_image_index])
    plt.imshow(test_images[test_image_index].reshape(28, 28))
    template = model_type + " Model \n True:{true}, Predicted:{predict}"
    _ = plt.title(template.format(true=str(test_labels[test_image_index]), predict=str(predictions[0])))
    plt.grid(False)

Fig 9
Fig 10

Then we evaluate the quantized model using all the test images we loaded at the beginning of this tutorial. After summarizing the prediction results on the test dataset, we can see that the prediction accuracy of the quantized model is about 7% lower than that of the original model, which is not bad.

# Helper function to evaluate a TFLite model on all images
def evaluate_model(tflite_file, model_type):
    test_image_indices = range(test_images.shape[0])
    predictions = run_tflite_model(tflite_file, test_image_indices)
    accuracy = (np.sum(test_labels == predictions) * 100) / len(test_images)
    print('%s model accuracy is %.4f%% (Number of test samples=%d)' % (model_type, accuracy, len(test_images)))
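For reference, the two helpers above could be driven as follows. This is only a sketch: model_basic_quantization.tflite is assumed to be the float TensorFlow Lite model mentioned above, saved earlier in the same way as the quantized files.

# Sketch: compare the float and integer-only models on one image and on the full test set
test_model("model_basic_quantization.tflite", test_image_index, model_type="Float")
test_model("model_integer_only_quantization.tflite", test_image_index, model_type="Integer-only")

evaluate_model("model_basic_quantization.tflite", model_type="Float")
evaluate_model("model_integer_only_quantization.tflite", model_type="Integer-only")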
Deploying model
Converting TensorFlow Lite model to C file
The following code runs xxd on the quantized model and writes the output to a file called model_quantized.cc, in which the model is defined as an array of bytes; it then prints the file to the screen. The output is very long, so we won't reproduce it all here, but here's a snippet that includes just the beginning and end.

# Save the file as a C source file
xxd -i model_integer_only_quantization.tflite > model_quantized.cc
# Print the source file
cat model_quantized.cc

Fig 11

Deploying the C file to project
We use the tensorflow_lite_cifar10 demo as a prototype, then replace the original model and make some code modifications; below is the code in the modified main file.

#include "board.h"
#include "fsl_debug_console.h"
#include "pin_mux.h"
#include "timer.h"

#include <iomanip>
#include <iostream>
#include <string>
#include <vector>

#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"
#include "tensorflow/lite/optional_debug_tools.h"
#include "tensorflow/lite/string_util.h"

#include "get_top_n.h"
#include "model.h"

#define LOG(x) std::cout

// ---------------------------- Application -----------------------------
// Lenet Mnist model input data size (bytes).
#define LENET_MNIST_INPUT_SIZE 28*28*sizeof(char)
// Lenet Mnist model number of output classes.
#define LENET_MNIST_OUTPUT_CLASS 10

// Allocate buffer for input data. This buffer contains the input image
// pre-processed and serialized as text to include here.
uint8_t imageData[LENET_MNIST_INPUT_SIZE] = {
#include "clothes_select.inc"
};

/* Thresholds */
#define DETECTION_TRESHOLD 60

/*!
 * @brief Initialize parameters for inference
 *
 * @param reference to flat buffer
 * @param reference to interpreter
 * @param pointer to storing input tensor address
 * @param verbose mode flag. Set true for verbose mode
 */
void InferenceInit(std::unique_ptr<tflite::FlatBufferModel> &model, std::unique_ptr<tflite::Interpreter> &interpreter, TfLiteTensor** input_tensor, bool isVerbose)
{
    model = tflite::FlatBufferModel::BuildFromBuffer(Fashion_MNIST_model, Fashion_MNIST_model_len);
    if (!model)
    {
        LOG(FATAL) << "Failed to load model\r\n";
        return;
    }

    tflite::ops::builtin::BuiltinOpResolver resolver;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    if (!interpreter)
    {
        LOG(FATAL) << "Failed to construct interpreter\r\n";
        return;
    }

    int input = interpreter->inputs()[0];
    const std::vector<int> inputs = interpreter->inputs();
    const std::vector<int> outputs = interpreter->outputs();

    if (interpreter->AllocateTensors() != kTfLiteOk)
    {
        LOG(FATAL) << "Failed to allocate tensors!";
        return;
    }

    /* Get input dimension from the input tensor metadata, assuming one input only */
    *input_tensor = interpreter->tensor(input);
    auto data_type = (*input_tensor)->type;

    if (isVerbose)
    {
        const std::vector<int> inputs = interpreter->inputs();
        const std::vector<int> outputs = interpreter->outputs();
        LOG(INFO) << "input: " << inputs[0] << "\r\n";
        LOG(INFO) << "number of inputs: " << inputs.size() << "\r\n";
        LOG(INFO) << "number of outputs: " << outputs.size() << "\r\n";
        LOG(INFO) << "tensors size: " << interpreter->tensors_size() << "\r\n";
        LOG(INFO) << "nodes size: " << interpreter->nodes_size() << "\r\n";
        LOG(INFO) << "inputs: " << interpreter->inputs().size() << "\r\n";
        LOG(INFO) << "input(0) name: " << interpreter->GetInputName(0) << "\r\n";

        int t_size = interpreter->tensors_size();
        for (int i = 0; i < t_size; i++)
        {
            if (interpreter->tensor(i)->name)
            {
                LOG(INFO) << i << ": " << interpreter->tensor(i)->name << ", "
                          << interpreter->tensor(i)->bytes << ", "
                          << interpreter->tensor(i)->type << ", "
                          << interpreter->tensor(i)->params.scale << ", "
                          << interpreter->tensor(i)->params.zero_point << "\r\n";
            }
        }
        LOG(INFO) << "\r\n";
    }
}

/*!
 * @brief Runs inference on the input buffer and prints the result to the console
 *
 * @param pointer to image data
 * @param image data length
 * @param pointer to labels string array
 * @param reference to flat buffer model
 * @param reference to interpreter
 * @param pointer to input tensor
 */
void RunInference(const uint8_t* image, size_t image_len, const std::string* labels,
                  std::unique_ptr<tflite::FlatBufferModel> &model,
                  std::unique_ptr<tflite::Interpreter> &interpreter,
                  TfLiteTensor* input_tensor)
{
    /* Copy image to tensor. */
    memcpy(input_tensor->data.uint8, image, image_len);

    /* Do inference on static image in first loop. */
    auto start = GetTimeInUS();
    if (interpreter->Invoke() != kTfLiteOk)
    {
        LOG(FATAL) << "Failed to invoke tflite!\r\n";
        return;
    }
    auto end = GetTimeInUS();

    const float threshold = (float)DETECTION_TRESHOLD / 100;
    std::vector<std::pair<float, int>> top_results;

    int output = interpreter->outputs()[0];
    TfLiteTensor *output_tensor = interpreter->tensor(output);
    TfLiteIntArray* output_dims = output_tensor->dims;
    // assume output dims to be something like (1, 1, ..., size)
    auto output_size = output_dims->data[output_dims->size - 1];

    /* Find best image candidates. */
    GetTopN<uint8_t>(interpreter->typed_output_tensor<uint8_t>(0), output_size, 1, threshold, &top_results, false);

    if (!top_results.empty())
    {
        auto result = top_results.front();
        const float confidence = result.first;
        const int index = result.second;
        if (confidence * 100 > DETECTION_TRESHOLD)
        {
            LOG(INFO) << "----------------------------------------\r\n";
            LOG(INFO) << "     Inference time: " << (end - start) / 1000 << " ms\r\n";
            LOG(INFO) << "     Detected: " << std::setw(10) << labels[index] << " (" << (int)(confidence * 100) << "%)\r\n";
            LOG(INFO) << "----------------------------------------\r\n\r\n";
        }
    }
}

/*!
 * @brief Main function
 */
int main(void)
{
    const std::string labels[] = {"T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                                  "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"};

    /* Init board hardware. */
    BOARD_ConfigMPU();
    BOARD_InitPins();
    BOARD_BootClockRUN();
    BOARD_InitDebugConsole();

    InitTimer();

    std::unique_ptr<tflite::FlatBufferModel> model;
    std::unique_ptr<tflite::Interpreter> interpreter;
    TfLiteTensor* input_tensor = 0;
    InferenceInit(model, interpreter, &input_tensor, false);

    LOG(INFO) << "Fashion MNIST object recognition example using a TensorFlow Lite model.\r\n";
    LOG(INFO) << "Detection threshold: " << DETECTION_TRESHOLD << "%\r\n";

    /* Run inference on the static clothes image. */
    LOG(INFO) << "\r\nStatic data processing:\r\n";
    RunInference((uint8_t*)imageData, (size_t)LENET_MNIST_INPUT_SIZE, labels, model, interpreter, input_tensor);

    while (1)
    {
    }
}

Testing result
After deploying the model in the demo project, we run the demo on the MIMXRT1060-EVK board (Fig 12) for testing.

Fig 12

1. Run the code below to convert a Fashion MNIST image to text. The process_image() function converts a Fashion MNIST image into an include file containing the image as static data, which is then included in the demo project (see the usage sketch after this list).
def process_image(image, output_path, num_batch=1):
    img_data = np.transpose(image, (2, 0, 1))
    # Repeat image for batch processing (resulting tensor is NCHW or NHWC)
    img_data = np.reshape(img_data, (num_batch, img_data.shape[0], img_data.shape[1], img_data.shape[2]))
    img_data = np.repeat(img_data, num_batch, axis=0)
    img_data = np.reshape(img_data, (num_batch, img_data.shape[1], img_data.shape[2], img_data.shape[3]))
    # Serialize image batch as comma-separated hex bytes
    img_data_bytes = bytearray(img_data.tobytes(order='C'))
    image_bytes_per_line = 20
    with open(output_path, 'wt') as f:
        idx = 0
        for byte in img_data_bytes:
            f.write('0X%02X, ' % byte)
            if idx % image_bytes_per_line == (image_bytes_per_line - 1):
                f.write('\n')
            idx = idx + 1
    # Return serialized image size
    return len(img_data_bytes)

2. Run the demo project on the board.
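For illustration, process_image() could be invoked on a single test image roughly as follows. This is only a sketch: it assumes the image is a 28x28x1 array, and it rescales the pixel values back to uint8 in case the data was normalized to floats for training; the output file name matches the include used in the demo project above.

# Sketch: serialize one test image as a C include file for the demo project
img = test_images[0]
if img.dtype != np.uint8:
    # assumption: pixel values were normalized to [0, 1] for training
    img = (img * 255).astype(np.uint8)
size = process_image(img, "clothes_select.inc")
print("Wrote %d bytes" % size)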