MCX Microcontrollers Knowledge Base

These two lab guides provide step-by-step instructions on how to take a quantized TensorFlow Lite model and use the Neutron Conversion Tool found in eIQ Toolkit to convert the model to run on the eIQ Neutron NPU found on MCX N devices.

The eIQ Neutron NPU for MCUs Lab Guide - Part 1 - Mobilenet.pdf document focuses on using the eIQ Toolkit GUI method to convert a model and then import that converted model into an eIQ MCUXpresso SDK example. It is recommended to go through this lab first.

The eIQ Neutron NPU for MCUs Lab Guide - Part 2 - Face Detect.pdf document focuses on using the eIQ Toolkit command-line utilities to convert a model for the eIQ Neutron NPU and integrate the newly converted model into the Face Detect demo found on the Application Code Hub.

Both labs were designed to run on the FRDM-MCXN947, but the same concepts can be applied to other MCX N boards. Also be sure to check out the Getting Started community post for more details on the eIQ Neutron NPU.

--- Updated May 2024 for eIQ Toolkit 1.11.4
This guide helps you to:
- Get familiar with the NPU inside the MCX
- Know the eIQ ML examples included in the SDK
- Summarize related application notes and demo projects on GitHub

Prerequisites:
- Windows 10 development PC
- FRDM-MCXN947 board
- NXP's MCUXpresso IDE installed on the development PC
- The FRDM-MCXN947 SDK package generated and downloaded from the web-based MCUXpresso SDK Builder

Introduction to the NPU

The MCXN94x and MCXN54x are based on dual high-performance Arm® Cortex®-M33 cores running at up to 150 MHz, with 2 MB of on-chip flash, optional full ECC RAM, and an integrated proprietary NPU. The integrated NPU delivers up to 40x faster machine learning (ML) throughput compared to a CPU core, enabling the device to spend less time awake and reducing overall power consumption. The architecture provides power- and performance-optimized NPUs integrated with NXP's very wide portfolio of microcontrollers and applications processors. The eIQ Neutron NPUs support a wide variety of neural network types, such as CNN, RNN, TCN, and Transformer networks. ML application development with the eIQ Neutron NPU is fully supported by the eIQ machine learning software development environment.

The NPU used in the MCXN94 is the Neutron N1-16; its block diagram is shown in the figure below. The eIQ Neutron N1-16 NPU inside the MCXN94 has 4 compute pipes, and each compute pipe contains 4 INT8 MAC (multiply-accumulate) blocks, for a total of 16 MAC blocks. This means the MCXN94 can execute 4.8G (150 MHz * 4 * 4 * 2) INT8 operations per second.

The MCUXpresso Software Development Kit (MCUXpresso SDK) provides a comprehensive software package with a pre-integrated TensorFlow Lite for Microcontrollers (TFLM). The Neutron library is integrated into TFLM as well. The following table shows the operators supported by the NPU.
Operator support on the MCXN947/MCXN548 NPU, by operator input type:

| Operator | Float | Uint8 (PTQ) | Int8 (PCQ) |
|---|---|---|---|
| ADD | No | No | Yes |
| AVERAGE_POOL_2D | No | No | Yes |
| CONV_2D | No | No | Yes |
| DEPTHWISE_CONV_2D | No | No | Yes |
| FULLY_CONNECTED | No | No | Yes |
| UNIDIRECTIONAL_SEQUENCE_LSTM | No | No | No |
| LOGISTIC (Sigmoid) | No | No | Yes |
| MAX_POOL_2D | No | No | Yes |
| MUL | No | No | No |
| SOFTMAX | No | No | No |
| SVDF | No | No | No |

Note: PTQ = per-tensor quantized (asymmetric 8-bit quantization); PCQ = per-channel quantized (symmetric 8-bit quantization). For more information, please refer to eIQ TensorFlow Lite User's Guide.pdf in middleware/eiq/doc of the SDK.

eIQ ML examples included in the SDK

Download the SDK by selecting FRDM-MCXN947 in the MCUXpresso SDK Builder, remembering to select the eIQ middleware. Then open MCUXpresso IDE; it is convenient to install the SDK by dragging and dropping the package into the SDK installation window of the IDE.
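The operator-support table above can be mirrored in a small lookup helper, which is handy when auditing a model's layers before attempting conversion. This is an illustrative sketch only; the function and set names are invented here and are not part of any NXP tool.

```python
# Mirrors the MCXN947/MCXN548 NPU operator-support table: only per-channel
# quantized (PCQ) int8 tensors are accelerated, and only for these operators.
NPU_PCQ_INT8_SUPPORTED = {
    "ADD", "AVERAGE_POOL_2D", "CONV_2D", "DEPTHWISE_CONV_2D",
    "FULLY_CONNECTED", "LOGISTIC", "MAX_POOL_2D",
}

def runs_on_npu(op: str, input_type: str) -> bool:
    """Return True if the operator/input-type pair is NPU-accelerated."""
    # Float and per-tensor quantized (PTQ) uint8 inputs fall back to the CPU.
    if input_type != "Int8(PCQ)":
        return False
    return op in NPU_PCQ_INT8_SUPPORTED

print(runs_on_npu("CONV_2D", "Int8(PCQ)"))  # True
print(runs_on_npu("SOFTMAX", "Int8(PCQ)"))  # False
print(runs_on_npu("CONV_2D", "Float"))      # False
```

Operators that are not supported (for example SOFTMAX or MUL) still run, but on the CPU via TFLM rather than on the NPU.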
Import the SDK examples into the workspace. There are 7 ML examples included in the SDK:

| eIQ example | Description | Hardware requirements |
|---|---|---|
| tflm_cifar10 | CIFAR10 example based on TensorFlow Lite Micro; recognizes a static image | FRDM-MCXN947, USB Type-C cable |
| tflm_kws | Keyword-spotting example based on TensorFlow Lite Micro; recognizes a static WAV audio clip | FRDM-MCXN947, USB Type-C cable |
| tflm_label_image | Labels 1000 classes of images based on TensorFlow Lite Micro | FRDM-MCXN947, USB Type-C cable |
| mpp_camera_mobilenet_view_tflm | Labels camera images based on TensorFlow Lite Micro | FRDM-MCXN947, MikroElektronika TFT Proto 5" LCD, OV7670 module, USB Type-C cable |
| mpp_camera_ultraface_view_tflm | Face detection using the camera as the source, based on TensorFlow Lite Micro | FRDM-MCXN947, MikroElektronika TFT Proto 5" LCD, OV7670 module, USB Type-C cable |
| mpp_camera_view | A simple camera preview pipeline | FRDM-MCXN947, MikroElektronika TFT Proto 5" LCD, OV7670 module, USB Type-C cable |
| tflm_modelrunner | TFLite model benchmark example for microcontrollers | FRDM-MCXN947, RJ45 network cable |

For additional information and guidance, refer to the README file located in the doc folder that accompanies each example.

Summary of related application notes

There are two application notes available that provide information on advanced usage of the NPU:
- How to Integrate Customer ML Model to NPU on MCXN94x
- Face Detection demo with NPU accelerated on MCXN947

Please find these notes on the NXP website.

Summary of demo projects on GitHub

There are five demos on nxp-appcodehub on GitHub:
- Multiple face detection on FRDM-MCXN947
- Multiple person detection on FRDM-MCXN947
- Label CIFAR10 images on FRDM-MCXN947
- Fashion-MNIST recognition on FRDM-MCXN947
- NPU vs. TFLM benchmark on MCX
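The 4.8G INT8 operations-per-second figure quoted in the introduction above follows directly from the clock frequency and MAC counts. A quick sanity-check calculation:

```python
# Peak-throughput estimate for the eIQ Neutron N1-16 NPU in the MCXN94x,
# using the figures from the text: 150 MHz clock, 4 compute pipes,
# 4 INT8 MAC blocks per pipe, and 2 operations (one multiply plus one
# accumulate) per MAC per cycle.
CLOCK_HZ = 150_000_000
PIPES = 4
MACS_PER_PIPE = 4
OPS_PER_MAC_PER_CYCLE = 2

peak_ops = CLOCK_HZ * PIPES * MACS_PER_PIPE * OPS_PER_MAC_PER_CYCLE
print(f"{peak_ops / 1e9:.1f} G INT8 ops/s")  # 4.8 G INT8 ops/s
```

This is a theoretical peak; real-model throughput depends on how many of the model's operators actually map onto the NPU.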
Part 1: Introduction

The eIQ Neutron Neural Processing Unit (NPU) is a highly scalable accelerator core architecture that provides machine learning (ML) acceleration. Compared to traditional MCUs like the Kinetis and LPC series, the MCX N series marks the first integration of NXP's eIQ® Neutron NPU for ML acceleration. The eIQ Neutron NPU offers up to 42 times faster ML inference performance compared to a standalone CPU core. Specifically, the MCX N94 can execute 4.8G (150 MHz * 4 * 4 * 2) INT8 operations per second.

The eIQ Portal, developed in exclusive partnership with Au-Zone Technologies, is an intuitive graphical user interface (GUI) that simplifies the development of vision-based ML solutions. Developers can create, optimize, debug, and export ML models, as well as import datasets and models, and rapidly train and deploy neural network models and ML workloads for vision applications.

Hardware environment:
- Development board: FRDM-MCXN947
- Display: 3.5" TFT LCD (part number: PAR-LCD-S035)
- Camera: OV7670

Software environment:
- eIQ Portal: eIQ® ML Software Development Environment | NXP Semiconductors
- MCUXpresso IDE v11.9.0
- Application Code Hub demo: Label CIFAR10 image

Part 2: Basic Model Classification Training and Deployment

The main content is divided into three steps: model training, model conversion, and model deployment.

1. Dataset Preparation

The dataset is prepared for a simple demonstration of binary classification between apples and bananas. The training set and test set are split in an 8:2 ratio. This follows the guidelines in section 3.3.2, Structured Folders Dataset, of eIQ_Toolkit_UG.pdf. The folder structure is as follows:

Note: The dataset needs to be organized according to the above folder structure.

2. Create a Project and Import the Dataset into eIQ

a. Open the eIQ Portal tool, then click "CREATE PROJECT" -> "Import dataset".
b. Import using "Structured folders" as follows:
c.
After clicking "IMPORT", select the project save path and click "Save".

3. Select a Base Model for Training

a. After the dataset is imported, click "SELECT MODEL" and choose a base model. Modify the "Input Size" to 128, 128, 3.
b. Click "Start Training".

Note: Other parameters can be set according to your needs; here the learning rate, batch size, and number of epochs are left at their default values. This demonstration trains the model for one epoch; users can train the model as needed to meet application requirements.

Upon completion of training, it will look like this:

If the accuracy does not meet the required standard, you can modify the training parameters or update the training data, and then click "CONTINUE TRAINING" to continue the training process.

4. Model Evaluation ("VALIDATE")

a. Click "VALIDATE" to enter the model evaluation stage. Set the parameters, including "Softmax Threshold", "Input Data Type", and "Output Data Type". Currently, the MCXN series Neutron NPU only supports the int8 data type. It should look like this:
b. After setting the parameters, click "VALIDATE" and wait for the confusion matrix to be generated. The confusion matrix provides a clear view of how the different categories are classified. In the diagram, the x-axis represents the predicted labels, while the y-axis represents the actual labels. You can see the correspondence between the predicted and actual labels for each image, as shown below:

5. Export the Model to TensorFlow Lite

a. Click "DEPLOY", then set the "Export File Type", "Input Data Type", and "Output Data Type". Turn on "Export Quantized Model", and then click "EXPORT MODEL", as shown below:
b. Set the location to save the model and click "Save".

6. Convert to TensorFlow Lite for Neutron (.tflite)

a. After saving, click "OPEN MODEL" to view the model structure, as shown below:
b. Click "Convert" and select "TensorFlow Lite for Neutron (.tflite)", as shown below:
c.
Select the "Neutron target", click "Convert", and set the save path. It should look like this:

7. Deploy the Model to the Label CIFAR10 Image Project

This example is based on a machine learning algorithm supported by the MCXN947, which can label images captured from a camera and display the type of object at the bottom of the LCD. The example's original model is trained on the CIFAR10 dataset, which covers 10 categories of images: "Airplane", "Automobile", "Bird", "Cat", "Deer", "Dog", "Frog", "Horse", "Ship", "Truck".

a. Open MCUXpresso IDE and import the Label CIFAR10 Image project from the Application Code Hub, as follows:
b. Select the project and click "GitHub link" -> "Next", as shown below:
c. Set the save path, then click "Next" -> "Next" -> "Finish", as shown below:
d. After a successful import, open the "source" folder -> "model" folder, open "model_data.s", and copy the model file converted through eIQ into the "model" folder. Modify the name of the imported model (the name of the converted model) in "model_data.s", as shown below:

Note: The model imported into the project is the one obtained after multiple training sessions.

e. In the "source" folder -> "model" folder, open the "labels.h" file. Modify the "labels[]" array to match the order of the labels displayed in the dataset in eIQ, as shown below:
f. Compile the project and download it to the development board.

Part 3: Experimental Results

Part 4: Summary

By efficiently utilizing the powerful performance of the eIQ Neutron NPU and the convenient tools of the eIQ Portal, developers can significantly streamline the entire process from model training to deployment. This not only accelerates the development cycle of machine learning applications but also enhances their performance and reliability. For developers looking to implement efficient machine learning applications on MCX N series edge devices, mastering these technologies and tools is crucial.
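Step 1 of Part 2 above splits the apple/banana dataset 8:2 into training and test sets before importing it as structured folders. A minimal sketch of such a split is shown below; the function name, class folder names, and paths are illustrative and not part of eIQ, which expects one subfolder of images per class.

```python
# Split class subfolders of images under src into dst/train/<class> and
# dst/test/<class> at the given ratio (8:2 by default).
import random
import shutil
from pathlib import Path

def split_dataset(src: Path, dst: Path, train_ratio: float = 0.8, seed: int = 0) -> None:
    """Copy each class folder's images into train/test subsets of dst."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(class_dir.iterdir())
        rng.shuffle(images)
        cut = int(len(images) * train_ratio)  # e.g. 8 of 10 images -> train
        for subset, files in (("train", images[:cut]), ("test", images[cut:])):
            out = dst / subset / class_dir.name
            out.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out / f.name)
```

Shuffling with a seeded generator keeps the split deterministic across runs, which makes training results easier to compare.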
For the detailed steps, you can also refer to the video: https://www.bilibili.com/video/BV1SS411N7Hv?t=12.9
When deploying a custom-built model to replace the default models in the MCUXpresso SDK examples, several modifications need to be made, as described in the eIQ Neutron NPU hands-on labs. Here are some common issues and error messages that you might encounter when using a new custom model with the SDK examples, and how to solve them. If there is an issue not covered here, please make a new thread to discuss it.

"Didn't find op for builtin opcode '<operator_name>'"
Add that operator to the MODEL_GetOpsResolver function found in source\model\model_name_ops_npu.cpp. A full list of the operators used by a model, which can be copied and pasted into that file, is automatically generated by the Neutron Converter Tool with the dump-header-file option. Make sure to also increase the size of the static array s_microOpResolver to match the number of operators.

"resolver size is too small"
Increase the size of the static array s_microOpResolver in the MODEL_GetOpsResolver function found in source\model\model_name_ops_npu.cpp to match the number of operators.

"Failed to resize buffer"
The scratch memory buffer is too small for the model and needs to be increased. The size of the memory buffer is set with the kTensorArenaSize variable found in the model data header file.

"Incompatible Neutron NPU microcode and driver versions!"
Ensure the version of the eIQ Neutron Converter Tool used to convert the model is compatible with the NPU libraries used by the SDK project. eIQ Toolkit v1.10.0 should be used with the MCUXpresso SDK for MCX N 2.14.0; the corresponding Neutron Converter Tool version is 1.2.0+0x84d37e1f.

Camera colors are incorrect on the FRDM-MCXN947 board
Modify solder jumpers SJ16, SJ26, and SJ27 on the back of the board, moving them to the left (dashed-line side) to connect the camera signals properly. This modification will disable Ethernet functionality on the board due to a signal conflict between EZH D0 and ENET_TXCLK.
If your project needs both camera and Ethernet functionality, then only move SJ16 and SJ26 to the left (dashed-line side), and connect a wire from P1_4 (J9 pin 8) to the left side of R58. Then, in the pin_mux.c file in the project, use PORT3_PCR0 instead of PORT1_PCR4 for EZH_Camera_D0.