How to get started with NPU and ML in MCX Microcontrollers

This guide helps you to:

Get familiar with the NPU inside MCX microcontrollers
Learn about the eIQ ML examples included in the SDK
Find related application notes and demo projects on GitHub

Prerequisites:

A Windows 10 development PC
An FRDM-MCXN947 board
NXP's MCUXpresso IDE, installed on the development PC
The FRDM-MCXN947 SDK package, generated and downloaded from the web-based MCUXpresso SDK Builder

Introduction to NPU

The MCXN94x and MCXN54x are based on dual high-performance Arm® Cortex®-M33 cores running at up to 150 MHz. They have 2 MB of on-chip flash with optional full ECC RAM and an integrated proprietary NPU. The integrated NPU delivers up to 40x faster machine learning (ML) throughput compared to a CPU core, so the device spends less time awake, which reduces overall power consumption. The NPU architecture provides power- and performance-optimized NPUs integrated across NXP's wide portfolio of microcontrollers and applications processors.

The eIQ Neutron NPUs support a wide variety of neural network types, including CNN, RNN, TCN, and Transformer networks. ML application development with the eIQ Neutron NPU is fully supported by the eIQ machine learning software development environment. The NPU used in the MCXN94x is the Neutron N1-16; its block diagram is shown in the figure below.

The eIQ Neutron N1-16 NPU inside the MCXN94x has 4 compute pipes, and each compute pipe contains 4 INT8 MAC (multiply-accumulate) blocks, for a total of 16 MAC blocks. Counting each MAC as two operations (one multiply and one add), the MCXN94x can execute up to 4.8 G INT8 operations per second (150 MHz × 4 × 4 × 2).
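The peak-throughput arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the calculation; the function and variable names are illustrative and are not part of the SDK. The result is a theoretical upper bound, not a measured figure.

```python
def peak_int8_ops_per_second(clock_hz: int, pipes: int, macs_per_pipe: int) -> int:
    """Theoretical peak, counting each MAC as two operations (multiply + add)."""
    ops_per_mac = 2
    return clock_hz * pipes * macs_per_pipe * ops_per_mac


# Values taken from the text: 150 MHz clock, 4 compute pipes, 4 INT8 MACs per pipe.
n1_16_peak = peak_int8_ops_per_second(clock_hz=150_000_000, pipes=4, macs_per_pipe=4)
print(f"{n1_16_peak / 1e9:.1f} G INT8 operations per second")
```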

The MCUXpresso Software Development Kit (MCUXpresso SDK) provides a comprehensive software package with a pre-integrated TensorFlow Lite for Microcontrollers (TFLM).

The Neutron library is integrated into TFLM as well. The following table shows the operators which are supported by the NPU.

Operator	Operator input type	MCXN947/MCXN548 NPU
ADD	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	Yes
AVERAGE_POOL_2D	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	Yes
CONV_2D	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	Yes
DEPTHWISE_CONV_2D	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	Yes
FULLY_CONNECTED	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	Yes
UNIDIRECTIONAL_SEQUENCE_LSTM	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	No
LOGISTIC (Sigmoid)	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	Yes
MAX_POOL_2D	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	Yes
MUL	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	No
SOFTMAX	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	No
SVDF	Float	No
	Uint8(PTQ)	No
	Int8(PCQ)	No

Note:

PTQ — Per-tensor quantized (asymmetric 8-bit quantization).
PCQ — Per-channel quantized (symmetric 8-bit quantization).
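To make the difference between the two schemes concrete, the following sketch computes quantization parameters both ways for a small, made-up weight tensor. It assumes the usual affine mapping (real value ≈ scale × (code − zero point)); the helper names and the sample weights are hypothetical, and this is not the code the eIQ tools use.

```python
def per_tensor_asymmetric_uint8(values):
    """PTQ-style: one (scale, zero_point) pair shared by the whole tensor."""
    lo = min(min(values), 0.0)  # widen the range so 0.0 maps to an exact integer
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for an all-zero tensor
    zero_point = round(-lo / scale)
    return scale, zero_point


def per_channel_symmetric_int8(channels):
    """PCQ-style: one scale per channel, zero point fixed at 0."""
    return [(max(abs(v) for v in ch) / 127.0) or 1.0 for ch in channels]


def quantize(value, scale, zero_point, qmin, qmax):
    """Map a real value to the nearest integer code, clamped to [qmin, qmax]."""
    return max(qmin, min(qmax, round(value / scale) + zero_point))


# Hypothetical weights: channel 0 holds small values, channel 1 large ones.
weights = [[-0.5, 0.25, 0.5], [-8.0, 2.0, 4.0]]

flat = [v for ch in weights for v in ch]
ptq_scale, ptq_zero_point = per_tensor_asymmetric_uint8(flat)
pcq_scales = per_channel_symmetric_int8(weights)

print("PTQ scale and zero point:", ptq_scale, ptq_zero_point)
print("PCQ per-channel scales:", pcq_scales)
```

With per-channel scales, the small-valued channel gets a much finer step size than it would under a single tensor-wide scale, which is the usual motivation for per-channel quantization of weights.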

For more information, refer to eIQ TensorFlow Lite User's Guide.pdf in the middleware/eiq/doc folder of the SDK.
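As a quick way to read the operator table, the sketch below restates its "Yes" rows as a lookup and splits a model's operator list into NPU-supported and unsupported operators. The function names, the "int8_pcq" label, and the example operator list are all hypothetical; this only mirrors the table and does not reflect how the SDK itself assigns operators.

```python
# Operators marked "Yes" for Int8 (PCQ) input in the table above.
# Float and Uint8 (PTQ) inputs are marked "No" for every operator.
NPU_INT8_PCQ_OPERATORS = {
    "ADD",
    "AVERAGE_POOL_2D",
    "CONV_2D",
    "DEPTHWISE_CONV_2D",
    "FULLY_CONNECTED",
    "LOGISTIC",
    "MAX_POOL_2D",
}


def is_npu_supported(operator: str, input_type: str) -> bool:
    """Return True if the table lists this operator/input type as NPU-supported."""
    return input_type == "int8_pcq" and operator in NPU_INT8_PCQ_OPERATORS


def split_by_support(operators, input_type="int8_pcq"):
    """Split an operator list into (supported, unsupported) according to the table."""
    supported = [op for op in operators if is_npu_supported(op, input_type)]
    unsupported = [op for op in operators if not is_npu_supported(op, input_type)]
    return supported, unsupported


# Hypothetical operator list for a small int8 image classifier.
model_ops = ["CONV_2D", "DEPTHWISE_CONV_2D", "AVERAGE_POOL_2D", "FULLY_CONNECTED", "SOFTMAX"]
supported, unsupported = split_by_support(model_ops)
print("NPU-supported:", supported)
print("Not supported:", unsupported)
```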

eIQ ML examples included in the SDK

In the MCUXpresso SDK Builder, select FRDM-MCXN947 and download the SDK. Remember to select the eIQ middleware.

Then open the MCUXpresso IDE. A convenient way to install the SDK is to drag and drop the SDK file into the SDK installation window of the IDE.

Import SDK examples into the workspace:

The SDK includes seven ML examples, described in the following table:

eIQ example	Description	Hardware requirements
tflm_cifar10	CIFAR10 example based on TensorFlow Lite Micro; recognizes a static image	FRDM-MCXN947; USB Type-C cable
tflm_kws	Keyword spotting example based on TensorFlow Lite Micro; recognizes a static WAV audio file	FRDM-MCXN947; USB Type-C cable
tflm_label_image	Labels images from 1000 classes, based on TensorFlow Lite Micro	FRDM-MCXN947; USB Type-C cable
mpp_camera_mobilenet_view_tflm	Labels camera images, based on TensorFlow Lite Micro	FRDM-MCXN947; LCD: MikroElektronika TFT Proto 5"; OV7670 module; USB Type-C cable
mpp_camera_ultraface_view_tflm	Face detection using the camera as the source, based on TensorFlow Lite Micro	FRDM-MCXN947; LCD: MikroElektronika TFT Proto 5"; OV7670 module; USB Type-C cable
mpp_camera_view	A simple camera preview pipeline	FRDM-MCXN947; LCD: MikroElektronika TFT Proto 5"; OV7670 module; USB Type-C cable
tflm_modelrunner	TFLite model benchmark example for microcontrollers	FRDM-MCXN947; RJ45 network cable

For additional information and guidance, kindly refer to the README file located in the doc folder that accompanies each example.

Summary of related application notes

Two application notes provide information on advanced usage of the NPU:

How to Integrate Customer ML Model to NPU on MCXN94x
Face Detection demo with NPU accelerated on MCXN947

Both notes are available on the NXP website.

Summary of demo projects on GitHub

Five demos are available from nxp-appcodehub on GitHub.
