Hello everyone,
I’m looking for guidance on deploying a model to the FRDM-MCXN947 board. I have a model already trained in TensorFlow Lite (float32). I used the eIQ environment to convert it to C source, and that worked well. However, I understand that the NPU only accepts int8 models; is that correct?
When I converted my model to TensorFlow Lite int8, I encountered an error during conversion, so it didn’t work. I tried manually adding quantization and dequantization nodes, but that also failed.
From the examples I’ve seen, when you use your dataset in eIQ to generate the model, it automatically adds quantization and dequantization nodes.
Finally, I used a Python script to convert my TensorFlow Lite model to int8 C source, and that worked, but the model’s output differs from my TensorFlow Lite int8 tests in Python.
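For reference, the Python-side int8 test I am comparing against is just a plain tf.lite.Interpreter run, roughly like the sketch below (the model path, input shape, and random sample are placeholders for my real files and data):

```python
import numpy as np
import tensorflow as tf

# Rough sketch of the Python-side int8 check; the model path and the random
# test sample are placeholders.
interpreter = tf.lite.Interpreter(model_path="model_int8.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

sample = np.random.rand(*inp["shape"]).astype(np.float32)
if inp["dtype"] == np.float32:
    # The model still has float inputs (a Quantize node sits inside the graph).
    interpreter.set_tensor(inp["index"], sample)
else:
    # Fully int8 inputs: quantize manually with the stored scale and zero point.
    scale, zero_point = inp["quantization"]
    quantized = np.round(sample / scale + zero_point).astype(inp["dtype"])
    interpreter.set_tensor(inp["index"], quantized)

interpreter.invoke()
print(interpreter.get_tensor(out["index"]))
```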
I’d like to know: is there a way to convert a model to int8 using eIQ, or to add quantization/dequantization nodes?
Thank you very much.
Hello @Abdu_,
To answer your first question: the NPU currently supports only INT8 models. For more details, I recommend checking out this community post which provides further clarification.
Regarding your second question: the eIQ Toolkit includes a model conversion feature, as illustrated in the image below. This allows you to convert models into formats compatible with the supported hardware:
For a deeper understanding, please refer to Chapter 4.2: "Model Conversion" in the eIQ Toolkit User Guide, available on this page.
Additionally, you may find helpful resources on the official Google documentation, especially if you're working with TensorFlow Lite or other Google-supported frameworks.
BR
Habib
Thank you, Habib, for your answer.
The problem is that eIQ is unable to convert a TensorFlow Lite (float32) model into an INT8 C file. I therefore provided an INT8 TensorFlow Lite model and expected the corresponding INT8 C file, but I encountered the error shown on the screen.
I would also like to know how to insert quantize and dequantize nodes into a float32 model. I’ve reviewed all the documentation but couldn’t find any guidance on this.
Thank you very much.
Hello @Abdu_,
a) I am able to convert a float32 model to an INT8 .tflite model in eIQ Toolkit version 1.15.1.104. To assist you more effectively, could you please share the specific steps you followed that led to the error you encountered?
b) The most relevant documentation I found regarding Quantization is in Chapter 3.10 of the eIQ Toolkit User Guide:
"You can quantize a trained model to reduce its size and speed up the inference time on
different hardware accelerators (for example, GPU and NPU) with a minimal accuracy
loss. You can choose between the per channel and per tensor quantizations. The per
tensor quantization means that all the values within a tensor are scaled in the same way.
The per channel quantization means that tensor values are separately scaled for each
channel (for example, the convolution filter is scaled separately for each filter).""
I highly recommend reviewing this chapter to better understand how to implement quantization effectively.
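Outside of the eIQ GUI, the same kind of full-integer quantization can also be done directly with the TensorFlow Lite converter in Python. The following is only a generic sketch; the SavedModel path, input shape, and calibration data are placeholders you would replace with your own:

```python
import numpy as np
import tensorflow as tf

# Placeholder calibration set: replace with a few hundred real, preprocessed samples.
calibration_samples = [np.random.rand(1, 96, 96, 3).astype(np.float32) for _ in range(100)]

def representative_data_gen():
    for sample in calibration_samples:
        yield [sample]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# Restrict the graph to int8 kernels; convolution weights are quantized per channel by default.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Use int8 inputs/outputs so the model is integer-only, as the NPU requires.
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_int8 = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_int8)
```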
BR
Habib
Hello Habib,
Thank you for your response.
a) Here are the steps I followed:
I developed my model in Google Colab, then converted it to TensorFlow Lite (the conversion itself is the standard TensorFlow Lite export; see the sketch at the end of this post). I imported the .tflite file into eIQ and selected the Model Tool. I opened the model there, then clicked on Convert: TensorFlow Lite for Neutron. I selected my board (MCXN-947) and enabled Dump header file to generate a header file for use with the NPU.
However, when I clicked Convert, I encountered the error I showed you earlier. Attached is a screenshot of my model.
b) I also read Chapter 3.10 of the eIQ Toolkit User Guide. From what I understand, it mainly discusses models developed directly on the platform using imported datasets. It seems that when you bring a pre-trained TensorFlow Lite model from outside, the only available option is to convert it — you can't customize it like you can with models trained from raw data on the platform. That's the limitation I noticed.
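As mentioned in a), the Colab-side export is just the standard TensorFlow Lite conversion. Below is a sketch of that step, with a small dummy network standing in for the trained model:

```python
import tensorflow as tf

# Dummy network standing in for the model trained in Colab.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(10)])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_float32 = converter.convert()  # float32 .tflite, no quantization yet
with open("model_float32.tflite", "wb") as f:
    f.write(tflite_float32)
```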
Thank you so much.
Hello @Abdu_,
I followed the steps outlined in Chapter 3, titled "Label Image Example", from the guide Lab eIQ Neutron NPU for MCX N Lab Guide - Part 1 - Mobilenet - MCUXpresso SDK Builder. As a result, I was able to successfully export a dump header model using eIQ, even with a model that was trained outside of the eIQ environment.
Please review the steps and let me know if they were helpful. I also strongly recommend downloading the latest version of eIQ to ensure compatibility and avoid potential issues.
BR
Habib
I followed the example and successfully converted the TensorFlow Lite model (float32) to a header file (float32). However, since the NPU only supports INT8 models, I converted the TensorFlow Lite model to INT8 and encountered an error.
I noticed that when you load the trained TensorFlow Lite (float32) model into the environment, it doesn’t automatically insert the quantization and dequantization nodes. In contrast, when you train the model within the environment, those nodes are added, as shown in the example.
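From what I can tell with the plain TensorFlow Lite converter (outside eIQ), those boundary quantize/dequantize nodes appear when the input and output types are left as float32 after full-integer quantization, and they disappear when both are forced to int8. A rough sketch, with a placeholder model path and random calibration data:

```python
import numpy as np
import tensorflow as tf

# Placeholder calibration data; replace with real preprocessed samples.
calibration = [np.random.rand(1, 96, 96, 3).astype(np.float32) for _ in range(100)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")  # placeholder
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = lambda: ([s] for s in calibration)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]

# With the default float32 input/output types, the converter inserts Quantize and
# Dequantize nodes at the model boundaries. Setting inference_input_type and
# inference_output_type to tf.int8 removes them and yields the purely integer
# model expected for the NPU.
tflite_with_boundary_nodes = converter.convert()
```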
Thank you so much.
Hello @Abdu_,
I successfully exported a TensorFlow INT8 model as a header file. This model was trained outside the eIQ environment and does not include quantization or dequantization nodes.
Please use these params in the custom options:
dump-header-file-output; dump-header-file-input
Additionally, this page has more information about eIQ that could be helpful.
BR
Habib