Hi NXP support team,
I tried running a MobileNet-v1 model trained with TensorFlow 1.15 and quantized with the eIQ Toolkit. However, when I run it on the NPU, I get a fallback warning and part of the graph is executed on the CPU; the model still recognizes objects, but I want it to run entirely on the NPU. I also tried exporting tflite_graph.pb to a saved_model and quantizing it in a TensorFlow 2.16.2 environment, but it still does not work. How can I run the model completely on the NPU without falling back to the CPU?
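For reference, this is roughly the full-integer post-training quantization flow I attempted (a minimal sketch; the saved_model path, output filename, and calibration data below are placeholders, not my exact script):

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Placeholder calibration data; real preprocessed input images
    # should be yielded here so the quantization ranges are meaningful
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict the converter to int8 builtin ops so conversion fails loudly
# instead of silently leaving float ops that will fall back to the CPU
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("mobilenet_v1_int8.tflite", "wb") as f:
    f.write(tflite_model)
```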
Hello @The_Nguyen
I hope you are doing very well.
Please refer to chapter 2.7.1, "Running the example on the i.MX 8 platform hardware accelerator", of the i.MX Machine Learning User's Guide.
It describes how you can run MobileNet with GPU/NPU hardware acceleration.
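In short, on the i.MX 8M Plus the NPU is reached through the TFLite VX delegate, which on NXP BSP images is typically installed as /usr/lib/libvx_delegate.so (the exact path may differ on your image). A minimal Python sketch, assuming that path and an int8 model file named mobilenet_v1_int8.tflite:

```python
import numpy as np
import tflite_runtime.interpreter as tflite

# Load the VX delegate so supported ops run on the GPU/NPU;
# the library path follows the NXP BSP layout and may differ on your image
delegate = tflite.load_delegate("/usr/lib/libvx_delegate.so")
interpreter = tflite.Interpreter(
    model_path="mobilenet_v1_int8.tflite",
    experimental_delegates=[delegate],
)
interpreter.allocate_tensors()

# Feed one dummy int8 input and run a single inference
input_details = interpreter.get_input_details()[0]
dummy = np.zeros(input_details["shape"], dtype=input_details["dtype"])
interpreter.set_tensor(input_details["index"], dummy)
interpreter.invoke()
```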
Best regards,
Salas.
Hello Salas,
I followed the instructions in section 2.7.1 and was able to run the example model. However, my MobileNet-v1 model, which I quantized with the eIQ Toolkit, could not run completely on the NPU. So is the reason that the model has not been fully quantized to int8, or is my configuration incorrect?
I also read in the document that on the i.MX 8M Plus a model must be fully quantized to int8 to run completely on the NPU without falling back to the CPU. Is that correct?
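To narrow this down, I am checking whether any float tensors remain in my converted model (a short sketch assuming the tf.lite Python API; the model filename is a placeholder):

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mobilenet_v1_int8.tflite")
# Any float32 tensor left in the graph points at an op that was not
# quantized and would be executed on the CPU instead of the NPU
float_tensors = [t["name"] for t in interpreter.get_tensor_details()
                 if t["dtype"] == np.float32]
print("Float tensors remaining:", float_tensors or "none")
```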
Thanks and Regards,
TheNguyen