int8 Quantization for TFLite float32 model



1,395 views
Contributor III


I have tried to quantize my TFLite model from float32 to int8, but when I try to benchmark the int8 model on the GPU/NPU I get the following error:

Duplicate flags: num_threads
Min num runs: [50]
Min runs duration (seconds): [1]
Max runs duration (seconds): [150]
Inter-run delay (seconds): [-1]
Num threads: [1]
Use caching: [0]
Benchmark name: []
Output prefix: []
Min warmup runs: [1]
Min warmup runs duration (seconds): [0.5]
Graph: [FishDetectModel_1k_int8.tflite]
Input layers: []
Input shapes: []
Input value ranges: []
Input layer values files: []
Allow fp16 : [0]
Require full delegation : [0]
Enable op profiling: [0]
Max profiling buffer entries: [1024]
CSV File to export profiling data to: []
Enable platform-wide tracing: [0]
#threads used for CPU inference: [1]
Max number of delegated partitions : [0]
Min nodes per partition : [0]
Loaded model FishDetectModel_1k_int8.tflite
ERROR: Didn't find op for builtin opcode 'CONV_2D' version '5'

ERROR: Registration failed.

Failed to initialize the interpreter
Benchmarking failed.


What could be the problem?
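For context, full-integer post-training quantization is typically done along these lines. This is a minimal sketch using a tiny placeholder model and random calibration data (not the original FishDetectModel or its real input pipeline); `representative_dataset` should yield real calibration samples in practice.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model; in practice load your own saved model or Keras model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8, 8, 1)),
    tf.keras.layers.Conv2D(4, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2),
])

def representative_dataset():
    # Calibration samples used to estimate activation ranges.
    # Random data here; use real preprocessed inputs for a real model.
    for _ in range(100):
        yield [np.random.rand(1, 8, 8, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full int8: weights, activations, inputs and outputs all quantized.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_int8 = converter.convert()
# tflite_int8 is the serialized flatbuffer; write it to disk with open(..., "wb").

# Sanity check: the quantized model loads and its input tensor is int8.
interp = tf.lite.Interpreter(model_content=tflite_int8)
interp.allocate_tensors()
print(interp.get_input_details()[0]["dtype"])
```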


0 Kudos
2 Replies

1,382 views
NXP TechSupport

Hello Bruno,

The saved model is incomplete. Can you share a saved-model directory that we can load directly? Also, it looks like your TensorFlow runtime does not support CONV_2D version '5'. You need to use a recent TF runtime on the i.MX8, or at least the same TF version that was used for the conversion.




0 Kudos

1,378 views
Contributor III



Yes, I can attach my pb model and my tflite files. The float32 and float16 models run fine; I only get the error with the int8 one.


Thanks in advance.


PS: I used TensorFlow version 2.4, which I think is the most recent one. Hope this helps.