Hello,
I quantized my TF Lite model from float32 to int8, but when I try to benchmark the int8 model on the GPU/NPU I get the following error:
STARTING!
Duplicate flags: num_threads
Min num runs: [50]
Min runs duration (seconds): [1]
Max runs duration (seconds): [150]
Inter-run delay (seconds): [-1]
Num threads: [1]
Use caching: [0]
Benchmark name: []
Output prefix: []
Min warmup runs: [1]
Min warmup runs duration (seconds): [0.5]
Graph: [FishDetectModel_1k_int8.tflite]
Input layers: []
Input shapes: []
Input value ranges: []
Input layer values files: []
Allow fp16 : [0]
Require full delegation : [0]
Enable op profiling: [0]
Max profiling buffer entries: [1024]
CSV File to export profiling data to: []
Enable platform-wide tracing: [0]
#threads used for CPU inference: [1]
Max number of delegated partitions : [0]
Min nodes per partition : [0]
Loaded model FishDetectModel_1k_int8.tflite
ERROR: Didn't find op for builtin opcode 'CONV_2D' version '5'
ERROR: Registration failed.
Failed to initialize the interpreter
Benchmarking failed.
What could be the problem?
Thanks
Hello Bruno,
The saved model appears to be incomplete. Can you share a SavedModel directory that we can load directly? Also, it looks like the TensorFlow Lite runtime on the board does not include CONV_2D version '5'. You need to use a recent version of the TF runtime on the i.MX8, or at least the same TF version that was used for the conversion.
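One way to avoid this op-version mismatch is to run the conversion with a TF version that is no newer than the runtime on the board. Below is a minimal sketch of full-integer post-training quantization; note that "saved_model_dir", the input shape, and the representative_data generator are placeholders you would adapt to your own model:

import numpy as np
import tensorflow as tf

def representative_data():
    # Yield a handful of samples shaped like the model input
    # (the shape here is a placeholder).
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
# Restrict to the int8 builtin ops so conversion fails early if an op
# cannot be fully quantized, instead of silently falling back to float.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("FishDetectModel_1k_int8.tflite", "wb") as f:
    f.write(converter.convert())

# Sanity check on the host: if the model loads here but not on the board,
# the board's TF Lite runtime is older than the converter that produced it.
interpreter = tf.lite.Interpreter(model_path="FishDetectModel_1k_int8.tflite")
interpreter.allocate_tensors()
print("Model loads with TF", tf.__version__)

If the converted model loads on the host but still fails on the i.MX8, that confirms the runtime on the board needs to be updated.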
Regards