I am trying to run a tflite model exported from YOLOv10 on the NPU of the i.MX8M Plus. The NPU appears to be incompatible with my quantized model because it contains operations that use int64-typed data. I receive a number of warnings like the following when I load the model:
WARNING: Fallback unsupported op 48 to TfLite
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
I can successfully load the mobilenet_v1_1.0_224_quant.tflite model which is part of the label_image example, which suggests the vx_delegate setup code is correct.
Is there a way to convert the YOLO model to an NPU-compatible model? I have attempted the conversion with eIQ instead of onnx2tf without success; I only get errors when converting in eIQ.
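To narrow down which parts of the graph the delegate rejects, a small inspection script can list every int64 tensor in the model. This is a sketch using the TFLite Python interpreter; the model filename is an example, substitute your own.

```python
# Sketch: list every int64 tensor in a .tflite model, to locate the ops
# the vx_delegate cannot offload. The model filename below is an example.
import numpy as np
import tensorflow as tf

def int64_tensors(model_path):
    """Return (index, name, shape) for each int64 tensor in the model."""
    interp = tf.lite.Interpreter(model_path=model_path)
    return [(d["index"], d["name"], tuple(d["shape"]))
            for d in interp.get_tensor_details()
            if d["dtype"] == np.int64]

# Usage (example filename):
# for idx, name, shape in int64_tensors("yolov10n_full_integer_quant.tflite"):
#     print(idx, name, shape)
```

The tensor names usually make it clear which post-processing ops produce the int64 data.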
Thank you
Hi @AB22
Can you share the onnx model exported from YOLOv10 project?
Best Regards
Zhiming
Hi @Zhiming_Liu
When I try to attach the onnx file to this reply I get the following:
The file type (.onnx) is not supported. Valid file types are: 7z, avi, brd, bz2, c, diff, doc, docx, dts, dtsi, gif, gz, h, jpg, log, m4v, mkv, mp3, mp4, patch, pdf, png, ppt, pptx, py, rar, sh, tar.gz, tgz, trz, txt, xls, xlsx, zip, mov, mex, bin, exe, sdf.
I therefore attached it as a .zip file.
Thank you
Hello,
I tried to convert it in eIQ, but without success. It looks like the model has a compatibility issue with eIQ.
My suggestion is to export the model with uint8/int8 quantization in the YOLOv10 project, because the NPU is primarily optimized for these two data types.
Best Regards,
Zhiming
Hi @Zhiming_Liu,
The error I reported occurs with int8 quantization enabled during the export to tflite.
Hello,
Here are the steps I used.
1. YOLOv10 export
yolo export model=jameslahm/yolov10n format=onnx opset=13 simplify int8 data=coco8.yaml
2. ONNX -> TFLite
Use onnx2tf or onnx2tflite. With -ois 1,224,224,3 the input shape is fixed and calibration uses random input data; you can use your own dataset instead. I recommend you use onnx2tflite.
onnx2tf -i yolov10n.onnx -oiqt -qt per-channel -iqd uint8 -oqd uint8 -ois 1,224,224,3
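After the conversion, it is worth confirming that the model's inputs and outputs actually came out as uint8 (as requested via -iqd/-oqd). A minimal sketch, assuming the output filename from the step above:

```python
# Sketch: check that the converted model's inputs/outputs are uint8,
# as requested via the -iqd/-oqd flags. Filename is an assumption.
import tensorflow as tf

def io_dtypes(model_path):
    """Return ([input dtypes], [output dtypes]) of a .tflite model."""
    interp = tf.lite.Interpreter(model_path=model_path)
    return ([d["dtype"] for d in interp.get_input_details()],
            [d["dtype"] for d in interp.get_output_details()])

# Usage (example):
# ins, outs = io_dtypes("yolov10n_full_integer_quant.tflite")
# print(ins, outs)
```

If any input or output still reports int64 or float32, the delegate will fall back to the CPU for the affected ops.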
3. BSP test log with L6.6.23
3.1 benchmark
root@imx8mpevk:/usr/bin/tensorflow-lite-2.15.0/examples# USE_GPU_INFERENCE=0 ./benchmark_model --graph=yolov10n_full_integer_quant.tflite --external_delegate_path=/usr/lib/libvx_delegate.so
INFO: STARTING!
INFO: Log parameter values verbosely: [0]
INFO: Graph: [yolov10n_full_integer_quant.tflite]
INFO: External delegate path: [/usr/lib/libvx_delegate.so]
INFO: Loaded model yolov10n_full_integer_quant.tflite
INFO: Vx delegate: allowed_cache_mode set to 0.
INFO: Vx delegate: device num set to 0.
INFO: Vx delegate: allowed_builtin_code set to 0.
INFO: Vx delegate: error_during_init set to 0.
INFO: Vx delegate: error_during_prepare set to 0.
INFO: Vx delegate: error_during_invoke set to 0.
INFO: EXTERNAL delegate created.
WARNING: Fallback unsupported op 48 to TfLite
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 69 to TfLite
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 95 to TfLite
WARNING: Fallback unsupported op 69 to TfLite
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 95 to TfLite
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 48 to TfLite
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 95 to TfLite
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
INFO: Explicitly applied EXTERNAL delegate, and the model graph will be partially executed by the delegate w/ 4 delegate kernels.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
INFO: The input model file size (MB): 3.06509
INFO: Initialized session in 23.411ms.
INFO: Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 56: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 56: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
warning at CreateOutputsTensor, #90
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
INFO: count=1 curr=33702204
INFO: Running benchmark for at least 50 iterations and at least 1 seconds but terminate if exceeding 150 seconds.
INFO: count=50 first=112468 curr=112442 min=109822 max=113486 avg=112525 std=863
INFO: Inference timings in us: Init: 23411, First inference: 33702204, Warmup (avg): 3.37022e+07, Inference (avg): 112525
INFO: Note: as the benchmark tool itself affects memory footprint, the following is only APPROXIMATE to the actual memory footprint of the model at runtime. Take the information at your discretion.
INFO: Memory footprint delta from the start of the tool (MB): init=9.625 overall=503.359
3.2 label_image
You can see that only a small number of ops fall back to the CPU due to unsupported int64 data; this may be caused by the onnx2tf tool, but I can only use onnx2tf at the moment.
NPU inference time: 112.122ms
CPU inference time: 531.9ms
root@imx8mpevk:/usr/bin/tensorflow-lite-2.15.0/examples# USE_GPU_INFERENCE=0 ./label_image -m yolov10n_full_integer_quant.tflite -i grace_hopper.bmp -l labels.txt --external_delegate_path=/usr/lib/libvx_delegate.so
INFO: Loaded model yolov10n_full_integer_quant.tflite
INFO: resolved reporter
INFO: Vx delegate: allowed_cache_mode set to 0.
INFO: Vx delegate: device num set to 0.
INFO: Vx delegate: allowed_builtin_code set to 0.
INFO: Vx delegate: error_during_init set to 0.
INFO: Vx delegate: error_during_prepare set to 0.
INFO: Vx delegate: error_during_invoke set to 0.
INFO: EXTERNAL delegate created.
WARNING: Fallback unsupported op 48 to TfLite
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 69 to TfLite
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 95 to TfLite
WARNING: Fallback unsupported op 69 to TfLite
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 95 to TfLite
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 48 to TfLite
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
ERROR: Int64 output is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
WARNING: Fallback unsupported op 95 to TfLite
ERROR: Int64 input is not supported
ERROR: Int64 input is not supported
INFO: Applied EXTERNAL delegate.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 56: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 56: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
warning at CreateOutputsTensor, #90
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
warning at CreateOutputsTensor, #90
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
W [HandleLayoutInfer:322]Op 162: default layout inference pass.
INFO: invoked
INFO: average time: 112.122 ms
INFO: 0.00784314: 5 hammerhead
INFO: 0.00784314: 4 tiger shark
INFO: 0.00784314: 3 great white shark
INFO: 0.00784314: 2 goldfish
INFO: 0.00784314: 1 tench
CPU:
root@imx8mpevk:/usr/bin/tensorflow-lite-2.15.0/examples# ./label_image -m yolov10n_full_integer_quant.tflite -i grace_hopper.bmp -l labels.txt
INFO: Loaded model yolov10n_full_integer_quant.tflite
INFO: resolved reporter
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
ERROR: failed to get XNNPACK profile information.
INFO: invoked
INFO: average time: 531.9 ms
INFO: 0.00784314: 5 hammerhead
INFO: 0.00784314: 4 tiger shark
INFO: 0.00784314: 3 great white shark
INFO: 0.00784314: 2 goldfish
INFO: 0.00784314: 1 tench
The attached files are the YOLOv10 models used in these test cases.
Best Regards,
Zhiming
Hi @Zhiming_Liu ,
Thank you, I will try what you suggest.