Dear NXP,
I converted yolov7tiny.pt (the yolov7-tiny model) to yolov7tiny.onnx with uint8 weights, and then ran yolov7tiny.onnx on the i.MX 8M Plus NPU. The input size is 224x224, but the NPU inference time is 127 ms, which seems too slow. Is this time reasonable?
The following are my ONNX model conversion steps and my onnxruntime execution command:
1. Download yolov7-tiny.pt from https://github.com/WongKinYiu/yolov7/releases and rename it to yolov7tiny.pt.
2. Convert yolov7tiny.pt to yolov7tiny.onnx (this ONNX model still has fp32 weights), using onnx==1.10.0 and opset=15:
$ git clone https://github.com/WongKinYiu/yolov7.git
$ python export.py --weights ./yolov7tiny.pt --img-size 224
Note: I modified some code in export.py; the modified file is in the attachment. (A small sanity check I run on the exported model is sketched right after this step.)
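For completeness, this is roughly how I sanity-check the exported fp32 model before quantizing. It is only a minimal sketch (onnx.checker plus one dummy run on CPU), not one of the attached files:

import numpy as np
import onnx
import onnxruntime as ort

model_path = "yolov7tiny.onnx"

# Structural check of the exported graph
onnx.checker.check_model(onnx.load(model_path))

# Run one dummy 224x224 image through the fp32 model on CPU
sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = sess.run(None, {input_name: dummy})
print([o.shape for o in outputs])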
3. Quantize yolov7tiny.onnx; the output is called yolov7tiny_uint8.onnx.
Here I refer to https://github.com/microsoft/onnxruntime/issues/10787.
$ python quantize_yolo.py
Note: quantize_yolo.py in the attachment is the script I use to quantize the ONNX model (its general shape is sketched right after this step).
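For reference, the attached quantize_yolo.py follows the general shape below: static uint8 quantization with onnxruntime.quantization.quantize_static and a small calibration data reader, as suggested in the linked onnxruntime issue. The calibration image folder, the preprocessing, and the input name "images" are assumptions in this sketch; the attached script is the authoritative version.

# quantize_yolo.py (sketch) - static uint8 quantization of yolov7tiny.onnx.
# The calibration folder and preprocessing below are assumptions.
import glob
import cv2
import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantType, quantize_static

class YoloDataReader(CalibrationDataReader):
    def __init__(self, image_dir, input_name, size=224):
        files = glob.glob(f"{image_dir}/*.jpg")
        self.data = iter([{input_name: self._preprocess(f, size)} for f in files])

    @staticmethod
    def _preprocess(path, size):
        img = cv2.imread(path)
        img = cv2.resize(img, (size, size))
        img = img[:, :, ::-1].transpose(2, 0, 1)      # BGR->RGB, HWC->CHW
        return np.expand_dims(img, 0).astype(np.float32) / 255.0

    def get_next(self):
        return next(self.data, None)

quantize_static(
    "yolov7tiny.onnx",
    "yolov7tiny_uint8.onnx",
    YoloDataReader("./calib_images", "images"),       # "images" is yolov7's input name
    activation_type=QuantType.QUInt8,
    weight_type=QuantType.QUInt8,
)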
4. Run yolov7tiny_uint8.onnx on the NPU with onnxruntime_perf_test:
$ /usr/bin/onnxruntime-1.10.0/onnxruntime_perf_test ./yolov7tiny_uint8.onnx -r 1 -e nnapi
The result shows the inference time of about 127 ms mentioned above. I have put my relevant files in the attachment, and a Python timing sketch I also use on the board follows below.
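To cross-check the perf_test number, I also time the model from Python on the board with something like the sketch below. The provider name "NnapiExecutionProvider" is only an assumption chosen to match the -e nnapi flag; the provider actually exposed by the board's onnxruntime build should be checked with onnxruntime.get_available_providers().

# time_npu.py (sketch) - time yolov7tiny_uint8.onnx from Python on the board.
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "yolov7tiny_uint8.onnx",
    providers=["NnapiExecutionProvider", "CPUExecutionProvider"],  # assumed names
)
input_name = sess.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

sess.run(None, {input_name: dummy})        # warm-up run (graph compilation / first-run cost)

runs = 20
t0 = time.time()
for _ in range(runs):
    sess.run(None, {input_name: dummy})
print(f"average inference time: {(time.time() - t0) / runs * 1000:.1f} ms")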
Any help is much appreciated.
Hi @hy982530
Hi, @Sanket_Parekh
The version I am working with is 5.15.71.
Could you provide quantized .onnx or .tflite files of yolov3, yolov5, or yolov7 for my reference?
Or could you help me check whether my quantized ONNX file is correct? (A sketch of the comparison check I have in mind follows below.)
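For context, the correctness check I have in mind is a simple comparison of the fp32 and quantized model outputs on the same input, along the lines of this sketch (the metric and any threshold are arbitrary choices on my side):

# compare_models.py (sketch) - rough check that yolov7tiny_uint8.onnx still
# tracks the fp32 model on the same input.
import numpy as np
import onnxruntime as ort

dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

def run(path):
    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    name = sess.get_inputs()[0].name
    return sess.run(None, {name: dummy})[0]

fp32_out = run("yolov7tiny.onnx")
uint8_out = run("yolov7tiny_uint8.onnx")

# Cosine similarity between flattened outputs as a coarse similarity measure
cos = np.dot(fp32_out.ravel(), uint8_out.ravel()) / (
    np.linalg.norm(fp32_out.ravel()) * np.linalg.norm(uint8_out.ravel())
)
print(f"cosine similarity: {cos:.4f}")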
Thank you,
Best regards