yolov7 onnx model too slow on NPU


yolov7 onnx model too slow on NPU

hy982530
Contributor I

Dear NXP,

I converted yolov7tiny.pt (the yolov7-tiny model) to yolov7tiny.onnx with uint8 weights and then ran yolov7tiny.onnx on the i.MX 8M Plus NPU. The input size is 224x224, but the NPU inference time is 127 ms, which seems too slow. Is this time reasonable?


The following are my ONNX model conversion steps and the onnxruntime command I use to run the model:

1. Download yolov7-tiny.pt from https://github.com/WongKinYiu/yolov7/releases and rename it to yolov7tiny.pt.

2. Convert yolov7tiny.pt to yolov7tiny.onnx (this ONNX model has fp32 weights)
    (onnx==1.10.0 and opset=15)

$ git clone https://github.com/WongKinYiu/yolov7.git
$ python export.py --weights ./yolov7tiny.pt --img-size 224
Note: I modified some code in export.py; the modified file is in the attachment.
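
As a quick sanity check after the export (a minimal sketch, assuming the exported file is yolov7tiny.onnx in the working directory and that the exporter named the input "images"), I confirm the model loads and has a 1x3x224x224 input:

# Sanity check on the exported fp32 model (run on the host).
import onnx

model = onnx.load("yolov7tiny.onnx")
onnx.checker.check_model(model)                       # structural validation
for inp in model.graph.input:
    dims = [d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)                             # expect e.g. 'images' [1, 3, 224, 224]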

3. Quantize yolov7tiny.onnx; the output is called yolov7tiny_uint8.onnx.
Here I refer to https://github.com/microsoft/onnxruntime/issues/10787.

$ python quantize_yolo.py
Note: quantize_yolo.py in the attachment is the script I use to quantize the ONNX model.
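
For context, this is a minimal sketch of one way to do static uint8 quantization with onnxruntime; the attached quantize_yolo.py is the script actually used and may differ. The calibration folder ./calib_images and the input name "images" here are assumptions.

# Static uint8 quantization with onnxruntime (sketch only; see quantize_yolo.py
# in the attachment for the script actually used).
import glob
import numpy as np
from PIL import Image
from onnxruntime.quantization import (CalibrationDataReader, QuantFormat,
                                      QuantType, quantize_static)

class YoloCalibrationReader(CalibrationDataReader):
    def __init__(self, image_dir, input_name="images", size=224):
        files = glob.glob(f"{image_dir}/*.jpg")
        self.data = iter([{input_name: self._preprocess(f, size)} for f in files])

    @staticmethod
    def _preprocess(path, size):
        img = Image.open(path).convert("RGB").resize((size, size))
        x = np.asarray(img, dtype=np.float32) / 255.0    # HWC, 0..1
        return x.transpose(2, 0, 1)[np.newaxis, ...]     # NCHW

    def get_next(self):
        return next(self.data, None)

quantize_static(
    "yolov7tiny.onnx",
    "yolov7tiny_uint8.onnx",
    YoloCalibrationReader("./calib_images"),
    quant_format=QuantFormat.QOperator,    # QLinearConv-style quantized ops
    activation_type=QuantType.QUInt8,
    weight_type=QuantType.QUInt8,
)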

4. Run yolov7tiny_uint8.onnx on the NPU with onnxruntime_perf_test:
$ /usr/bin/onnxruntime-1.10.0/onnxruntime_perf_test ./yolov7tiny_uint8.onnx -r 1 -e nnapi

The result:

[screenshot of the onnxruntime_perf_test output: hy982530_0-1678957846873.png]

I put my relevant files in the attachment.
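
In case it helps narrow this down, here is a sketch of how I could also time inference from Python with a separate warm-up run, since the first run on the NPU is expected to include graph compilation. The provider name "NnapiExecutionProvider" and the float32 input are assumptions; please check onnxruntime.get_available_providers() and the actual input type of the quantized model on the board.

# Time repeated inferences after a warm-up run (sketch; adjust the provider name
# and the input dtype to match the model and the onnxruntime build on the board).
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "yolov7tiny_uint8.onnx",
    providers=["NnapiExecutionProvider", "CPUExecutionProvider"],
)
inp = sess.get_inputs()[0]
x = np.random.rand(1, 3, 224, 224).astype(np.float32)

sess.run(None, {inp.name: x})            # warm-up; the first NPU run compiles the graph

t0 = time.time()
runs = 10
for _ in range(runs):
    sess.run(None, {inp.name: x})
print("average inference time: %.1f ms" % ((time.time() - t0) / runs * 1000))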
Any help is much appreciated.


Sanket_Parekh
NXP TechSupport

Hi @hy982530 

I hope you are doing well.
 
Please make sure that you are using the latest BSP, LF5.15.71_2.2.0, as it has some fixes and improvements related to YOLOv4-tiny and the ONNX Runtime.
 
Please refer to Chapter 5 (ONNX Runtime), in particular Section 5.2.3 (ONNX performance test), in the i.MX Machine Learning User's Guide.
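
As a quick check (a small sketch, not taken from the User's Guide), you can confirm which ONNX Runtime build and execution providers the BSP image on your board ships:

# Run on the board to list the installed ONNX Runtime version and the
# execution providers it was built with.
import onnxruntime as ort

print(ort.__version__)
print(ort.get_available_providers())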
 
Thanks & Regards,
Sanket Parekh

hy982530
Contributor I

Hi, @Sanket_Parekh 

The version I'm working on is 5.15.71.

Could you provide quantized .onnx or .tflite files for yolov3, yolov5, or yolov7 for my reference?

Or could you help me check whether my quantized ONNX file is correct?
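
In the meantime, this is the rough check I run on my side (a sketch, nothing official): listing the operator types in the quantized graph to see whether the Conv layers were actually converted to quantized ops, since I assume any nodes left in fp32 would fall back to the CPU.

# List the operator types in the quantized model as a quick sanity check.
import collections
import onnx

model = onnx.load("yolov7tiny_uint8.onnx")
ops = collections.Counter(node.op_type for node in model.graph.node)
for op, count in sorted(ops.items()):
    print(f"{op}: {count}")
# Mostly QLinearConv / quantized ops is expected; many plain Conv nodes would
# suggest parts of the graph were not quantized.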

Thank you, 

Best regards 

 
