i.MX 8M Plus: onnxruntime_perf_test is slower on the NPU than on the CPU
Hi all,
I have the 8MPLUSLPD4-EVK Evaluation Kit and I am running onnxruntime_perf_test according to the "i.MX Machine Learning User's Guide, Rev. LF5.10.72_2.2.0, 17", but onnxruntime_perf_test is slower on the NPU than on the CPU.
The i.MX Yocto Project BSP (hardknott-5.10.72-2.2.0) is running on the EVK.
Running on the NPU:
/usr/bin/onnxruntime-1.8.2/onnxruntime_perf_test /usr/bin/onnxruntime-1.8.2/squeezenet/model.onnx -r 1 -e vsi_npu
Session creation time cost: 0.126173 s
Total time cost (including warm-up): 1.1651 s
Total inference requests: 2
Warm-up inference time cost: 744.977 ms
Average inference time cost (excluding warm-up): 420.121 ms
Total inference run time: 0.420148 s
Avg CPU usage: 0 %
Peak working set size: 81121280 bytes
Running on the CPU:
/usr/bin/onnxruntime-1.8.2/onnxruntime_perf_test /usr/bin/onnxruntime-1.8.2/squeezenet/model.onnx -r 1 -e cpu
Session creation time cost: 0.0570905 s
Total time cost (including warm-up): 0.11501 s
Total inference requests: 2
Warm-up inference time cost: 58.0624 ms
Average inference time cost (excluding warm-up): 56.9481 ms
Total inference run time: 0.0569692 s
Avg CPU usage: 91 %
Peak working set size: 46661632 bytes
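With -r 1 only two inferences run in total (warm-up plus one timed run), so the averages above are each based on a single measurement. For a fairer comparison I plan to rerun with more repetitions; a sketch of what I intend to run (the -m times, -r, and -e options are the onnxruntime_perf_test flags already used above; the echo makes this a dry run so the commands can be reviewed before executing on the board):

```shell
# Paths taken from the runs above on the EVK.
PERF=/usr/bin/onnxruntime-1.8.2/onnxruntime_perf_test
MODEL=/usr/bin/onnxruntime-1.8.2/squeezenet/model.onnx

# Benchmark both execution providers with 100 repetitions so a single
# slow or fast run does not dominate the average.
for ep in vsi_npu cpu; do
  echo "== $ep =="
  # Dry-run print; remove the echo to actually execute on the board.
  echo "$PERF $MODEL -m times -r 100 -e $ep"
done
```

Dropping the echo runs the real benchmark; the warm-up inference is still reported separately, so the large first-run cost on the NPU would no longer skew the average.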
Is this result expected? I assumed the NPU would be faster than the CPU for this model.
I have attached the log from running onnxruntime_perf_test with the -v option.