Hey NXP,
I am trying to deploy a YOLOv5m model on the i.MX 95 NPU. My kernel version is:
Linux version 6.12.20-lts-next-gdfaf2136deb2 (oe-user@oe-host) (aarch64-poky-linux-gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.44) #1 SMP PREEMPT Wed Jun 4 10:15:09 UTC 2025
I used eIQ v1.16 to convert the model, but the inference results on the NPU are completely different from those on the GPU. The inference pipeline, which is adapted from label_image.py, is the same in both cases.
GPU inference:
root@imx95-19x19-verdin:/usr/bin/tensorflow-lite-2.18.0/examples# python3 dms_demo.py -e /usr/lib/libneutron_delegate.so
Loading external delegate from /usr/lib/libneutron_delegate.so with args: {}
INFO: NeutronDelegate delegate: 0 nodes delegated out of 335 nodes with 0 partitions.
Error in cpuinfo: prctl(PR_SVE_GET_VL) failed
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Loaded 3 labels from dms.txt
Warm-up time: 641.2 ms
Processing: /usr/bin/tensorflow-lite-2.18.0/examples/1755631829_1.jpg
Inference time: 626.7 ms
[14 5]
Detections after NMS: 1
NPU inference:
root@imx95-19x19-verdin:/usr/bin/tensorflow-lite-2.18.0/examples# python3 dms_demo.py -e /usr/lib/libneutron_delegate.so
Loading external delegate from /usr/lib/libneutron_delegate.so with args: {}
INFO: NeutronDelegate delegate: 1 nodes delegated out of 61 nodes with 1 partitions.
INFO: Neutron delegate version: v1.0.0-a5d640e6, zerocp enabled.
Error in cpuinfo: prctl(PR_SVE_GET_VL) failed
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Loaded 3 labels from dms.txt
Warm-up time: 35.4 ms
Processing: /usr/bin/tensorflow-lite-2.18.0/examples/1755631829_1.jpg
Inference time: 34.8 ms
Detections after NMS: 1
The inference results (left: GPU, right: NPU):

Could you explain why the NPU generates more boxes?
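
To help narrow this down, the raw model outputs of the two runs could be compared before NMS. Below is a minimal sketch of such a check, not the actual code in dms_demo.py. It assumes the converted model has a quantized output tensor that must be mapped back to floats with the scale and zero point reported by the interpreter. The helper names `run_model`, `dequantize`, and `max_abs_diff` are hypothetical, and the model paths and input array are placeholders to be filled in.

```python
def dequantize(values, scale, zero_point):
    """Map quantized integers back to floats: real = scale * (q - zero_point)."""
    if scale == 0:
        # TFLite reports scale 0.0 for tensors that are not quantized (float).
        return [float(v) for v in values]
    return [scale * (int(v) - zero_point) for v in values]


def max_abs_diff(a, b):
    """Largest element-wise gap between two equally sized output vectors."""
    if len(a) != len(b):
        raise ValueError("outputs differ in length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def run_model(model_path, input_array, delegate_path=None):
    """Run one inference and return the first output as a flat list of floats.

    Assumption: the model has a single input and the first output is the
    one of interest. input_array must already match the input dtype/shape.
    """
    # Imported lazily so the helpers above can be used without the runtime.
    import tflite_runtime.interpreter as tflite

    delegates = [tflite.load_delegate(delegate_path)] if delegate_path else None
    interpreter = tflite.Interpreter(model_path=model_path,
                                     experimental_delegates=delegates)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    interpreter.set_tensor(inp["index"], input_array)
    interpreter.invoke()

    scale, zero_point = out["quantization"]
    raw = interpreter.get_tensor(out["index"]).flatten().tolist()
    return dequantize(raw, scale, zero_point)
```

Calling `run_model` once without a delegate and once with `/usr/lib/libneutron_delegate.so`, then passing both results to `max_abs_diff`, would show whether the outputs already diverge before post-processing or whether the extra boxes come from decoding and NMS.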