Yolov5m inference results are different between GPU and NPU on IMX95

Tongshen
Contributor II

Hey NXP,

I am trying to deploy a YOLOv5m model onto the i.MX95 NPU. My kernel version is:
Linux version 6.12.20-lts-next-gdfaf2136deb2 (oe-user@oe-host) (aarch64-poky-linux-gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.44) #1 SMP PREEMPT Wed Jun 4 10:15:09 UTC 2025

I used eIQ Toolkit v1.16 to convert the model, but the NPU inference result is completely different from the GPU result. The inference pipeline is identical in both runs; it is modified from label_image.py.
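
The pipeline essentially loads the Neutron delegate as an external delegate and runs a single TFLite inference. A minimal sketch of that part is below; the model file name, the 640x640 input size, and the int8 input handling are illustrative assumptions, not copied from dms_demo.py:

# Minimal sketch of a label_image.py-style pipeline with the Neutron external delegate.
# Model path, input size and quantization handling are illustrative assumptions.
import numpy as np
import tflite_runtime.interpreter as tflite
from PIL import Image

delegate = tflite.load_delegate('/usr/lib/libneutron_delegate.so')
interpreter = tflite.Interpreter(model_path='yolov5m_int8.tflite',   # hypothetical file name
                                 experimental_delegates=[delegate])
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

img = Image.open('1755631829_1.jpg').resize((640, 640))              # assuming a 640x640 model input
data = np.expand_dims(np.asarray(img), 0)
if inp['dtype'] == np.int8:                                          # quantize normalized pixels for int8 models
    scale, zero_point = inp['quantization']
    data = (data / 255.0 / scale + zero_point).astype(np.int8)
else:
    data = (data / 255.0).astype(np.float32)

interpreter.set_tensor(inp['index'], data)
interpreter.invoke()
pred = interpreter.get_tensor(out['index'])                          # raw YOLO head output, before NMS
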
GPU inference:

root@imx95-19x19-verdin:/usr/bin/tensorflow-lite-2.18.0/examples# python3 dms_demo.py -e /usr/lib/libneutron_delegate.so
Loading external delegate from /usr/lib/libneutron_delegate.so with args: {}
INFO: NeutronDelegate delegate: 0 nodes delegated out of 335 nodes with 0 partitions.

Error in cpuinfo: prctl(PR_SVE_GET_VL) failed
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Loaded 3 labels from dms.txt
Warm-up time: 641.2 ms


Processing: /usr/bin/tensorflow-lite-2.18.0/examples/1755631829_1.jpg
Inference time: 626.7 ms
[14 5]
Detections after NMS: 1

NPU inference:

root@imx95-19x19-verdin:/usr/bin/tensorflow-lite-2.18.0/examples# python3 dms_demo.py -e /usr/lib/libneutron_delegate.so
Loading external delegate from /usr/lib/libneutron_delegate.so with args: {}
INFO: NeutronDelegate delegate: 1 nodes delegated out of 61 nodes with 1 partitions.

INFO: Neutron delegate version: v1.0.0-a5d640e6, zerocp enabled.
Error in cpuinfo: prctl(PR_SVE_GET_VL) failed
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Loaded 3 labels from dms.txt
Warm-up time: 35.4 ms


Processing: /usr/bin/tensorflow-lite-2.18.0/examples/1755631829_1.jpg
Inference time: 34.8 ms
Detections after NMS: 1

The inference result (right: NPU, left: GPU):

IMG_6090.jpg



Could you explain why the NPU generates more boxes?
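
For reference, the stage that decides how many boxes are drawn is a standard YOLOv5-style confidence filter followed by greedy NMS. A rough sketch is below; the output layout (1, N, 5 + num_classes) and the 0.25/0.45 thresholds are assumptions, not values taken from dms_demo.py. Scores sitting close to the confidence threshold can cross it when the model output shifts slightly, which changes the surviving box count.

# Rough sketch of YOLOv5-style post-processing; layout and thresholds are assumptions.
import numpy as np

def box_iou(box, boxes):
    # boxes are (cx, cy, w, h); convert to corner coordinates first
    def corners(b):
        return np.stack([b[..., 0] - b[..., 2] / 2, b[..., 1] - b[..., 3] / 2,
                         b[..., 0] + b[..., 2] / 2, b[..., 1] + b[..., 3] / 2], axis=-1)
    a, bs = corners(box), corners(boxes)
    x1, y1 = np.maximum(a[0], bs[:, 0]), np.maximum(a[1], bs[:, 1])
    x2, y2 = np.minimum(a[2], bs[:, 2]), np.minimum(a[3], bs[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (bs[:, 2] - bs[:, 0]) * (bs[:, 3] - bs[:, 1]) - inter)
    return inter / (union + 1e-9)

def postprocess(pred, conf_thres=0.25, iou_thres=0.45):
    pred = pred[0]                                    # (N, 5 + num_classes)
    scores = pred[:, 4] * pred[:, 5:].max(axis=1)     # objectness * best class score
    keep = scores > conf_thres                        # scores near conf_thres decide the box count
    boxes, scores = pred[keep, :4], scores[keep]
    order = scores.argsort()[::-1]                    # greedy, class-agnostic NMS
    final = []
    while order.size:
        i = order[0]
        final.append(i)
        order = order[1:][box_iou(boxes[i], boxes[order[1:]]) < iou_thres]
    return boxes[final], scores[final]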



Zhiming_Liu
NXP TechSupport

Hi,

I did a test on L6.12.34; the model was also converted with the neutron-converter for L6.12.34. Please consider updating your BSP to L6.12.34. I will share the model with you. Run neutron-converter.exe from C:\NXP\eIQ_Toolkit_v1.17.0\bin\neutron-converter\MCU_SDK_25.09.00+Linux_6.12.34_2.1.0/
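
As a rough example of the invocation (the flag names and file names below are assumptions based on typical neutron-converter usage, not confirmed for this release; run neutron-converter.exe --help to see the exact options):

:: flag and file names are placeholders; check neutron-converter.exe --help for this release
cd C:\NXP\eIQ_Toolkit_v1.17.0\bin\neutron-converter\MCU_SDK_25.09.00+Linux_6.12.34_2.1.0
neutron-converter.exe --input yolov5m_int8.tflite --output yolov5m_int8_neutron.tflite --target imx95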

Zhiming_Liu_0-1763685996441.png

The NPU result is the same as the CPU result.

Zhiming_Liu_1-1763686167870.png
Best Regards,
Zhiming

Zhiming_Liu
NXP TechSupport

Hi @Tongshen 

Can you share the model you are using?

Best Regards,
Zhiming

Tongshen
Contributor II

Hey Zhiming,

I cannot share the model publicly here. Would you mind sharing your email address?
