Difference between running a quantized model on CPU and NPU

JeFi · 07-09-2024

Dear all,

I am facing an issue with my i.MX8MP. I trained a TFLite model and quantized it with float fallback, so the input and output are float32. The runtime is as expected, but the accuracy is much worse on the NPU than on the CPU. I did the quantization with TensorFlow Lite.

Used Operations:

CONV2D, DEPTHWISE_CONV2D, PRELU, PAD, MAX_POOL2D, ADD, BATCH_NORM and L2 regularisation

Our Yocto image is based on the LTS Kirkstone release. The NPU results are not completely off, but they are definitely worse than on the CPU. I read in a thread that patches sometimes need to be added; could that also be the problem here?
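
One way to put a number on "not completely off" is to run the same inputs through both backends, dump the float32 outputs, and compare them. The sketch below is only an illustration under assumptions: the function name `compare_outputs` and the sample values are hypothetical, and the outputs are assumed to have been collected already as plain lists of floats.

```python
import math


def compare_outputs(cpu_out, npu_out):
    """Compare two equally long, non-empty lists of floats (hypothetical helper)."""
    if len(cpu_out) != len(npu_out):
        raise ValueError("outputs must have the same length")
    if not cpu_out:
        raise ValueError("outputs must not be empty")

    # Element-wise absolute differences between the two backends.
    diffs = [abs(c - n) for c, n in zip(cpu_out, npu_out)]

    # Cosine similarity: 1.0 means both vectors point in the same direction.
    dot = sum(c * n for c, n in zip(cpu_out, npu_out))
    norm_c = math.sqrt(sum(c * c for c in cpu_out))
    norm_n = math.sqrt(sum(n * n for n in npu_out))
    cosine = dot / (norm_c * norm_n) if norm_c and norm_n else float("nan")

    return {
        "max_abs_err": max(diffs),
        "mean_abs_err": sum(diffs) / len(diffs),
        "cosine": cosine,
    }


# Made-up example values, not measurements from the board.
cpu = [0.10, 0.70, 0.20]
npu = [0.12, 0.65, 0.23]
report = compare_outputs(cpu, npu)
print(report)
```

Looking at these numbers per sample, rather than only at the final accuracy, may help show whether the NPU output drifts slightly everywhere or diverges strongly on a few inputs.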

Best wishes

AldoG · 07-12-2024

Hello,

Could you share which community post you are referring to so I can take a look at it?

Best regards/Saludos,
Aldo.

JeFi · 07-15-2024

Sure. I think it could be due to the batch normalisation, because without batch normalisation it works much better. This is the post: https://community.nxp.com/t5/i-MX-Processors/NPU-versus-CPU-Results-and-Training-for-Tensorflow-lite...

Best wishes,

JeFil

AldoG · 07-16-2024

Hello,

Thank you for sharing. I will check whether such a patch is available, but as specified in that thread, I need you to create a support ticket before I can provide the patch. You may ask for me in the body of the ticket, and please provide your model as well.

Best regards/Saludos,
Aldo.