Hello, I am trying to convert certain networks to run on the NPU with the best possible performance.
I saw in the NXP i.MX 8 ML Guide that the NPU provides faster inference for "per-tensor" quantized models.
In some example models (e.g. PoseNet) I saw that all the convolution layers are quantized layer-wise (per-tensor), not channel-wise as is the TensorFlow default.
However, I have not yet been able to quantize the convolutions in a layer-wise manner in a simple example of my own. Do you perhaps have an example of how to achieve this?
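For reference, here is a minimal sketch of the post-training integer quantization flow I am starting from. The `_experimental_disable_per_channel` converter attribute is my assumption for forcing per-tensor weight quantization; it is not part of the public TFLite API, and I have not been able to confirm it is the right knob:

```python
import numpy as np
import tensorflow as tf

# Minimal Keras model with a single conv layer, just to exercise the converter.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

def representative_dataset():
    # Dummy calibration data; replace with real input samples.
    for _ in range(100):
        yield [np.random.rand(1, 32, 32, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

# Experimental/private attribute that (as far as I can tell) is supposed to
# disable the default per-channel quantization of conv weights in favour of
# per-tensor quantization. This is an assumption on my side.
converter._experimental_disable_per_channel = True

tflite_model = converter.convert()
with open("model_per_tensor_int8.tflite", "wb") as f:
    f.write(tflite_model)
```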
Thanks Daniel