Hello, I am trying to convert certain networks to run on the NPU with the best possible performance.
In the NXP i.MX 8 Machine Learning User's Guide I saw that the NPU provides faster inference for "per-tensor" quantized models.
In some examples (e.g. PoseNet) I saw that all the convolution layers were quantized layer-wise (per-tensor), not channel-wise, which is the TensorFlow default.
So far I have not been able to quantize the convolutions layer-wise in a simple example. Do you perhaps have an example of how to achieve this?
Thanks, Daniel
This can be done with the eIQ converter tool. Thanks
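For reference, per-tensor weight quantization can also be forced directly in the TensorFlow Lite converter. This is a minimal sketch, assuming the experimental `_experimental_disable_per_channel` attribute (underscore-prefixed, so it may change between TensorFlow 2.x versions), with MobileNet and random calibration data standing in for your own model and dataset:

```python
# Post-training full-integer quantization with per-tensor (layer-wise)
# weights instead of TensorFlow's per-channel default.
import numpy as np
import tensorflow as tf

# Placeholder model; substitute your own network with conv layers.
model = tf.keras.applications.MobileNet()

def representative_dataset():
    # Calibration samples; replace with real preprocessed inputs.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
# Experimental flag: quantize weights per-tensor rather than per-channel.
converter._experimental_disable_per_channel = True

tflite_model = converter.convert()
with open("model_per_tensor_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

After conversion you can inspect the .tflite file (e.g. in Netron) and check that each convolution's weight tensor carries a single scale/zero-point pair instead of one per output channel.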
Hello,
Have you tried these examples with the NPU? It appears that the eIQ converter tool does not work on ARM architectures.
Regards