Very slow inference with pyEIQ using the ONNX parser
Hi,
Previously I used armnn 20.x that I built myself. I converted ONNX models to .armnn first, then measured inference time: it was 80 ms to 300 ms depending on the model and input size. The backend was 'CpuAcc'.
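For reference, this is roughly how I load and time a model with pyarmnn; the model path, tensor names, and input shape below are placeholders for my actual setup:

```python
import time
import numpy as np
import pyarmnn as ann

# Parse the ONNX model ('model.onnx' and the tensor names are placeholders)
parser = ann.IOnnxParser()
network = parser.CreateNetworkFromBinaryFile('model.onnx')

# Optimize for the CpuAcc backend and load it into the runtime
runtime = ann.IRuntime(ann.CreationOptions())
preferred_backends = [ann.BackendId('CpuAcc')]
opt_network, messages = ann.Optimize(
    network, preferred_backends, runtime.GetDeviceSpec(), ann.OptimizerOptions())
net_id, _ = runtime.LoadNetwork(opt_network)

# Bind input/output tensors by their ONNX names
input_binding = parser.GetNetworkInputBindingInfo('input')
output_binding = parser.GetNetworkOutputBindingInfo('output')

data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy input
input_tensors = ann.make_input_tensors([input_binding], [data])
output_tensors = ann.make_output_tensors([output_binding])

# Time a single inference
start = time.time()
runtime.EnqueueWorkload(net_id, input_tensors, output_tensors)
print('inference took %.1f ms' % ((time.time() - start) * 1000))
```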
Now I'm trying to use pyEIQ from the BSP, and on a medium-sized model I get ~3 seconds with 'VsiNpu' and ~1.5 seconds with 'CpuAcc'.
This seems strange to me. Any ideas why this happens and how to solve it?
(I'm not sure, but I remember running my models with 'VsiNpu' and seeing a significant gain in inference speed. I wrote down the times but have forgotten how I obtained them.)
I suppose the issue may be with the ONNX parser, but I can't check the .armnn model directly because libarmnnSerializer.so hasn't been built.
So, a few more questions:
1) How do I load a .armnn model? (rough sketch of what I expect below)
2) Which armnn version and patches do you use? I think it's not 19.08, because pyarmnn only appeared in 20.x.
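For question 1, I assume loading would look something like this via the pyarmnn deserializer; I can't verify it myself since libarmnnSerializer.so isn't built, so the exact CreateNetworkFromBinary call is my guess from the pyarmnn docs:

```python
import pyarmnn as ann

# Deserialize an already-converted .armnn model
# ('model.armnn' is a placeholder path)
parser = ann.IDeserializer()
with open('model.armnn', 'rb') as f:
    network = parser.CreateNetworkFromBinary(f.read())

# The rest (Optimize / LoadNetwork / EnqueueWorkload) should be the same
# as with the ONNX parser sketch above.
```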
Thanks in advance.
Update:
The maximum speed was achieved with quantized TFLite models.
For instance, ResNet-18 takes ~0.5 seconds with TFLite and 'VsiNpu', but with armnn and 'VsiNpu' it takes several seconds.
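For reference, this is roughly how I timed the TFLite case (the model path is a placeholder for my quantized ResNet-18 file):

```python
import time
import numpy as np
import tflite_runtime.interpreter as tflite

# Load the quantized model ('resnet18_quant.tflite' is a placeholder path)
interpreter = tflite.Interpreter(model_path='resnet18_quant.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed dummy uint8 input matching the quantized input tensor
data = np.random.randint(0, 256, size=input_details[0]['shape'], dtype=np.uint8)
interpreter.set_tensor(input_details[0]['index'], data)

# Time a single inference
start = time.time()
interpreter.invoke()
print('inference took %.1f ms' % ((time.time() - start) * 1000))
result = interpreter.get_tensor(output_details[0]['index'])
```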
So, can such a huge performance difference be caused by the armnn engine?
