Hi,
How can I reduce the inference time for an ONNX model? It's currently taking roughly 6 seconds per inference.
I tried to quantize the model using the eIQ Toolkit, but when I try to load the quantized model it gives me the following error:
terminate called after throwing an instance of 'Ort::Exception'
what(): Fatal error: QLinearAdd is not a registered function/op
Aborted
Thanks in advance.
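One thing worth trying (this is based on plain ONNX Runtime quantization tooling, not the eIQ Toolkit, and the file names below are placeholders): requantize in QDQ format. QDQ keeps standard QuantizeLinear/DequantizeLinear nodes in the graph instead of contrib ops like QLinearAdd, so a runtime build without the com.microsoft contrib ops can still load the model.

```python
# Sketch: static quantization in QDQ format with ONNX Runtime.
# "model.onnx" and the input name/shape "input", (1, 3, 224, 224) are
# placeholders -- substitute your own model and real calibration data.
import os

status = "skipped (model.onnx not found)"
if os.path.exists("model.onnx"):
    import numpy as np
    from onnxruntime.quantization import (
        CalibrationDataReader, QuantFormat, quantize_static)

    class RandomCalibration(CalibrationDataReader):
        """Feeds a few random batches; use representative samples in practice."""
        def __init__(self, n=8):
            self.batches = iter(
                [{"input": np.random.rand(1, 3, 224, 224).astype("float32")}
                 for _ in range(n)])

        def get_next(self):
            return next(self.batches, None)

    quantize_static(
        "model.onnx", "model_qdq.onnx", RandomCalibration(),
        quant_format=QuantFormat.QDQ)  # QDQ format avoids QLinearAdd
    status = "wrote model_qdq.onnx"

print(status)
```

If the quantized model loads but is still slow, the remaining time is likely the execution provider, not the quantization format.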
Can it be used with ONNX? According to the documentation, it's only for TFLite (I might be wrong).
You need to use the VX delegate in your inference code.
Please see this guide:
https://www.nxp.com.cn/docs/en/user-guide/IMX-MACHINE-LEARNING-UG.pdf
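In code, loading the VX delegate looks roughly like the sketch below. Assumptions to verify against your BSP and the guide above: the delegate library path (/usr/lib/libvx_delegate.so) varies between Yocto releases, and the VX delegate applies to TFLite, so this presumes the ONNX model has first been converted to a quantized .tflite file (here named "model_quant.tflite").

```python
# Minimal sketch of TFLite inference through the VX delegate (NPU) on i.MX.
DELEGATE_PATH = "/usr/lib/libvx_delegate.so"  # assumption: check your BSP

try:
    import tflite_runtime.interpreter as tflite

    vx = tflite.load_delegate(DELEGATE_PATH)
    interpreter = tflite.Interpreter(
        model_path="model_quant.tflite",   # placeholder model name
        experimental_delegates=[vx],       # offload supported ops to the NPU
    )
    interpreter.allocate_tensors()
    backend = "vx_delegate"
except (ImportError, ValueError, OSError):
    # tflite_runtime and the delegate library only exist in the board's
    # Yocto image, so this branch runs on a host PC.
    backend = "unavailable on this host"

print(backend)
```

The first inference after loading the delegate includes a one-time graph-compilation warm-up, so benchmark from the second run onward.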