Hi,
According to the reply from R&D, the VSI NPU execution provider in ONNX Runtime has very limited (I would say no) support for quantized models. And ModelRunner only uses the ONNX Runtime APIs as well.
There is currently an ongoing migration from the NNRT and nn-imx modules to TIM-VX, and we expect support for quantized ops as part of that migration. There are no estimates or delivery dates; it is a long-term plan roughly targeting the end of 2022. So I advise customers not to use onnxruntime.
Regards