ONNX support on i.MX95 NPU


1o_o1
Contributor I

Quantized ONNX models are supposedly supported on the i.MX95 NPU as of the latest release (LF6.12.20_2.0.0). Running both the int8 and int4 ONNX versions of Gemma3-1B on the CPU gives the expected results, while the NPU produces nothing but nonsense. Running quantized TFLite models on the NPU of the i.MX95 board requires conversion with the Neutron converter, specifying imx95 as the target (roughly as in the sketch below). Do quantized ONNX models need a similar conversion step before they are runnable on the NPU? The converter does not seem to support ONNX.
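
For reference, the TFLite conversion step mentioned above looks roughly like this. The binary name and flags are assumptions based on how the eIQ Neutron converter is usually invoked, so verify them against the converter shipped with your BSP:

```python
# Sketch of the TFLite -> Neutron conversion step, driven from Python.
# ASSUMPTIONS: the "neutron-converter" binary name and the
# --input/--output/--target flags may differ between converter releases;
# the model file names are hypothetical.
import subprocess

subprocess.run(
    [
        "neutron-converter",                  # eIQ Neutron converter binary
        "--input", "gemma3_int8.tflite",      # quantized TFLite model
        "--output", "gemma3_int8_neutron.tflite",
        "--target", "imx95",                  # target the i.MX95 Neutron NPU
    ],
    check=True,                               # raise if the conversion fails
)
```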

danielchen
NXP TechSupport

Hi @1o_o1 

 

Currently, ONNX LLMs are not supported on the Neutron NPU. We will add support for running LLMs through ONNX Runtime in the Q3 BSP release. You will then just need to specify the Neutron provider in the ONNX Runtime API to deploy the supported ops of an LLM on the Neutron NPU, roughly as in the sketch below.
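
In ONNX Runtime that selection is made via the providers list when creating an inference session. The provider string below is a placeholder for illustration; the actual name will be documented with the Q3 BSP release:

```python
# Sketch of selecting an execution provider in ONNX Runtime.
# ASSUMPTIONS: "NeutronExecutionProvider" is a placeholder provider name and
# "gemma3_int8.onnx" a hypothetical model path; check the Q3 BSP release
# notes for the real provider string.
import onnxruntime as ort

session = ort.InferenceSession(
    "gemma3_int8.onnx",
    providers=[
        "NeutronExecutionProvider",  # placeholder: Neutron NPU provider
        "CPUExecutionProvider",      # CPU fallback for unsupported ops
    ],
)
print(session.get_providers())       # confirm which providers were applied
```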

 

Regards

Daniel

1o_o1
Contributor I

Okay, I see. It is a bit misleading to put it in the machine learning guide, then. I would presume this means that no ONNX Runtime models are supported on the NPU as of yet?
