ONNX support on i.MX95 NPU


236 Views
1o_o1
Contributor I

Quantized ONNX models are supposedly supported on the i.MX95 NPU as of the latest release (LF6.12.20_2.0.0). Running both the int8 and int4 ONNX versions of Gemma3-1B on the CPU produces the expected results, while the NPU produces nothing but nonsense. Running quantized TFLite models on the NPU of the i.MX95 board requires conversion with the Neutron converter, specifying imx95 as the target. Do quantized ONNX models need a conversion step before they are runnable on the NPU? The converter does not seem to support ONNX.
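
For reference, a minimal sketch of the CPU baseline check described above, using the standard ONNX Runtime Python API. The model filename is a placeholder, and a real Gemma3-1B graph carries more inputs (attention mask, KV cache) than are inspected here:

```python
import onnxruntime as ort

# Show which execution providers this onnxruntime build exposes.
# A build without NPU support typically lists only the CPU provider.
print(ort.get_available_providers())

# Load the quantized model on the CPU as a known-good baseline.
# "gemma3-1b-int8.onnx" is a placeholder filename.
session = ort.InferenceSession(
    "gemma3-1b-int8.onnx",
    providers=["CPUExecutionProvider"],
)

# Inspect the graph inputs; an LLM graph typically expects token IDs
# plus attention-mask and KV-cache tensors.
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)
```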

2 Replies

170 Views
danielchen
NXP TechSupport

Hi @1o_o1 

 

Currently, ONNX LLMs are not supported on the Neutron NPU. We will add ONNX Runtime support for LLMs in the Q3 BSP release. After that, you will just need to specify the Neutron execution provider in the ONNX Runtime API to deploy the supported ops of an LLM on the Neutron NPU.
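
Assuming the Q3 release exposes Neutron as a standard ONNX Runtime execution provider, usage would presumably follow the usual provider-selection pattern sketched below. The name "NeutronExecutionProvider" is a guess for illustration only; the actual identifier should come from the release notes:

```python
import onnxruntime as ort

# The providers list is ordered by priority: ops supported by the
# NPU provider run on the Neutron NPU, and everything else falls
# back to the CPU provider.
# "NeutronExecutionProvider" is a hypothetical name; replace it
# with the identifier documented in the Q3 BSP release.
session = ort.InferenceSession(
    "model_quantized.onnx",
    providers=["NeutronExecutionProvider", "CPUExecutionProvider"],
)

# Confirm which providers the session is actually using.
print(session.get_providers())
```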

 

Regards

Daniel

152 Views
1o_o1
Contributor I

Okay, I see. It is a bit misleading to put it in the machine learning guide, then. This means that no ONNX Runtime models are supported on the NPU as of yet, I would presume?
