Which device is supported to run Quantized Gemma Model Inference

ramkumarkoppu_p — Fri, 25 Apr 2025 10:41:20 GMT

Hi,

Of the i.MX RT700 and i.MX 95, which device has full software support for running inference of GenAI models such as Google's quantized Gemma model on the device's NPU, first in Python and then in C/C++? Specifically:

Which device's NPU supports transformer-based architectures, or are the NPUs limited to CNNs?
Which inference frameworks are supported for GenAI on the eIQ platform?

Re: Which device is supported to run Quantized Gemma Model Inference

ramkumarkoppu_p — Fri, 25 Apr 2025 14:41:26 GMT

In particular, has NXP ported llama.cpp to the NPU of either of these devices?
