Which device is supported to run Quantized Gemma Model Inference



1,551 Views
ramkumarkoppu_p
Contributor III

Hi, 

Out of the i.MX RT700 and i.MX 95 devices, which one has full software support for running inference of generative AI models such as a quantized version of Google's Gemma model, first in Python and then in C/C++, using the device's NPU? Specifically:

  • Which device's NPU supports transformer-based architectures, and which is limited to CNNs?

  • Which inference frameworks are supported for generative AI on the eIQ platform?
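For context on what I mean by a quantized model: the weights are stored in a low-precision integer format (e.g. int8) with a per-tensor scale, and dequantized on the fly during inference. A minimal sketch of symmetric int8 quantization (illustrative only, not tied to any specific NXP API):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127]."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

# Example: round-trip a small weight tensor.
w = np.array([0.5, -1.0, 0.25], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```

The question is essentially which device/NPU stack can execute this kind of low-precision transformer arithmetic natively, rather than falling back to CPU.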

1 Reply

1,536 Views
ramkumarkoppu_p
Contributor III

In particular, has NXP ported llama.cpp to run on the NPU of either of these devices?
