Hi,
Between the i.MX RT700 and the i.MX 95, which device has full software support for running inference of generative AI models, such as a quantized version of Google's Gemma model, first in Python and then in C/C++, using the device's NPU? Specifically:
Which device's NPU supports transformer-based architectures, or are they limited to CNNs?
Which inference frameworks are supported for generative AI on the eIQ platform?
In particular, has NXP ported llama.cpp to run on either device's NPU?