Hello everyone,
I was wondering whether it is feasible to deploy attention-based text decoders on the i.MX 8M Plus.
Does the NPU support those layers and operations?
Alternatively, is it feasible to deploy it on the CPU?
We are talking about a text decoder with about 60M parameters.
Thank you in advance
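For scale, a rough weight-memory estimate for a 60M-parameter decoder at common precisions (illustrative only; actual runtime memory also includes activations and the KV cache):

```python
# Back-of-envelope weight-memory estimate for a 60M-parameter text decoder.
# Hypothetical figures for sizing, not measured numbers from any NXP board.
PARAMS = 60e6

bytes_per_param = {"float32": 4, "float16": 2, "int8": 1}

for dtype, nbytes in bytes_per_param.items():
    mb = PARAMS * nbytes / (1024 ** 2)
    print(f"{dtype}: ~{mb:.0f} MB of weights")
```

At int8 the weights alone are under 60 MB, which is well within the RAM typically fitted on i.MX 8M Plus boards.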
Hello,
We have verified such a model, with 1.1 billion parameters, on the i.MX 8M Plus.
For more detail, please refer to this whitepaper:
https://www.nxp.com/webapp/Download?colCode=GEN-AI-RAG-WHITEPAPER
Best Regards,
Zhiming
Hi Zhiming,
Thank you for your answer. I will read the paper.
One question: if you were able to deploy such models on the i.MX 8M Plus, why can't I seem to find support for the MultiHeadAttention layer here (Chapter 11): https://www.nxp.com/docs/en/user-guide/IMX-MACHINE-LEARNING-UG.pdf?
Maybe I'm not looking in the right place?
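One possible explanation, sketched here as an assumption rather than a statement about the eIQ toolchain: a fused MultiHeadAttention layer is usually lowered by model converters into primitive ops (MatMul, Softmax, transpose, reshape), so the fused layer itself may never appear in a supported-operations table even when every op it lowers to is supported. A minimal NumPy sketch of that decomposition, with hypothetical shapes:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, wq, wk, wv, wo, num_heads):
    """Multi-head self-attention built only from matmul, softmax,
    reshape, and transpose, the primitives converters lower it to."""
    seq, d_model = x.shape
    d_head = d_model // num_heads
    # Q/K/V projections: plain MatMul ops, then split into heads
    q = (x @ wq).reshape(seq, num_heads, d_head).transpose(1, 0, 2)
    k = (x @ wk).reshape(seq, num_heads, d_head).transpose(1, 0, 2)
    v = (x @ wv).reshape(seq, num_heads, d_head).transpose(1, 0, 2)
    # Scaled dot-product attention: MatMul + Softmax + MatMul
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    out = softmax(scores) @ v
    # Merge heads and apply the output projection
    return out.transpose(1, 0, 2).reshape(seq, d_model) @ wo

rng = np.random.default_rng(0)
d_model, seq, heads = 64, 10, 4
ws = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4)]
y = multi_head_attention(rng.standard_normal((seq, d_model)), *ws, heads)
print(y.shape)  # (10, 64)
```

So a practical check may be to look for MatMul, Softmax, Transpose, and Reshape in the supported-ops list rather than for a single fused attention layer.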
Thank you in advance for your time.
Kind Regards,
Marco Donnarumma
Hello,
The LLM project and the LLM fine-tuning tool for eIQ have not been released yet. NXP will release a version of eIQ that supports deploying LLM models.
Best Regards,
Zhiming
Hello,
Do we know whether the release will happen in the near future?
Thank you in advance.
Marco
Hello,
The demo is expected to be released at the end of 2025 Q1, with eIQ support following later; the actual release date depends on the project schedule.
Best Regards,
Zhiming