Recommendations for ASR, TTS and Transformer


2,098 views
QuantumPath
Contributor I

Hello, I am developing an application on the i.MX 93 EVK (i.MX 93, dual-core, 1.7 GHz). I would appreciate recommendations for the following components, which need to have relatively low compute requirements because they will run on-device.

I am looking for recommendations for

  • ASR engine with reasonable accuracy when running on-device, e.g. Whisper tiny
  • TTS model
  • On-device transformer model such as Llama 3B

    Any information, such as performance comparisons and recommendations, would be appreciated. Thank you.
2 replies

1,538 views
Laurent_P
NXP Employee

Hello  @QuantumPath 

On i.MX93, we have enabled Whisper ASR (tiny, base, small) and Moonshine ASR (tiny and base).

We will deliver Whisper ASR first, as a voice plugin for GStreamer, by mid-July.

For TTS, we have enabled VITS. For LLMs, we can run small models such as Danube 0.5B.
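To see why a 0.5B model is a more realistic target than the 3B model mentioned in the question, a rough weight-memory estimate helps. The sketch below is only back-of-the-envelope arithmetic (parameters times bytes per parameter). The parameter counts are the nominal ones from the model names, and the 2 GiB RAM figure is an assumed placeholder to replace with your board's actual memory. It ignores activations, the KV cache, and everything else the OS needs.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "int4": 0.5}

# Assumption: nominal parameter counts implied by the model names in this thread.
MODELS = {"Danube 0.5B": 0.5e9, "Llama 3B": 3.0e9}

# Assumption: placeholder RAM size in GiB; substitute your board's real value.
BOARD_RAM_GIB = 2.0


def weight_footprint_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


def report(models=MODELS, ram_gib=BOARD_RAM_GIB):
    """Return (model, dtype, size_gib, fits_in_ram) for every combination."""
    rows = []
    for name, params in models.items():
        for dtype in BYTES_PER_PARAM:
            size = weight_footprint_gib(params, dtype)
            rows.append((name, dtype, round(size, 2), size < ram_gib))
    return rows


if __name__ == "__main__":
    for name, dtype, size, fits in report():
        print(f"{name:12s} {dtype:8s} {size:6.2f} GiB  fits_in_ram={fits}")
```

Under these assumptions, the weights of a 3B model at int8 alone exceed 2 GiB, while a 0.5B model fits at any of the listed precisions. Fitting in RAM says nothing about token throughput, which has to be measured on the target.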

In parallel, we have a complete eIQ GenAI Flow pipeline (wake word, ASR, LLM, RAG, TTS) running on i.MX 95, available here: https://github.com/nxp-appcodehub/dm-eiq-genai-flow-demonstrator?tab=readme-ov-file

 

388 views
boopathi12
Contributor I

Hey Laurent_P, can you share how you implemented the Whisper tiny.en TFLite model on the NPU of the i.MX 93? I've been looking for this for ages, and it would really help my development. I was able to convert the model to TFLite INT8, but the NPU does not support all of the Whisper operations, so I have to run the float32 model on the CPU. Is it even possible to convert the model and run it on the NPU?

Thank you.
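For anyone hitting the same problem, it can help to make the partial-support situation concrete before deciding between NPU and CPU. The sketch below is a generic illustration, not NXP's toolchain and not the real NPU operator list: given an ordered list of operator names and a set of operators an accelerator accepts, it shows which operator types fall back to the CPU and how the graph splits into alternating NPU/CPU segments. Both `EXAMPLE_OPS` and `HYPOTHETICAL_NPU_OPS` are made up for illustration; take the real lists from your converted model and your toolchain's documentation.

```python
from itertools import groupby

# Hypothetical operator sequence, for illustration only (not the real Whisper graph).
EXAMPLE_OPS = ["CONV_2D", "GELU", "FULLY_CONNECTED", "SOFTMAX", "GATHER", "FULLY_CONNECTED"]

# Hypothetical accelerator support set, for illustration only (not the real NPU list).
HYPOTHETICAL_NPU_OPS = {"CONV_2D", "FULLY_CONNECTED", "SOFTMAX"}


def cpu_fallback_ops(ops, npu_supported):
    """Return the sorted operator types that the accelerator cannot take."""
    return sorted(set(ops) - set(npu_supported))


def partition_ops(ops, npu_supported):
    """Split an ordered op list into contiguous ("NPU" | "CPU", [ops]) segments."""
    segments = []
    for on_npu, run in groupby(ops, key=lambda op: op in npu_supported):
        segments.append(("NPU" if on_npu else "CPU", list(run)))
    return segments


if __name__ == "__main__":
    print("CPU fallback:", cpu_fallback_ops(EXAMPLE_OPS, HYPOTHETICAL_NPU_OPS))
    for target, run in partition_ops(EXAMPLE_OPS, HYPOTHETICAL_NPU_OPS):
        print(f"{target}: {run}")
```

Every switch between segments implies a hand-off between NPU and CPU, so a graph that fragments into many short segments may gain little from the accelerator even when most operators are supported.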
