Smart Device Gateway


Building Voice-Enabled Smart Devices with Ara240 DNPU using Smart Device Gateway

In this post, I want to share a quick walkthrough of Smart Device Gateway, a FastAPI-based demo server that allows connected devices to use local GenAI capabilities accelerated by the Ara240 DNPU. [github.com]

The idea behind this demo is to centralize AI intelligence in one gateway instead of adding powerful AI hardware to every device. A connected device only needs a microphone, speaker, and network connection to become a voice-enabled assistant. [github.com]

What It Does

Smart Device Gateway lets devices such as appliances or embedded clients send audio to a local server. The server processes each request with speech recognition, RAG, an LLM running on the Ara240 DNPU, and text-to-speech, then streams the spoken response back to the client. [github.com]

The current demo showcases intelligent device assistants for a generic oven and coffee machine/barista use case, using device manuals as the knowledge base for contextual responses. [github.com]

Key Features

  • FastAPI server for local GenAI-powered device interaction [github.com]
  • OpenAI Realtime API-compatible WebSocket endpoint at /v1/realtime [github.com]
  • Bidirectional audio streaming for low-latency voice conversations [github.com]
  • Speech-to-text powered by Moonshine [github.com]
  • RAG-based contextual answers using device-specific knowledge bases [github.com]
  • Qwen2.5-7B-Instruct running on Ara240 DNPU through the eIQ AAF Connector [github.com]
  • Text-to-speech using a VITS-based model [github.com]
  • Host PC and edge-client options for interacting with the gateway [github.com]
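
Since the endpoint is described as OpenAI Realtime API-compatible, a client would presumably exchange JSON events over the WebSocket. The sketch below builds the client-side events defined by the OpenAI Realtime API (input_audio_buffer.append, input_audio_buffer.commit, response.create). Which of these events the gateway actually implements is an assumption, and the helper names and chunk size are illustrative rather than taken from the repository.

```python
import base64
import json

CHUNK_BYTES = 4096  # illustrative chunk size, not taken from the demo


def audio_append_events(pcm: bytes, chunk_bytes: int = CHUNK_BYTES):
    """Yield one JSON `input_audio_buffer.append` event per audio chunk."""
    for start in range(0, len(pcm), chunk_bytes):
        chunk = pcm[start:start + chunk_bytes]
        yield json.dumps({
            "type": "input_audio_buffer.append",
            # The Realtime API carries audio as base64 text inside JSON.
            "audio": base64.b64encode(chunk).decode("ascii"),
        })


def end_of_turn_events():
    """Events that close the user's turn and request a response."""
    return [
        json.dumps({"type": "input_audio_buffer.commit"}),
        json.dumps({"type": "response.create"}),
    ]


if __name__ == "__main__":
    fake_audio = bytes(8000)  # silence, standing in for microphone input
    messages = list(audio_append_events(fake_audio)) + end_of_turn_events()
    print(len(messages), "messages ready for ws://<BOARD_IP>:<PORT>/v1/realtime")
```

Sending these strings requires a WebSocket client library of your choice; the push_to_talk client shipped with the demo already handles this end to end.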

Architecture Overview

The Smart Device Gateway receives audio from a client over WebSocket, converts speech to text, retrieves relevant context from a device knowledge base, sends the prompt to the LLM through the eIQ AAF Connector, converts the generated response back to speech, and streams the audio response to the client. [github.com]

At a high level, the flow is:

Audio Input → STT → RAG → eIQ AAF Connector / LLM → TTS → Audio Output
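
To make the flow concrete, here is a minimal sketch of one voice turn passing through the four stages in order. The function run_turn and the stand-in stages are hypothetical; the real server runs Moonshine, a RAG retriever, the eIQ AAF Connector, and a VITS-based model at these points.

```python
from typing import Callable


def run_turn(audio_in: bytes,
             stt: Callable[[bytes], str],
             rag: Callable[[str], str],
             llm: Callable[[str], str],
             tts: Callable[[str], bytes]) -> bytes:
    """Run one voice turn through the four stages, in order."""
    question = stt(audio_in)   # speech -> text
    prompt = rag(question)     # text -> prompt enriched with context
    answer = llm(prompt)       # prompt -> generated text
    return tts(answer)         # text -> speech


if __name__ == "__main__":
    # Stand-in stages: UTF-8 text plays the role of audio in this sketch.
    audio_out = run_turn(
        b"how do I preheat the oven",
        stt=lambda audio: audio.decode("utf-8"),
        rag=lambda q: f"[manual excerpt] {q}",
        llm=lambda p: f"Answer based on: {p}",
        tts=lambda text: text.encode("utf-8"),
    )
    print(audio_out.decode("utf-8"))
```

Passing the stages in as callables keeps each one replaceable, which mirrors how the gateway delegates the LLM step to a separate connector process.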

The LLM runs on the Ara240 DNPU, while the server runs on the FRDM i.MX platform. [github.com]

Basic Server Installation

Install the Debian package on the board:

dpkg -i smart-device-gateway_1.0.0_all.deb

Start the server:

run_server_only --host 0.0.0.0 --port 8080

The server expects the eIQ AAF Connector to already be running on 0.0.0.0:8000 with Qwen2.5-7B-Instruct properly configured. [github.com]
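
Before launching run_server_only, it can help to confirm that something is listening on the connector port. The sketch below is a plain TCP check run on the board; connector_is_up is a hypothetical helper, and it assumes the connector is reachable through the loopback address because it listens on all interfaces. It only proves that the port accepts connections, not that Qwen2.5-7B-Instruct is loaded.

```python
import socket


def connector_is_up(host: str = "127.0.0.1", port: int = 8000,
                    timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if connector_is_up():
        print("Connector port is open; the server can be started.")
    else:
        print("Nothing is listening on port 8000; start the connector first.")
```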

Alternatively, the demo can start the server together with the connector:

run_server --host 0.0.0.0 --port 8080

[github.com]

Host PC Client Example

The demo includes a push_to_talk client that can run on a host PC. After copying the push_to_talk folder, run:

python -m uv run push_to_talk.py --server_ip <BOARD_IP> --port <PORT> --device oven

You can also use:

python -m uv run push_to_talk.py --server_ip <BOARD_IP> --port <PORT> --device barista

If no device name is provided, the RAG knowledge base is not used and the response is generated from the LLM’s general knowledge. [github.com]
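
The fallback described above can be sketched as follows. This is not the demo's retriever: simple word overlap stands in for whatever retrieval method the gateway uses, and the knowledge-base contents, retrieve, and build_prompt are all hypothetical names made up for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical stand-ins for the device manuals used as knowledge bases.
KNOWLEDGE_BASES: Dict[str, List[str]] = {
    "oven": [
        "Preheat the oven for ten minutes before baking.",
        "Use the grill setting for browning the top of a dish.",
    ],
    "barista": [
        "Descale the coffee machine every three months.",
        "Use fine grounds for espresso.",
    ],
}


def _words(text: str) -> set:
    """Lowercased words with surrounding punctuation removed."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(question: str, passages: List[str]) -> Optional[str]:
    """Return the passage sharing the most words with the question, if any."""
    q = _words(question)
    best = max(passages, key=lambda p: len(q & _words(p)), default=None)
    if best is None or not q & _words(best):
        return None
    return best


def build_prompt(question: str, device: Optional[str] = None) -> str:
    """Add manual context when a known device is selected, else pass through."""
    if device is None:
        return question  # no device: the LLM answers from general knowledge
    context = retrieve(question, KNOWLEDGE_BASES.get(device, []))
    if context is None:
        return question
    return f"Context: {context}\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("How long should I preheat?", device="oven"))
    print(build_prompt("How long should I preheat?"))
```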

Edge Client Example

The Debian package also includes an edge client for i.MX 8M and i.MX 9 boards. After creating a config.toml file, run:

run_client --config-file path/to/config.toml

The edge client supports wake-word interaction using “Hey NXP”, then sends the user query to the server and receives the streamed audio response. [github.com]
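
The wake-word interaction can be pictured as a small gate that ignores input until the wake phrase is heard and then forwards the next query. The sketch below works on already-transcribed text purely for illustration; the actual edge client detects "Hey NXP" in audio, and WakeWordGate is a hypothetical class, not part of the package.

```python
from typing import Optional

WAKE_WORD = "hey nxp"


class WakeWordGate:
    """Forward a query only after the wake word has been heard."""

    def __init__(self, wake_word: str = WAKE_WORD):
        self.wake_word = wake_word
        self.awake = False

    def feed(self, utterance: str) -> Optional[str]:
        """Return the (lowercased) query to send, or None to send nothing."""
        text = utterance.strip().lower()
        if self.awake:
            if not text:
                return None  # stay awake until a real query arrives
            self.awake = False
            return text
        if self.wake_word in text:
            # Words following the wake word in the same utterance are the query.
            remainder = text.split(self.wake_word, 1)[1].strip(" ,.!?")
            if remainder:
                return remainder
            self.awake = True
        return None


if __name__ == "__main__":
    gate = WakeWordGate()
    for heard in ["turn it up", "Hey NXP", "How do I descale the machine?"]:
        print(repr(heard), "->", gate.feed(heard))
```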

Walkthrough Video

In the attached video, I show how to start the Smart Device Gateway server, connect a client, select a device profile such as oven or barista, ask a voice question, and receive a spoken response generated locally using Ara240 DNPU acceleration.


Notes and Considerations

  • Microphone quality has a direct impact on speech-to-text accuracy. [github.com]
  • The STT model can be sensitive to accents and pronunciation variations. [github.com]
  • RAG responses work best when questions are related to the selected device manual or knowledge base. [github.com]
  • This is a proof of concept (version 1.0.0), intended to demonstrate centralized AI intelligence for smart devices. [github.com]

Summary

Smart Device Gateway demonstrates how everyday devices can become voice-enabled assistants by connecting to a local AI gateway. By combining STT, RAG, LLM inference on Ara240 DNPU, and TTS, the demo provides a practical reference for building local, privacy-focused GenAI experiences on NXP i.MX platforms. [github.com]

Link

Smart Device Gateway repository:
https://github.com/nxp-imx-support/smart-device-gateway
