The RT speech recognition system based on VIT to obtain weather information


1. Abstract

NXP EdgeReady solutions use the RT106/5 S/L/A/F devices to implement speech recognition, but within the four-digit RT series the supporting software libraries are limited to those S/L/A/F parts. How can speech recognition be implemented on a standard RT chip? NXP has released the VIT software package in the SDK, which supports SDK-based speech recognition on the RT1060, RT1160, RT1170, RT600, and RT500.

To obtain weather information, a device can usually connect to a third-party platform or a cloud weather API and access it directly as an HTTP client. Most current weather API platforms let you register and then call the API directly, so the LwIP socket client in the RT SDK can call the corresponding weather API to fetch real-time forecast data for a specific location.

This article uses the MIMXRT1060-EVK to implement a custom wake word (WW) and voice commands (VC) based on the SDK VIT library, together with an LwIP socket client that fetches real-time weather information for Shanghai and prints it to the terminal. Printing is the main way the weather information is presented. For audio output, the article also adds a simple method that plays fixed audio from MP3 data. Announcing arbitrary weather data would need a real-time TTS function, which is not included yet.

The system block diagram of this document is as follows:

Fig 1 System Block diagram

The custom VIT wake word of this system is "小恩小恩". After wake-up, one of the following commands can be recognized: "开灯" ("Turn on the light"), "关灯" ("Turn off the light"), "今天天气" ("Today's weather"), "明天天气" ("Tomorrow's weather"), "后天天气" ("The day after tomorrow's weather"). The two light commands control the external red LED on the EVK board. "今天天气" returns today's weather forecast in the following format:

"date": "2022-05-27",
"week": "5",
"dayweather": "阴",
"nightweather": "阴",
"daytemp": "28",
"nighttemp": "21",
"daywind": "东南",
"nightwind": "东南",
"daypower": "≤3",
"nightpower": "≤3"

"明天天气" and "后天天气" return the same format for the dates one and two days after today. To get the weather data, the MIMXRT1060-EVK board needs a network connection so that it can reach the Gaode Map (restapi.amap.com) weather API.

2. Related preparations

2.1 Weather API Platform

At present, many third-party platforms on the Internet provide weather data for China, such as Baidu Intelligent Cloud, Baidu Map API, Huawei Cloud, Juhe Weather, and Gaode Map API. Several platforms were tried for this article. Baidu Intelligent Cloud allows only a small number of free calls per day and requires real-time synthesis with the AK and SK, which makes it cumbersome to call. Baidu Map API requires uploading ID card information. Several others have similar limitations. In the end, the Gaode Map API was selected for its convenient registration, large daily call quota, and relatively complete weather data.

This section focuses on using the Gaode Map API. The documentation link is:

https://lbs.amap.com/api/webservice/guide/api/weatherinfo

Create an account and an API key, then add the relevant parameters to call the weather API. The API key application page looks like this:

Fig 2 Gaode map API key

The following diagram shows the call volume:

Fig 3 Gaode Map API call volume

This is the API calling format:

Fig 4 Weather API calling parameters

So the full Gaode Map API link should look like this:

https://restapi.amap.com/v3/weather/weatherInfo?key=xxxxxxx&city=xxx&extensions=all&output=JSON

To query the weather for Shanghai, use the city code 310000.
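As a minimal sketch, the query above can be assembled into the raw HTTP GET request that a socket client would send. The function name build_weather_request and the buffer handling are assumptions made for illustration; only the path, the parameters, and the host name come from this article.

```c
#include <stdio.h>
#include <string.h>

/* Host name taken from the article; the path follows the URL shown above. */
#define WEATHER_HOST "restapi.amap.com"

/*
 * Format an HTTP/1.1 GET request for the weather API into buf.
 * Returns the request length in bytes (without the terminating NUL),
 * or -1 if buf is too small. key and city are supplied by the caller.
 */
int build_weather_request(char *buf, size_t buf_size,
                          const char *key, const char *city)
{
    int n = snprintf(buf, buf_size,
                     "GET /v3/weather/weatherInfo?key=%s&city=%s"
                     "&extensions=all&output=JSON HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "\r\n",
                     key, city, WEATHER_HOST);
    if (n < 0 || (size_t)n >= buf_size)
        return -1;   /* formatting error or truncated output */
    return n;
}
```

Calling it with the city code "310000" produces the Shanghai request. The returned length is the number of bytes to send, so the terminating NUL is not transmitted.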

2.2 Testing the weather API with Postman

Postman is an interface testing tool. During interface testing, Postman acts as a client: it simulates the HTTP requests a user would initiate, sends the request data to the server, receives the response, and lets you verify whether the returned data matches the expected values. Postman download link: https://www.postman.com/

After choosing the weather API platform and the calling link, use Postman to send an HTTP GET request and capture the weather data. Referring to Fig 4, fill in the related parameters in Postman:

Fig 5 Postman call weather API

Send the GET request. The weather information appears at position 7 in the figure. The complete response is:

{
  "status": "1",
  "count": "1",
  "info": "OK",
  "infocode": "10000",
  "forecasts": [
    {
      "city": "上海市",
      "adcode": "310000",
      "province": "上海",
      "reporttime": "2022-05-27 17:34:12",
      "casts": [
        {
          "date": "2022-05-27",
          "week": "5",
          "dayweather": "阴",
          "nightweather": "阴",
          "daytemp": "28",
          "nighttemp": "21",
          "daywind": "东南",
          "nightwind": "东南",
          "daypower": "≤3",
          "nightpower": "≤3"
        },
        {
          "date": "2022-05-28",
          "week": "6",
          "dayweather": "小雨",
          "nightweather": "中雨",
          "daytemp": "24",
          "nighttemp": "20",
          "daywind": "东南",
          "nightwind": "东南",
          "daypower": "≤3",
          "nightpower": "≤3"
        },
        {
          "date": "2022-05-29",
          "week": "7",
          "dayweather": "大雨",
          "nightweather": "小雨",
          "daytemp": "23",
          "nighttemp": "20",
          "daywind": "南",
          "nightwind": "南",
          "daypower": "≤3",
          "nightpower": "≤3"
        },
        {
          "date": "2022-05-30",
          "week": "1",
          "dayweather": "小雨",
          "nightweather": "晴",
          "daytemp": "27",
          "nighttemp": "20",
          "daywind": "北",
          "nightwind": "北",
          "daypower": "≤3",
          "nightpower": "≤3"
        }
      ]
    }
  ]
}

As shown, the response covers four consecutive days, so the weather for any of those days can be extracted easily.

Postman can also show the code for the GET request, like this:

Fig 6 postman API HTTP code

Now that the API has been tested and returns the complete weather information, the working HTTP request can be added to the MIMXRT1060-EVK code.
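Because every value in this response is a quoted string, a small string scan is enough to pull out individual fields. The sketch below is one possible way to do that. json_get_string is a hypothetical helper, not part of the SDK, and it is deliberately not a general JSON parser (it does not handle escaped quotes).

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the string value of the first "key": "value" pair found at or after
 * json into out. Returns a pointer just past the value's closing quote so
 * that the next search can continue from there, or NULL if the key is not
 * found or out is too small.
 */
const char *json_get_string(const char *json, const char *key,
                            char *out, size_t out_size)
{
    size_t key_len = strlen(key);
    const char *p = json;

    while ((p = strchr(p, '"')) != NULL) {
        /* Match "key" exactly, including both quotes. */
        if (strncmp(p + 1, key, key_len) == 0 && p[1 + key_len] == '"') {
            const char *v = p + key_len + 2;
            while (*v == ' ' || *v == ':')
                v++;                      /* skip the separator */
            if (*v != '"')
                return NULL;              /* value is not a string */
            v++;
            const char *end = strchr(v, '"');
            if (end == NULL || (size_t)(end - v) >= out_size)
                return NULL;
            memcpy(out, v, (size_t)(end - v));
            out[end - v] = '\0';
            return end + 1;
        }
        p++;
    }
    return NULL;
}
```

Searching for "date" and then continuing from the returned pointer walks through the entries of the casts array one day at a time.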

2.3 VIT custom commands

The maestro code in the RT1060 SDK shows that the SDK already supports the VIT library. What is VIT?

VIT stands for Voice Intelligent Technology. The library provides voice recognition services designed to wake up a device and recognize specific commands for controlling IoT and smart home devices.

Fig 7 VIT system block diagram

In the NXP RT1060 SDK code, a pre-generated wake word and command set is provided in the VIT_Model.h file. How can the wake word and commands be customized for a customer's project? NXP provides a web page where customers choose their own wake word and commands and then generate the corresponding VIT_Model.h file for the code to use. The VIT model generation page is:

https://vit.nxp.com/#/home

Log in with an NXP account, then choose the RT chip part number, the wake word, and the voice commands. Note that the currently supported RT chips are:

RT1060,RT1160,RT1170,RT600,RT500

The following is the example for generating wakeup word and voice command:

Fig 8 Custom VIT configuration

Fig 9 generated result

Download the generated model to get VIT_Model_cn.h. Open it to see the command information and the model data stored in the const PL_MEM_ALIGN (PL_UINT8 VIT_Model_cn[], VIT_MODEL_ALIGN_BYTES) array. The command information is as follows:

WakeWord supported : " 小恩小恩 "

Voice Commands supported

Cmd_Id : Cmd_Name

0 : UNKNOWN

1 : 开灯

2 : 关灯

3 : 今天天气

4 : 明天天气

5 : 后天天气
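For readability in application code, the command IDs above could be given names. This is only a sketch: the enum and the helper weather_day_offset are illustrative and do not appear in the SDK or in the generated model file. Only the numeric IDs come from the listing above.

```c
/* Command IDs as listed in the generated model header. */
typedef enum {
    CMD_UNKNOWN           = 0,
    CMD_LIGHT_ON          = 1,  /* 开灯 */
    CMD_LIGHT_OFF         = 2,  /* 关灯 */
    CMD_WEATHER_TODAY     = 3,  /* 今天天气 */
    CMD_WEATHER_TOMORROW  = 4,  /* 明天天气 */
    CMD_WEATHER_DAY_AFTER = 5   /* 后天天气 */
} voice_cmd_t;

/*
 * Return the forecast day offset (0 = today, 1 = tomorrow, 2 = the day
 * after tomorrow) for a weather command, or -1 for any other ID.
 */
int weather_day_offset(int cmd_id)
{
    if (cmd_id >= CMD_WEATHER_TODAY && cmd_id <= CMD_WEATHER_DAY_AFTER)
        return cmd_id - CMD_WEATHER_TODAY;
    return -1;
}
```

With this mapping, the day offset can be used directly as an index into the forecast entries returned by the weather API.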

Use the RT1060 SDK maestro_record demo to test this custom command result:

Fig 10 Custom Wakeup word and voice command test

The test result shows that both the wake word and the voice commands are detected.

3. Software code

3.1 LWIP socket client code capture weather API

Section 2.2 showed that the weather API works and returns weather data, so the request can now be added to the system. The weather API is called from LwIP code based on the RT1060 SDK, in the form of a socket client. The relevant code is as follows:

#define PORT            80
#define IP_ADDR        "59.82.9.133"
uint8_t get_weather[]= "GET /v3/weather/weatherInfo?key=xxx&city=310000&extensions=all&output=JSON HTTP/1.1\r\nHost: restapi.amap.com\r\n\r\n\r\n\r\n";
    if (sys_thread_new("weather_main", weathermain_thread, NULL, HTTPD_STACKSIZE, HTTPD_PRIORITY) == NULL)
        LWIP_ASSERT("main(): Task creation failed.", 0);
static void weathermain_thread(void *arg)
{
    static struct netif netif;
    ip4_addr_t netif_ipaddr, netif_netmask, netif_gw;
    ethernetif_config_t enet_config = {
        .phyHandle  = &phyHandle,
        .macAddress = configMAC_ADDR,
    };
    LWIP_UNUSED_ARG(arg);
    mdioHandle.resource.csrClock_Hz = EXAMPLE_CLOCK_FREQ;
    IP4_ADDR(&netif_ipaddr, configIP_ADDR0, configIP_ADDR1, configIP_ADDR2, configIP_ADDR3);
    IP4_ADDR(&netif_netmask, configNET_MASK0, configNET_MASK1, configNET_MASK2, configNET_MASK3);
    IP4_ADDR(&netif_gw, configGW_ADDR0, configGW_ADDR1, configGW_ADDR2, configGW_ADDR3);
    tcpip_init(NULL, NULL);
    netifapi_netif_add(&netif, &netif_ipaddr, &netif_netmask, &netif_gw, &enet_config, EXAMPLE_NETIF_INIT_FN,
                       tcpip_input);
    netifapi_netif_set_default(&netif);
    netifapi_netif_set_up(&netif);
    PRINTF("\r\n************************************************\r\n");
    PRINTF(" TCP client example\r\n");
    PRINTF("************************************************\r\n");
    PRINTF(" IPv4 Address     : %u.%u.%u.%u\r\n", ((u8_t *)&netif_ipaddr)[0], ((u8_t *)&netif_ipaddr)[1],
           ((u8_t *)&netif_ipaddr)[2], ((u8_t *)&netif_ipaddr)[3]);
    PRINTF(" IPv4 Subnet mask : %u.%u.%u.%u\r\n", ((u8_t *)&netif_netmask)[0], ((u8_t *)&netif_netmask)[1],
           ((u8_t *)&netif_netmask)[2], ((u8_t *)&netif_netmask)[3]);
    PRINTF(" IPv4 Gateway     : %u.%u.%u.%u\r\n", ((u8_t *)&netif_gw)[0], ((u8_t *)&netif_gw)[1],
           ((u8_t *)&netif_gw)[2], ((u8_t *)&netif_gw)[3]);
    PRINTF("************************************************\r\n");
    sys_thread_new("weather", weather_thread, NULL, DEFAULT_THREAD_STACKSIZE, DEFAULT_THREAD_PRIO);
    vTaskDelete(NULL);
}
static void weather_thread(void *arg)
{
	  int sock = -1,rece;
	  struct sockaddr_in client_addr;
	  char* host_ip;
	  ip4_addr_t dns_ip;
	  err_t err;
	  uint32_t *pSDRAM= pvPortMalloc(BUF_LEN);//
	    host_ip = HOST_NAME;
	    PRINTF("host name : %s , host_ip : %s\r\n",HOST_NAME,host_ip);
	  while(1)
	  {
	       sock = socket(AF_INET, SOCK_STREAM, 0);
	       if (sock < 0)
	       {
	         PRINTF("Socket error\n");
	         vTaskDelay(10);
	         continue;
	        }
	       client_addr.sin_family = AF_INET;
	       client_addr.sin_port = htons(PORT);
	       client_addr.sin_addr.s_addr = inet_addr(host_ip);
	       memset(&(client_addr.sin_zero), 0, sizeof(client_addr.sin_zero));
	       if (connect(sock, (struct sockaddr *)&client_addr,  sizeof(struct sockaddr)) == -1)
	       {
	          PRINTF("Connect failed!\n");
	          closesocket(sock);
	          vTaskDelay(10);
	          continue;
	       }
	       PRINTF("Connect to server successful!\r\n");
	       write(sock,get_weather,sizeof(get_weather) - 1);// -1: do not send the string's terminating NUL
	       while (1)
	       {
	          rece = recv(sock, (uint8_t*)pSDRAM, BUF_LEN, 0);//BUF_LEN
	          if (rece <= 0)
	            break;
               memcpy(weather_data.weather_info, pSDRAM,1500);//max 1457
	       }
	       Weather_process();
	       memset(pSDRAM,0,BUF_LEN);
	       closesocket(sock);
	    vTaskDelay(10000);
	  }
}
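The data returned by recv() is a complete HTTP response, headers included. The following is a minimal sketch, assuming the whole response sits in one NUL-terminated buffer, of checking the status line and locating the JSON body before any parsing. http_body_if_ok is a hypothetical helper, not part of the SDK or LwIP.

```c
#include <stddef.h>
#include <string.h>

/*
 * Given a NUL-terminated HTTP response, return a pointer to the body if the
 * status line reports 200, or NULL otherwise.
 */
const char *http_body_if_ok(const char *resp)
{
    /* The status line looks like "HTTP/1.1 200 OK". */
    if (strncmp(resp, "HTTP/1.", 7) != 0)
        return NULL;
    const char *sp = strchr(resp, ' ');
    if (sp == NULL || strncmp(sp + 1, "200", 3) != 0)
        return NULL;
    /* The headers end at the first empty line. */
    const char *body = strstr(resp, "\r\n\r\n");
    return body ? body + 4 : NULL;
}
```

Checking the status first avoids searching an error page for weather fields.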

3.2 VIT custom command detection code

Copy the generated VIT_Model_cn.h to this path in the maestro_record folder:

vit\RT1060_CortexM7\Lib

The wake word and voice command handling can be found in vit_pro.c. The main function involved is:

int VIT_Execute(void *arg, void *inputBuffer, int size)

The code is modified as follows, mainly to record the wake word flag and the voice command ID for later function control. The commands handled directly here are the local "开灯" (turn on the light) and "关灯" (turn off the light) commands. The weather commands need to call the socket client API, so they are handled in the main LwIP task according to the recognized command ID:

if (VIT_DetectionResults == VIT_WW_DETECTED)
    {
        PRINTF(" - WakeWord detected \r\n");
        weather_data.ww_flag =  1; //kerry
    }
    else if (VIT_DetectionResults == VIT_VC_DETECTED)
    {
        // Retrieve id of the Voice Command detected
        // String of the Command can also be retrieved (when WW and CMDs strings are integrated in Model)
        VIT_Status = VIT_GetVoiceCommandFound(VITHandle, &VoiceCommand);
        if (VIT_Status != VIT_SUCCESS)
        {
            PRINTF("VIT_GetVoiceCommandFound error: %d\r\n", VIT_Status);
            return VIT_Status; // will stop processing VIT and go directly to MEM free
        }
        else
        {
            PRINTF(" - Voice Command detected %d", VoiceCommand.Cmd_Id);
            weather_data.vc_index = VoiceCommand.Cmd_Id;//kerry 1:ledon 2:ledoff 3:today weather 4:tomorrow weather 5:aftertomorrow weather
            if(weather_data.vc_index == 1)//1
            {
        		GPIO_PinWrite(GPIO1, 3, 1U); //pull high
        		PRINTF(" led on!\r\n");
            }
            else if(weather_data.vc_index == 2)//2
            {
        		GPIO_PinWrite(GPIO1, 3, 0U); //pull low
        		PRINTF(" led off!\r\n");
            }
            // Retrieve CMD Name: OPTIONAL
            // Check first if CMD string is present
            if (VoiceCommand.pCmd_Name != PL_NULL)
            {
                PRINTF(" %s\r\n", VoiceCommand.pCmd_Name);
            }
            else
            {
                PRINTF("\r\n");
            }
        }
    }
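The flags written in VIT_Execute are read later by the weather task. A possible shape for that shared state is sketched below. Only the field names ww_flag, vc_index, and weather_info come from this article. The struct name, the field types, the buffer size, and the two helper functions are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout of the state shared between the VIT and weather tasks. */
typedef struct {
    volatile uint8_t ww_flag;    /* 1 after the wake word was detected      */
    volatile uint8_t vc_index;   /* last recognized Cmd_Id, 0 = none        */
    char weather_info[1500];     /* raw HTTP response from the weather API  */
} weather_state_t;

/* True when a wake word was followed by one of the weather commands (IDs 3..5). */
bool weather_request_pending(const weather_state_t *s)
{
    return s->ww_flag == 1 && s->vc_index >= 3 && s->vc_index <= 5;
}

/* Clear the request so the same command is not served twice. */
void weather_request_clear(weather_state_t *s)
{
    s->ww_flag  = 0;
    s->vc_index = 0;
}
```

The flags are marked volatile in this sketch because one task writes them and another reads them. A queue or semaphore from the RTOS would be a more robust alternative.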

3.3 Fetching weather information by voice command

Inside the while loop of weather_thread, check the wake word flag and the voice command ID. If they match a weather request, create the socket connection, send the API request, and capture the weather data.

The related code is:

 while(1)
{
	  //add the command request, only cmd == weather flag, then call it.
        if((weather_data.ww_flag ==  1))
        {
    	    if(weather_data.vc_index >= 3)
    	   {
		   // create connection
                //write API and read API
	       Weather_process();
    	    }
	     memset(weather_data.weather_info, 0, sizeof(weather_data.weather_info));
 	     weather_data.ww_flag = 0;
 	     weather_data.vc_index = 0;
        }
	    vTaskDelay(10000);
}
void Weather_process(void)
{
 char * datap, *datap1;
     datap = strstr((char*)weather_data.weather_info,"date");
     if(datap != NULL)
     {
    	 memcpy(today_weather, datap,184);//max 1457
    	 if(weather_data.vc_index == 3)
    	 {
    	    PRINTF("\r\n*******************today weather***********************************\n\r");
    	    PRINTF("%s\r\n",today_weather);
    	    return;
    	 }

     }
     else
    	 return;
     datap1 = strstr(datap+4,"date");
     if(datap1 != NULL)
     {
          memcpy(tomorr_weather, datap1,184);//max 1457
          if(weather_data.vc_index == 4)
          {
         	 PRINTF("\r\n*******************tomorrow weather*******************************\n\r");
         	 PRINTF("%s\r\n",tomorr_weather);
         	 return;
          }
     }
     else
    	 return;
     datap = strstr(datap1+4,"date");
     if(datap != NULL)
     {
          memcpy(aftertom_weather, datap,184);//max 1457
          if(weather_data.vc_index == 5)
          {
         	 PRINTF("\r\n*******************after tomorrow weather**************************\n\r");
         	 PRINTF("%s\r\n",aftertom_weather);
          }
     }
     else
    	 return;
}

The Weather_process function uses the recognized voice command ID to select the weather entry for the matching date and print it.

4. Test results

The test result video:
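Weather_process copies a fixed 184 bytes per entry. As an alternative sketch, the copy can stop at the closing brace of the entry and check the destination size. copy_forecast is a hypothetical function, and it assumes the response text is NUL-terminated.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the n-th forecast object (0 = today) from the JSON text into out,
 * starting at its "date" key and stopping before the closing '}'.
 * Returns the number of bytes copied, or 0 if the entry is missing or does
 * not fit into out.
 */
size_t copy_forecast(const char *json, int n, char *out, size_t out_size)
{
    const char *p = json;

    if (n < 0)
        return 0;
    /* Step to the (n+1)-th occurrence of the "date" key. */
    for (int i = 0; i <= n; i++) {
        p = strstr(i == 0 ? p : p + 1, "\"date\"");
        if (p == NULL)
            return 0;
    }
    const char *end = strchr(p, '}');
    if (end == NULL)
        return 0;
    size_t len = (size_t)(end - p);
    if (len == 0 || len >= out_size)
        return 0;
    memcpy(out, p, len);
    out[len] = '\0';
    return len;
}
```

With the command IDs used in this article, the entry index would be the command ID minus 3.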

(view in My Videos)

The printed log is shown in Fig 11. The test shows that the wake word and the voice commands are recognized successfully. When the recognized command ID is 3, 4, or 5 (the weather commands), the LwIP socket client API is called, and the weather information is obtained and printed.

Fig 11 system test print result

evkmimxrt1060_maestro_weather_backup.zip is the project without sound playback. The weather information is printed to the terminal.

5. Issues encountered

5.1 LWIP failed to get weather

When the code was first written, it used the HTTP code provided by Postman:

GET /v3/weather/weatherInfo?key=8f777fc7d867908eebbad7f96a13af10& city=310000& extensions=all& output=JSON HTTP/1.1

Host: restapi.amap.com

Add it to the socket API function:

uint8_t get_weather[]= "GET /v3/weather/weatherInfo?key=xxx&city=310000&extensions=all&output=JSON HTTP/1.1\r\nHost: restapi.amap.com\r\n\r\n\r\n\r\n";

The test result is:

Fig 12 socket weather API return issues

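The server connection is OK and the HTTP request returns data, but the response reports a parameter error. After checking, we switched to the request string from Postman's C code view (note that it has no spaces after the & separators, unlike the HTTP snippet above) and put it into get_weather: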

uint8_t get_weather[]= "GET /v3/weather/weatherInfo?key=xxx&city=310000&extensions=all&output=JSON HTTP/1.1\r\nHost: restapi.amap.com\r\n\r\n\r\n\r\n";

With this string, the weather data is captured, the same as in the Postman test.

5.2 Not enough memory after merging VIT and LWIP

After the maestro_record and LwIP socket code are merged, compiling fails with a DTCM memory overflow.

Fig 13 memory overflow

After optimization, the DTCM overflow remained, so the final choice was to reconfigure the FlexRAM:

OCRAM 192K, DTCM 256K, ITCM 64K

After recompiling, the memory overflow disappears:

Fig 14 FlexRAM reconfiguration

5.3 Printing Chinese characters in Tera Term

With the default Tera Term settings, Chinese text returned by the weather API is printed as garbled characters. The following configuration enables Chinese printing:
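The split of 192K + 256K + 64K adds up to the 512 KB of FlexRAM. The sketch below shows the arithmetic behind such a split. It assumes 16 banks of 32 KB and a two-bit code per bank (1 = OCRAM, 2 = DTCM, 3 = ITCM), which is my understanding of the i.MX RT1060 bank configuration field and should be checked against the reference manual. The function name and the bank ordering are arbitrary choices for illustration, not the project's actual configuration.

```c
#include <stdint.h>

#define FLEXRAM_BANK_KB 32u   /* assumed bank size on i.MX RT1060      */
#define FLEXRAM_BANKS   16u   /* assumed bank count (512 KB in total)  */

/* Assumed two-bit bank codes; verify against the reference manual. */
enum { BANK_OCRAM = 1, BANK_DTCM = 2, BANK_ITCM = 3 };

/*
 * Pack a FlexRAM split (sizes in KB) into a 32-bit bank configuration word.
 * Banks are assigned OCRAM first, then DTCM, then ITCM, starting at bank 0.
 * Returns 0 if a size is not a multiple of the bank size or the sizes do
 * not add up to the full FlexRAM.
 */
uint32_t flexram_bank_cfg(uint32_t ocram_kb, uint32_t dtcm_kb, uint32_t itcm_kb)
{
    if (ocram_kb % FLEXRAM_BANK_KB || dtcm_kb % FLEXRAM_BANK_KB ||
        itcm_kb % FLEXRAM_BANK_KB)
        return 0;

    uint32_t n_oc = ocram_kb / FLEXRAM_BANK_KB;
    uint32_t n_dt = dtcm_kb / FLEXRAM_BANK_KB;
    uint32_t n_it = itcm_kb / FLEXRAM_BANK_KB;
    if (n_oc + n_dt + n_it != FLEXRAM_BANKS)
        return 0;

    uint32_t cfg = 0;
    for (uint32_t bank = 0; bank < FLEXRAM_BANKS; bank++) {
        uint32_t code = (bank < n_oc)        ? BANK_OCRAM
                      : (bank < n_oc + n_dt) ? BANK_DTCM
                                             : BANK_ITCM;
        cfg |= code << (2u * bank);   /* two bits per bank */
    }
    return cfg;
}
```

For the split used here, this corresponds to 6 OCRAM banks, 8 DTCM banks, and 2 ITCM banks. The linker memory regions have to be changed to match whatever split is chosen.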

Setup -> Terminal

Locale : american->chinese

Codepage : 65001 ->936

Fig 15 Tera Term Chinese word print

In summary, after collecting information and solving these problems, the MIMXRT1060-EVK board combined with the official SDK implements custom VIT voice commands that fetch real-time weather and control a local LED. So even a standard RT part that is not in the S/L/A/F series can use VIT to implement speech recognition.

6. Adding audio playback

This chapter shows how to add audio playback using MP3 audio data stored in memory. Playing back real-time weather data this way is not very flexible: the weather data has to be checked and the matching clips selected from an MP3 audio library, because this is not an online TTS method.

So only one example of added audio playback is shared here:

WW : “小恩小恩” -> “小恩来了，请吩咐！”

VC ：“今天天气” -> “温度32.1度”

The VC playback is fixed for now. To play real data, an MP3 voice data library would have to be generated, and the correct MP3 data array assembled and played according to the returned weather information. This is a little complicated but not difficult, so a single fixed clip is used here as an example.

6.1 MP3 playback audio data preparation

Audio playback requires converting the Chinese text into MP3 files, which can be done with online speech synthesis software. Baidu online speech synthesis is used here. See chapter 2.2.2 (online speech synthesis) of the previous article:

https://community.nxp.com/t5/i-MX-RT-Knowledge-Base/RT106L-S-voice-control-system-based-on-the-Baidu...

If the MP3 file generated by Baidu online speech synthesis is converted to a C array directly, the first audio playback has problems, so Audacity is used to convert the MP3 file first. The conversion configuration is:

Fig 16 Audacity convert configuration

After regenerating the MP3, use xxd.exe to convert the MP3 file into a C array, then place it in RT internal memory or external flash. xxd.exe can be found at the following link:

https://github.com/baldram/ESP_VS1053_Library/issues/18

The conversion command looks like this:

xxd -i your-sound.mp3 ready-to-use-header.c

Convert xiaoencoming.mp3 and temptest.mp3 to C arrays, then adjust the data in the generated C files and save them as xiaoencoming.h and temptest.h.

Here, take xiaoencoming.h as an example:

#define XIAOEN_MP3_SIZE 6847
unsigned char xiaoencoming_mp3[XIAOEN_MP3_SIZE] = {
0x49, 0x44, 0x33, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x21, 0x54, 0x58,
…
0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55
};
unsigned int xiaoencoming1_mp3_len = XIAOEN_MP3_SIZE;//6847;

At this point, the playback audio data is ready.

Copy xiaoencoming.h and temptest.h to project path:

evkmimxrt1060_maestro_weather_mp3\source

6.2 Play the MP3 data from memory

The related code is shared below.

6.2.1 app_streamer.c added code

#include "xiaoencoming.h"
#include "temptest.h"
void *voice_inBuf                   = NULL;
void *voice_outBuf                  = NULL;
status_t STREAMER_file_Create(streamer_handle_t *handle, char *filename, int eap_par)
{
    STREAMER_CREATE_PARAM params;
    OsaThreadAttr thread_attr;
    int ret;
	ELEMENT_PROPERTY_T prop;
	MEMSRC_SET_BUFFER_T inBufInfo = {0};
	SET_BUFFER_DESC_T outBufInfo  = {0};
	PRINTF("Kerry test begin!\r\n");
	/* Compare the file names by content; == would only compare the pointers. */
	if(strcmp(filename, "temptest.mp3") == 0)
		inBufInfo  = (MEMSRC_SET_BUFFER_T){.location = (int8_t *)temptest_mp3, .size = TEMPtest_MP3_SIZE};
	else if(strcmp(filename, "xiaoencoming.mp3") == 0)
		inBufInfo  = (MEMSRC_SET_BUFFER_T){.location = (int8_t *)xiaoencoming_mp3, .size = XIAOEN_MP3_SIZE};
    /* Create message process thread */
    osa_thread_attr_init(&thread_attr);
    osa_thread_attr_set_name(&thread_attr, STREAMER_MESSAGE_TASK_NAME);
    osa_thread_attr_set_stack_size(&thread_attr, STREAMER_MESSAGE_TASK_STACK_SIZE);
    ret = osa_thread_create(&msg_thread, &thread_attr, STREAMER_MessageTask, (void *)handle);
    osa_thread_attr_destroy(&thread_attr);
    if (ERRCODE_NO_ERROR != ret)
    {
        return kStatus_Fail;
    }
    /* Create streamer */
    strcpy(params.out_mq_name, APP_STREAMER_MSG_QUEUE);
    params.stack_size = STREAMER_TASK_STACK_SIZE;
    params.pipeline_type = STREAM_PIPELINE_MEM;
    params.task_name    = STREAMER_TASK_NAME;
    params.in_dev_name   = "buffer";
    params.out_dev_name  = "speaker";
    handle->streamer = streamer_create(&params);
    if (!handle->streamer)
    {
        return kStatus_Fail;
    }
    prop.prop = PROP_DECODER_DECODER_TYPE;
    prop.val  = (uintptr_t)DECODER_TYPE_MP3;
    ret       = streamer_set_property(handle->streamer, prop, true);
    if (ret != STREAM_OK)
    {
        streamer_destroy(handle->streamer);
        handle->streamer = NULL;
        return kStatus_Fail;
    }
    prop.prop = PROP_MEMSRC_SET_BUFF;
    prop.val  = (uintptr_t)&inBufInfo;
    ret       = streamer_set_property(handle->streamer, prop, true);
    if (ret != STREAM_OK)
    {
        streamer_destroy(handle->streamer);
        handle->streamer = NULL;

        return kStatus_Fail;
    }
    handle->audioPlaying = false;
error:
    PRINTF("End STREAMER_file_Create\r\n");
    PRINTF("Kerry test end!\r\n");
    return kStatus_Success;
}

The code creates the message thread, creates a streamer, configures it to play from memory, sets the decoder type to MP3, and points it at an MP3 array in memory. The array is selected according to the file name passed in.
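When more clips are added, the name-to-array selection can be kept in a small table instead of a growing if/else chain. The sketch below is one way to do it. The type mem_clip_t, the table, and find_clip are hypothetical, and the two short arrays only stand in for the real data from xiaoencoming.h and temptest.h.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in clip data; the project would reference its real MP3 arrays. */
static const unsigned char clip_a[] = {0x49, 0x44, 0x33};
static const unsigned char clip_b[] = {0x49, 0x44, 0x33, 0x03};

typedef struct {
    const char *name;            /* file name used by the caller */
    const unsigned char *data;   /* MP3 data in memory           */
    size_t size;                 /* length of the data in bytes  */
} mem_clip_t;

static const mem_clip_t clip_table[] = {
    {"xiaoencoming.mp3", clip_a, sizeof(clip_a)},
    {"temptest.mp3",     clip_b, sizeof(clip_b)},
};

/* Look up a clip by file name; returns NULL if the name is unknown. */
const mem_clip_t *find_clip(const char *filename)
{
    for (size_t i = 0; i < sizeof(clip_table) / sizeof(clip_table[0]); i++) {
        if (strcmp(filename, clip_table[i].name) == 0)
            return &clip_table[i];
    }
    return NULL;
}
```

A NULL result lets the caller reject an unknown name before any streamer resources are created.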

6.2.2 cmd.c added code

void play_file(char *filename, int eap_par)
{
    STREAMER_Init();
    int ret = STREAMER_file_Create(&streamerHandle, filename, eap_par);
    if (ret != kStatus_Success)
    {
        PRINTF("STREAMER_file_Create failed\r\n");
        goto file_error;
    }

    STREAMER_Start(&streamerHandle);
    PRINTF("Starting playback\r\n");
    file_playing = true;
    while (streamerHandle.audioPlaying)
    {
        osa_time_delay(100);
    }
    file_playing = false;

file_error:
    PRINTF("[play_file] Cleanup\r\n");
    STREAMER_Destroy(&streamerHandle);
    osa_time_delay(100);
}

play_file calls the STREAMER_file_Create API function, starts playback, waits for playback to finish, and then releases the streamer.

The shellRecMIC API function adds a check of the flags recorded by VIT, which are used to play the feedback audio file.

static shell_status_t shellRecMIC(shell_handle_t shellHandle, int32_t argc, char **argv)
{
…
     //kerry
    PRINTF("Kerry MP3 stream data test!\r\n");
    PRINTF("---weather_data.ww_flag =%d--\r\n ", weather_data.ww_flag);
    PRINTF("---weather_data.vc_inde =%d--\r\n ", weather_data.vc_index);
    PRINTF("---weather_data.mp3_flag =%d--\r\n ", weather_data.mp3_flag);
    if(weather_data.ww_flag == 1)
    {
    	play_file("xiaoencoming.mp3", 0);
    }
    if(weather_data.vc_index == 3)
    {
    	play_file("temptest.mp3", 0);
    }

    if(weather_data.mp3_flag != 0)
    {
	     weather_data.ww_flag = 0;
	     weather_data.vc_index = 0;
    }
    weather_data.mp3_flag = 0;
    /* Delay for cleanup */
    osa_time_delay(100);
    return kStatus_SHELL_Success;
}

If the wake word "小恩小恩" is detected, the feedback audio "小恩来了请吩咐" is played.

If the voice command "今天天气" is detected, the feedback audio "温度32.1度" is played. Note that this playback is only an example with fixed audio. You can also create a library of word audio clips and, according to the received weather information, combine the related clips and play them back. This is a little complicated but not difficult. For fully free-form audio, a real-time online TTS method can also be considered.
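To illustrate the clip-library idea, the sketch below turns an integer temperature such as the "daytemp" value into the sequence of clips that reads it aloud in Chinese. Everything here is hypothetical: it assumes a library with one clip per digit (indices 0 to 9) plus one clip for "十", and it covers only 0 to 99.

```c
#include <stddef.h>

#define CLIP_TEN 10   /* hypothetical clip index for "十"; 0..9 are the digits */

/*
 * Fill clips[] with the clip indices that read an integer 0..99 aloud in
 * Chinese (for example 28 -> 二 十 八). Returns the number of clips, or 0 if
 * the value is out of range or clips[] has room for fewer than 3 entries.
 */
size_t number_to_clips(int value, int *clips, size_t max_clips)
{
    size_t n = 0;

    if (value < 0 || value > 99 || max_clips < 3)
        return 0;
    if (value < 10) {
        clips[n++] = value;          /* single digit, including 零 for 0 */
        return n;
    }
    if (value >= 20)
        clips[n++] = value / 10;     /* tens digit is spoken except for 10..19 */
    clips[n++] = CLIP_TEN;
    if (value % 10 != 0)
        clips[n++] = value % 10;     /* units digit, omitted for 10, 20, ... */
    return n;
}
```

The resulting indices could then be played one after another with a function like play_file, followed by a fixed clip for the unit.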

6.2.3 VIT WW and VC flag

VIT_Execute function

int VIT_Execute(void *arg, void *inputBuffer, int size)
{    
…
    if (VIT_DetectionResults == VIT_WW_DETECTED)
    {
        PRINTF(" - WakeWord detected \r\n");
        weather_data.ww_flag =  1; //kerry
        weather_data.mp3_flag = 1;
    }
    else if (VIT_DetectionResults == VIT_VC_DETECTED)
    {
        // Retrieve id of the Voice Command detected
        // String of the Command can also be retrieved (when WW and CMDs strings are integrated in Model)
        VIT_Status = VIT_GetVoiceCommandFound(VITHandle, &VoiceCommand);
        if (VIT_Status != VIT_SUCCESS)
        {
            PRINTF("VIT_GetVoiceCommandFound error: %d\r\n", VIT_Status);
            return VIT_Status; // will stop processing VIT and go directly to MEM free
        }
        else
        {
            PRINTF(" - Voice Command detected %d", VoiceCommand.Cmd_Id);
            weather_data.vc_index = VoiceCommand.Cmd_Id;//kerry 1:ledon 2:ledoff 3:today weather 4:tomorrow weather 5:aftertomorrow weather
            weather_data.mp3_flag = 2;
            if(weather_data.vc_index == 1)//1
            {
        		GPIO_PinWrite(GPIO1, 3, 1U); //pull high
        		PRINTF(" led on!\r\n");

            }
            else if(weather_data.vc_index == 2)//2
            {
        		GPIO_PinWrite(GPIO1, 3, 0U); //pull low
        		PRINTF(" led off!\r\n");
            }

            // Retrieve CMD Name: OPTIONAL
            // Check first if CMD string is present
            if (VoiceCommand.pCmd_Name != PL_NULL)
            {
                PRINTF(" %s\r\n", VoiceCommand.pCmd_Name);

            }
            else
            {
                PRINTF("\r\n");
            }
        }
    }
    return VIT_Status;
}

At this point, all the code has been added.

6.2.4 Audio playback test result

This is the audio playback test result:

(view in My Videos)

Fig 17 playback audio log

The test result shows that MP3 data stored in memory can also be played back as audio.

The code project is: evkmimxrt1060_maestro_weather_mp3.zip.
