Real-time DSP using the LPC55S69 i2s Examples


2,819 Views
mwickert
Contributor II

I am currently experimenting with the LPC55S69 I2S examples i2s_interrupt_record_playback and i2s_dma_record_playback. My starting point has been the interrupt version. I want low latency and the ability to perform DSP math on the signal samples arriving from the ADC side of the wm8904 codec and then send the modified samples to the DAC side of the codec. The key code, as I see it, is the two callbacks:

static void TxCallback(I2S_Type *base, i2s_handle_t *handle, status_t completionStatus, void *userData)
{
    /* Enqueue the same original s_Buffer all over again */
    i2s_transfer_t *transfer = (i2s_transfer_t *)userData;
    I2S_TxTransferNonBlocking(base, handle, *transfer);
}

static void RxCallback(I2S_Type *base, i2s_handle_t *handle, status_t completionStatus, void *userData)
{
    /* Enqueue the same original s_Buffer all over again */
    i2s_transfer_t *transfer = (i2s_transfer_t *)userData;
    I2S_RxTransferNonBlocking(base, handle, *transfer);
}

Where should I be doing the DSP math? What changes to the I2S code can or should be made to maximize the processing time available to the DSP math operations? I would ultimately like to be using the PowerQuad and related CMSIS-DSP functions. When I use a logic analyzer to monitor entry to the two callbacks, I see a lot of "thrashing" between them. Can I change the configuration of the callbacks so that the Tx waits until the Rx is done with sample processing and can then update the Tx buffer? Are there app notes that I can read?

Also, I am using MCUXpresso IDE v11.1.1 [Build 3241] [2020-03-02] and 55S69 SDK 2.7.1.

Thanks for any help you can provide.

6 Replies

2,594 Views
mwickert
Contributor II

Hello Sabina,

I have finally taken some time to look at the documents you referenced. I have not read them in detail, but they do appear to be useful. I am indeed familiar with ping-pong input and output buffers and the fact that the latency will be no less than the sum of the buffer lengths, say N_in + N_out. Without reading the papers in depth, it is still unclear to me whether I need to create another IRQ to do my desired signal processing with the ping-pong buffers, or integrate my processing into either the Rx or Tx callback. I guess I need to read the referenced papers to understand this better.
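As a rough worked example of the latency bound mentioned above, the sketch below converts ping-pong buffer lengths into a minimum delay. It assumes buffer lengths are counted in stereo sample frames and uses the 48 kHz rate from the SDK example; the helper name `pingpong_latency_ms` is hypothetical and not part of the SDK.

```c
#include <assert.h>

/* Hypothetical helper (not an SDK function): lower bound on end-to-end
 * latency for ping-pong buffering, in milliseconds.
 * n_in and n_out are buffer lengths in sample frames (one frame = L + R). */
static double pingpong_latency_ms(unsigned n_in, unsigned n_out, unsigned fs_hz)
{
    assert(fs_hz > 0U); /* a zero sample rate is meaningless */
    return 1000.0 * (double)(n_in + n_out) / (double)fs_hz;
}
```

With 50-frame input and output buffers at 48 kHz this evaluates to 100/48000 s, roughly 2.08 ms. Any delay inside the codec's own converters and filters would come on top of that figure.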

In the end I do think you have pushed me in the right direction, and hopefully I will reach my objective.

Thank you

Mark Wickert 

0 Kudos

2,594 Views
mwickert
Contributor II

I have moved forward with a ping-pong approach using I2S interrupts, placing the DSP workload (transfer and processing of input samples to output samples) in the while {} loop at the end of main {}. My thought was that while the Tx() and Rx() I2S callbacks were doing their thing of shuffling ADC/DAC samples via I2S, there would be plenty of processor time to do some processing on the samples passing from input to output. This does not work as expected, and now I am stuck on what to try next. Samples are getting dropped regardless of the buffer length I choose, so the waveform is discontinuous. My code listing is below, followed by a scope screenshot. Note that here I am using a sinusoid LUT to send 1 kHz and 2 kHz sinusoid samples to the output buffers, but input-to-output streaming behaves similarly.

/*

 * Copyright (c) 2016, Freescale Semiconductor, Inc.

 * Copyright 2016-2017 NXP

 * All rights reserved.

 *

 * SPDX-License-Identifier: BSD-3-Clause

 */

/*******************************************************************************

 * Includes

 ******************************************************************************/

#include "fsl_device_registers.h"

#include "fsl_debug_console.h"

#include "board.h"

#include "fsl_i2c.h"

#include "fsl_i2s.h"

#include "fsl_wm8904.h"

#include "music.h"

#include "fsl_codec_common.h"

#include <stdbool.h>

#include "pin_mux.h"

#include "fsl_sysctl.h"

#include "fsl_codec_adapter.h"

/*******************************************************************************

 * Definitions

 ******************************************************************************/

#define DEMO_I2C (I2C4)

#define DEMO_I2S_MASTER_CLOCK_FREQUENCY 24576000

#define DEMO_I2S_TX (I2S7)

#define DEMO_I2S_RX (I2S6)

#define DEMO_I2S_CLOCK_DIVIDER (CLOCK_GetPll0OutFreq() / 48000U / 16U / 2U)

#define DEMO_I2S_RX_MODE (kI2S_MasterSlaveNormalSlave)

#define DEMO_I2S_TX_MODE (kI2S_MasterSlaveNormalMaster)

#define DEMO_AUDIO_BIT_WIDTH (16)

#define DEMO_AUDIO_SAMPLE_RATE (48000)

#define DEMO_AUDIO_PROTOCOL kCODEC_BusI2S

// total latency will be 2*Nsamps

#define NSAMPS 50

// two 16-bit channels, i.e., left and right audio ++> 4*Nsamps

#define NBYTES 200

#define N_period 48

// 48 sample cosine lookup table with Apeak = 15000

int16_t onek_table[48] = {15000,  14871,  14488,  13858,  12990,

                          11900,  10606,   9131,   7500,   5740,

                           3882,   1957,      0,  -1957,  -3882,

                          -5740,  -7499,  -9131, -10606, -11900,

                         -12990, -13858, -14488, -14871, -15000,

                         -14871, -14488, -13858, -12990, -11900,

                         -10606,  -9131,  -7500,  -5740,  -3882,

                          -1957,      0,   1957,   3882,   5740,

                           7500,   9131,  10606,  11900,  12990,

                          13858,  14488,  14871};

int16_t table_indexL = 0;

int16_t table_indexR = 0;

/*******************************************************************************

 * Prototypes

 ******************************************************************************/

static void StartDigitalLoopback(void);

static void TxCallback(I2S_Type *base, i2s_handle_t *handle, status_t completionStatus, void *userData);

static void RxCallback(I2S_Type *base, i2s_handle_t *handle, status_t completionStatus, void *userData);

/*******************************************************************************

 * Variables

 ******************************************************************************/

wm8904_config_t wm8904Config = {

    .i2cConfig    = {.codecI2CInstance = BOARD_CODEC_I2C_INSTANCE, .codecI2CSourceClock = BOARD_CODEC_I2C_CLOCK_FREQ},

    .recordSource = kWM8904_RecordSourceLineInput,

    .recordChannelLeft  = kWM8904_RecordChannelLeft2,

    .recordChannelRight = kWM8904_RecordChannelRight2,

    .playSource         = kWM8904_PlaySourceDAC,

    .slaveAddress       = WM8904_I2C_ADDRESS,

    .protocol           = kWM8904_ProtocolI2S,

    .format             = {.sampleRate = kWM8904_SampleRate48kHz, .bitWidth = kWM8904_BitWidth16},

    .mclk_HZ            = DEMO_I2S_MASTER_CLOCK_FREQUENCY,

    .master             = false,

};

codec_config_t boardCodecConfig = {.codecDevType = kCODEC_WM8904, .codecDevConfig = &wm8904Config};

// Define ping/pong Tx and Rx buffers (4 total).

// Manipulate them using pointers

static int16_t *tx_buf;

static int16_t *rx_buf;

// Example started with N_bytes = 400

//__ALIGN_BEGIN static uint8_t s_Buffer[200] __ALIGN_END; /* 100 samples => time about 2 ms */

//Tx transfer Pair

static bool tx_pp = false;

static bool tx_ready = false;

__ALIGN_BEGIN static uint8_t tx_ping[NBYTES] __ALIGN_END;

__ALIGN_BEGIN static uint8_t tx_pong[NBYTES] __ALIGN_END;

//Rx transfer pair

static bool rx_pp = false;

static bool rx_ready = false;

__ALIGN_BEGIN static uint8_t rx_ping[NBYTES] __ALIGN_END;

__ALIGN_BEGIN static uint8_t rx_pong[NBYTES] __ALIGN_END;

static i2s_config_t s_TxConfig;

static i2s_config_t s_RxConfig;

static i2s_handle_t s_TxHandle;

static i2s_handle_t s_RxHandle;

static i2s_transfer_t s_TxTransfer_ping;

static i2s_transfer_t s_RxTransfer_ping;

static i2s_transfer_t s_TxTransfer_pong;

static i2s_transfer_t s_RxTransfer_pong;

extern codec_config_t boardCodecConfig;

codec_handle_t codecHandle;

/*******************************************************************************

 * Code

 ******************************************************************************/

void BOARD_InitSysctrl(void)

{

    SYSCTL_Init(SYSCTL);

    /* select signal source for share set */

    SYSCTL_SetShareSignalSrc(SYSCTL, kSYSCTL_ShareSet0, kSYSCTL_SharedCtrlSignalSCK, kSYSCTL_Flexcomm7);

    SYSCTL_SetShareSignalSrc(SYSCTL, kSYSCTL_ShareSet0, kSYSCTL_SharedCtrlSignalWS, kSYSCTL_Flexcomm7);

    /* select share set for special flexcomm signal */

    SYSCTL_SetShareSet(SYSCTL, kSYSCTL_Flexcomm7, kSYSCTL_FlexcommSignalSCK, kSYSCTL_ShareSet0);

    SYSCTL_SetShareSet(SYSCTL, kSYSCTL_Flexcomm7, kSYSCTL_FlexcommSignalWS, kSYSCTL_ShareSet0);

    SYSCTL_SetShareSet(SYSCTL, kSYSCTL_Flexcomm6, kSYSCTL_FlexcommSignalSCK, kSYSCTL_ShareSet0);

    SYSCTL_SetShareSet(SYSCTL, kSYSCTL_Flexcomm6, kSYSCTL_FlexcommSignalWS, kSYSCTL_ShareSet0);

}

/*!

 * @brief Main function

 */

int main(void)

{

static int16_t n;

    CLOCK_EnableClock(kCLOCK_InputMux);

    CLOCK_EnableClock(kCLOCK_Iocon);

    CLOCK_EnableClock(kCLOCK_Gpio0);

    CLOCK_EnableClock(kCLOCK_Gpio1);

    /* USART0 clock */

    CLOCK_AttachClk(BOARD_DEBUG_UART_CLK_ATTACH);

    /* I2C clock */

    CLOCK_AttachClk(kFRO12M_to_FLEXCOMM4);

    PMC->PDRUNCFGCLR0 |= PMC_PDRUNCFG0_PDEN_XTAL32M_MASK;   /*!< Ensure XTAL16M is on  */

    PMC->PDRUNCFGCLR0 |= PMC_PDRUNCFG0_PDEN_LDOXO32M_MASK;  /*!< Ensure XTAL16M is on  */

    SYSCON->CLOCK_CTRL |= SYSCON_CLOCK_CTRL_CLKIN_ENA_MASK; /*!< Ensure CLK_IN is on  */

    ANACTRL->XO32M_CTRL |= ANACTRL_XO32M_CTRL_ENABLE_SYSTEM_CLK_OUT_MASK;

    CLOCK_AttachClk(kEXT_CLK_to_PLL0);

    const pll_setup_t pll0Setup = {

        .pllctrl = SYSCON_PLL0CTRL_CLKEN_MASK | SYSCON_PLL0CTRL_SELI(8U) | SYSCON_PLL0CTRL_SELP(31U),

        .pllndec = SYSCON_PLL0NDEC_NDIV(125U),

        .pllpdec = SYSCON_PLL0PDEC_PDIV(8U),

        .pllsscg = {0x0U, (SYSCON_PLL0SSCG1_MDIV_EXT(3072U) | SYSCON_PLL0SSCG1_SEL_EXT_MASK)},

        .pllRate = 24576000U,

        .flags   = PLL_SETUPFLAG_WAITLOCK,

    };

    /*!< Configure PLL to the desired values */

    CLOCK_SetPLL0Freq(&pll0Setup);

    /* Attach PLL clock to MCLK for I2S, no divider */

    CLOCK_AttachClk(kPLL0_to_MCLK);

    SYSCON->MCLKDIV = SYSCON_MCLKDIV_DIV(0U);

    SYSCON->MCLKIO  = 1U;

    CLOCK_SetClkDiv(kCLOCK_DivPll0Clk, 0U, true);

    CLOCK_SetClkDiv(kCLOCK_DivPll0Clk, 1U, false);

    /*!< Switch PLL0 clock source selector to XTAL16M */

    /* I2S clocks */

    CLOCK_AttachClk(kPLL0_DIV_to_FLEXCOMM6);

    CLOCK_AttachClk(kPLL0_DIV_to_FLEXCOMM7);

    /* reset FLEXCOMM for I2C */

    RESET_PeripheralReset(kFC4_RST_SHIFT_RSTn);

    /* reset FLEXCOMM for I2S */

    RESET_PeripheralReset(kFC6_RST_SHIFT_RSTn);

    RESET_PeripheralReset(kFC7_RST_SHIFT_RSTn);

    NVIC_ClearPendingIRQ(FLEXCOMM6_IRQn);

    NVIC_ClearPendingIRQ(FLEXCOMM7_IRQn);

    /* Enable interrupts for I2S */

    EnableIRQ(FLEXCOMM6_IRQn);

    EnableIRQ(FLEXCOMM7_IRQn);

    /* Initialize the rest */

    BOARD_InitPins();

    BOARD_BootClockFROHF96M();

    BOARD_InitDebugConsole();

    BOARD_InitSysctrl();

    PRINTF("Configure WM8904 codec\r\n");

    /* protocol: i2s

     * sampleRate: 48K

     * bitwidth:16

     */

    if (CODEC_Init(&codecHandle, &boardCodecConfig) != kStatus_Success)

    {

        PRINTF("WM8904_Init failed!\r\n");

    }

    /* Initial volume kept low for hearing safety. */

    /* Adjust it to your needs, 0x0006 for -51 dB, 0x0039 for 0 dB etc. */

    CODEC_SetVolume(&codecHandle, kCODEC_PlayChannelHeadphoneLeft | kCODEC_PlayChannelHeadphoneRight, 0x0030);

    PRINTF("Configure I2S\r\n");

    /*

     * masterSlave = kI2S_MasterSlaveNormalMaster;

     * mode = kI2S_ModeI2sClassic;

     * rightLow = false;

     * leftJust = false;

     * pdmData = false;

     * sckPol = false;

     * wsPol = false;

     * divider = 1;

     * oneChannel = false;

     * dataLength = 16;

     * frameLength = 32;

     * position = 0;

     * watermark = 4;

     * txEmptyZero = true;

     * pack48 = false;

     */

    I2S_TxGetDefaultConfig(&s_TxConfig);

    s_TxConfig.divider     = DEMO_I2S_CLOCK_DIVIDER;

    s_TxConfig.masterSlave = DEMO_I2S_TX_MODE;

    /*

     * masterSlave = kI2S_MasterSlaveNormalSlave;

     * mode = kI2S_ModeI2sClassic;

     * rightLow = false;

     * leftJust = false;

     * pdmData = false;

     * sckPol = false;

     * wsPol = false;

     * divider = 1;

     * oneChannel = false;

     * dataLength = 16;

     * frameLength = 32;

     * position = 0;

     * watermark = 4;

     * txEmptyZero = false;

     * pack48 = false;

     */

    I2S_RxGetDefaultConfig(&s_RxConfig);

    s_RxConfig.divider     = DEMO_I2S_CLOCK_DIVIDER; // was s_TxConfig.divider

    s_RxConfig.masterSlave = DEMO_I2S_RX_MODE;

    I2S_TxInit(DEMO_I2S_TX, &s_TxConfig);

    I2S_RxInit(DEMO_I2S_RX, &s_RxConfig);

    StartDigitalLoopback();

    while (1)
    {
        // Begin DSP processing using the int16_t pointers tx_buf and rx_buf to the ping or pong buffers
        // For a LUT sinusoid put table values in left and right channels using a modulo table index
        if ((tx_ready && rx_ready)) {
            for (n=0; n < NSAMPS; n++) {
                tx_buf[2*n] = onek_table[table_indexL]; //rx_buf[2*n]; // Process left samples
                tx_buf[2*n+1] = onek_table[table_indexR]; //rx_buf[2*n+1]; // Process right samples
                table_indexL = (table_indexL + 1) % N_period; // stride by 1 <=> 1 kHz
                table_indexR = (table_indexR + 2) % N_period; // stride by 2 <=> 2 kHz
            }
            tx_ready = false;
            rx_ready = false;
        }
    }
}

static void StartDigitalLoopback(void)

{

    PRINTF("Setup digital loopback\r\n");

    s_TxTransfer_ping.data     = &tx_ping[0]; // &s_Buffer[0];
    s_TxTransfer_ping.dataSize = sizeof(tx_ping); // was s_Buffer, etc.
    s_RxTransfer_ping.data     = &rx_ping[0]; // was s_Buffer, etc.
    s_RxTransfer_ping.dataSize = sizeof(rx_ping); // was s_Buffer, etc.
    s_TxTransfer_pong.data     = &tx_pong[0]; // &s_Buffer[0];
    s_TxTransfer_pong.dataSize = sizeof(tx_pong); // was s_Buffer, etc.
    s_RxTransfer_pong.data     = &rx_pong[0]; // was s_Buffer, etc.
    s_RxTransfer_pong.dataSize = sizeof(rx_pong); // was s_Buffer, etc.

    I2S_TxTransferCreateHandle(DEMO_I2S_TX, &s_TxHandle, TxCallback, NULL); //(void *)&s_TxTransfer);

    I2S_RxTransferCreateHandle(DEMO_I2S_RX, &s_RxHandle, RxCallback, NULL); //(void *)&s_RxTransfer);

    I2S_RxTransferNonBlocking(DEMO_I2S_RX, &s_RxHandle, s_RxTransfer_ping);

    I2S_TxTransferNonBlocking(DEMO_I2S_TX, &s_TxHandle, s_TxTransfer_ping);

    /* Enqueue next buffer right away so there is no drop in audio data stream when the first buffer finishes */

    I2S_RxTransferNonBlocking(DEMO_I2S_RX, &s_RxHandle, s_RxTransfer_pong);

    I2S_TxTransferNonBlocking(DEMO_I2S_TX, &s_TxHandle, s_TxTransfer_pong);

    // Start with pointer tx processing buffer pointed to pong

    tx_buf = (int16_t *) tx_pong;

    // Start with pointer rx processing buffer pointed to pong

    rx_buf = (int16_t *) rx_pong;

}

static void TxCallback(I2S_Type *base, i2s_handle_t *handle, status_t completionStatus, void *userData)
{
    /* Enqueue the same original buffer all over again */
    if (completionStatus == kStatus_I2S_BufferComplete)
    {
        tx_ready = true;
        /* Enqueue next buffer */
        if (tx_pp) {
            I2S_TxTransferNonBlocking(base, handle, s_TxTransfer_pong);
            // Set pointer to proper left & right interleaved tx processing buffer
            tx_buf = (int16_t *) tx_ping;
            // Begin filling the ping buffer
            tx_pp = false;
        }
        else {
            I2S_TxTransferNonBlocking(base, handle, s_TxTransfer_ping);
            // Set pointer to proper left & right interleaved tx processing buffer
            rx_buf = (int16_t *) tx_ping;
            // Begin filling the ping buffer
            tx_pp = true;
        }
        tx_ready = true;
    }
}

static void RxCallback(I2S_Type *base, i2s_handle_t *handle, status_t completionStatus, void *userData)
{
    //    /* Enqueue the same original buffer all over again */
    if (completionStatus == kStatus_I2S_BufferComplete)
    {
        // rx_ready = true;
        /* Enqueue next buffer */
        if (tx_pp) {
            I2S_RxTransferNonBlocking(base, handle, s_RxTransfer_pong);
            // Set pointer to proper left & right interleaved rx processing buffer
            rx_buf = (int16_t *) rx_ping;
            // Begin emptying the pong buffer
            rx_pp = false;
        }
        else {
            I2S_RxTransferNonBlocking(base, handle, s_RxTransfer_ping);
            // Set pointer to proper left & right interleaved rx processing buffer
            rx_buf = (int16_t *) rx_pong;
            // Begin emptying the ping buffer
            rx_pp = true;
        }
        rx_ready = true;
    }
}

Screenshot when using 50-sample input and output ping-pong buffers, which give ~2 ms end-to-end latency.

pastedImage_1.png 

Any suggestions? Should I have a dedicated interrupt while I transfer and process the samples?

Thanks

0 Kudos

2,594 Views
Sabina_Bruce
NXP Employee

Hope you are well. I've been checking your code in detail, along with the description you provided. I believe the interrupts are fine; they should remain as simple as possible so as not to introduce any unnecessary delays. I noticed that in your Tx callback you are filling the ping buffer in both branches of the if statement. Is this correct?
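One way to avoid selecting the same buffer in both branches is to track a single index and toggle it on every completion. The following is a minimal host-side sketch, not the SDK's method. It assumes the two buffers were queued at start-up in ping, pong order (as StartDigitalLoopback does) and that the driver completes them in that order, so the buffer that just finished is free while the other one is in flight. The names `pingpong_t` and `pingpong_on_complete` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one ping-pong pair (not part of the SDK). */
typedef struct {
    int16_t *buf[2];  /* buf[0] = ping, buf[1] = pong */
    unsigned done;    /* index of the buffer expected to complete next */
} pingpong_t;

/* Call once per buffer-complete event.
 * Returns the buffer that just finished; under the assumptions above it
 * is the one that may be refilled and re-queued. The index then toggles,
 * so consecutive calls can never return the same buffer twice. */
static int16_t *pingpong_on_complete(pingpong_t *pp)
{
    assert(pp != NULL);
    int16_t *free_buf = pp->buf[pp->done];
    pp->done ^= 1U;   /* 0 -> 1 -> 0 -> ... */
    return free_buf;
}
```

Inside TxCallback, the returned pointer would be assigned to tx_buf and the matching i2s_transfer_t passed to I2S_TxTransferNonBlocking. Using one toggle per direction keeps the Tx and Rx selections from depending on each other's state.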

An additional feature you could implement is to trigger a DMA transfer as soon as you finish processing the samples. You may use the SDK example called i2s_dma_record_playback. It is pretty much what you have in your code above, with DMA added.

Let me know your results.

Sabina

0 Kudos

2,594 Views
Sabina_Bruce
NXP Employee

Hello Mark,

Hope you are doing well.

The examples are created as a base/reference from which to begin your application. So, essentially, yes, you are able to modify the examples as you see fit; however, please consider that customization may affect the performance either positively or negatively. In addition, we cannot guarantee the expected behavior once the examples have been changed.

For example code and application note regarding the DSP using LPC55xx, please check the following links:

Digital Signal Processing for NXP LPC5500 Using PowerQuad 

AN12282SW.zip 

Towards the end of the document you will find a demo project set up to compare the PowerQuad and Arm CMSIS-DSP when they are running the same tasks, each configured to achieve its highest performance. Based on those demos you will find the considerations to take into account and where to place code.

Hope this helps!

Best Regards,

Sabina

-----------------------------------------------------------------------------------------------------------------------

Note: If this post answers your question, please click the Correct Answer button. Thank you!

----------------------------------------------------------------------------------------------------------------------- 

0 Kudos

2,594 Views
mwickert
Contributor II

Sabina,

Both of the reference links are helpful; I looked at the PowerQuad vs CMSIS-DSP document shortly after it was posted. I also understand that I can customize the examples. This is what I started to do last fall, but I had to step away for a while. Now that I have put the newest MCUXpresso on a new machine and have the newest LPC55S69 SDK, I want to make headway on understanding how to code real-time DSP using the I2S API (MCUXpresso SDK API Reference Manual: I2S Driver). I need help in understanding how to work with the I2S API to manage signal sample flow using the Tx and Rx callbacks that I pasted in my original post.

On other processors I have configured an interrupt (ISR) that fires when I2S codec data is ready to be processed by a DSP algorithm. In particular, suppose there is one 32-bit I2S word of 4 bytes (left and right 16-bit audio samples) that needs to be sent through a DSP algorithm one sample at a time (perhaps CMSIS-DSP filters). When the ISR fires, I write code in the ISR to process the two received 16-bit samples through a filter and then take the filtered samples and return them to a 4-byte word for I2S transmission back to the codec. My question is: how do I make this work using the NXP I2S API? I do not know how to use the two nonblocking callback functions in the example code with an ISR like the one I am describing in this paragraph. FYI, the "other" processor is a Cypress M4.
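The per-word step described above (unpack, process, repack) can be sketched independently of any driver. This is a minimal illustration only: it assumes the left sample sits in bits 15:0 and the right sample in bits 31:16 of the word, which matches the little-endian interleaving used in the listing (tx_buf[2*n] is left) but should be checked against the reference manual. All function names are hypothetical, and `process_sample` is a placeholder for a real filter.

```c
#include <stdint.h>

/* ASSUMPTION: left sample in bits 15:0, right sample in bits 31:16.
 * The uint16_t -> int16_t casts rely on two's-complement conversion,
 * which mainstream compilers for Cortex-M provide. */
static inline int16_t word_left(uint32_t w)
{
    return (int16_t)(uint16_t)(w & 0xFFFFU);
}

static inline int16_t word_right(uint32_t w)
{
    return (int16_t)(uint16_t)(w >> 16);
}

static inline uint32_t word_pack(int16_t left, int16_t right)
{
    return ((uint32_t)(uint16_t)right << 16) | (uint32_t)(uint16_t)left;
}

/* Placeholder DSP step (hypothetical): halve the amplitude.
 * A real application would call its filter here instead. */
static int16_t process_sample(int16_t x)
{
    return (int16_t)(x / 2);
}

/* Process one received stereo word and return the word to transmit. */
static uint32_t process_word(uint32_t rx_word)
{
    int16_t l = process_sample(word_left(rx_word));
    int16_t r = process_sample(word_right(rx_word));
    return word_pack(l, r);
}
```

Keeping the packing helpers separate from the DSP step means the same `process_sample` can later be reused for block processing on the int16_t ping-pong buffers.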

I hope you or someone on the forum can point me in the right direction. 

0 Kudos

2,594 Views
Sabina_Bruce
NXP Employee

Hello Mark,

Hope you are doing well.

To get started with the I2S API, I would have recommended the examples that you are already working with. We do not have a guide on how to begin making changes according to different needs. Based on your comments in your first post and this last one, I believe you should work with ping-pong buffers for your data. The reason is that if you wait for Tx to finish before enabling Rx, and vice versa, you will have many delays and the time will not be used efficiently. By implementing a sort of ping-pong logic, however, you could be receiving or transferring data while working on the data in the background. Although we don't have an example of this using I2S, you can find the principle of it in the multicore examples, which use this method. You can refer to these application notes, which describe a bit of this topic using the Kinetis; the concepts are what I believe you will find useful.

An I2S (Inter-IC Sound Bus) Application on Kinetis

Audio Output Options for Kinetis

Using Synchronous Audio Interface (SAI) on S32K148

This thread has some example code attached that you may also find useful for your application:

https://community.nxp.com/thread/81904
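The idea of working on one block in the background while the driver moves the next can be sketched as a simple flag handshake between a completion callback and the main loop. This is only an illustration under stated assumptions: a single outstanding block, a pass-through copy standing in for the DSP work, and hypothetical names (`on_block_complete`, `process_if_ready`) that are not SDK functions. Variables shared between interrupt context and the main loop are declared volatile so the compiler re-reads them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSAMPS 50  /* stereo frames per buffer, as in the listing above */

/* Shared between the callback (interrupt context) and the main loop. */
static volatile bool block_ready = false;
static int16_t *volatile rx_block;  /* filled buffer to read from  */
static int16_t *volatile tx_block;  /* free buffer to write into   */

/* Hypothetical: call from the Rx completion callback. Keep it short. */
static void on_block_complete(int16_t *rx_done, int16_t *tx_free)
{
    rx_block    = rx_done;
    tx_block    = tx_free;
    block_ready = true;  /* publish last, after the pointers are set */
}

/* Hypothetical: call repeatedly from while (1).
 * Returns true if a block was processed on this call. */
static bool process_if_ready(void)
{
    if (!block_ready) {
        return false;
    }
    int16_t *in  = rx_block;
    int16_t *out = tx_block;
    block_ready  = false;  /* clear before the work so a new event is seen */

    /* Placeholder DSP: copy interleaved L/R samples straight through. */
    for (size_t n = 0; n < 2U * NSAMPS; n++) {
        out[n] = in[n];
    }
    return true;
}
```

The processing must finish within one buffer period (about 1.04 ms for 50 frames at 48 kHz), otherwise the next completion overwrites the pointers before the work is done.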

Best Regards,

Sabina
