Hello,
I use the PowerQuad private RAM on the LPCXpresso55S69 development board to accelerate computation.
According to the documentation, the PowerQuad private RAM is 16 kB in size, but I have a problem addressing the upper half. It seems that data in the lower half of the private RAM is overwritten when the upper half is used.
I have attached a sample project in which a write to the upper half of the private RAM causes data to be overwritten. When the upper half is not accessed, the example works OK.
/*******************************************************************************
* Definitions
******************************************************************************/
#define DEMO_POWERQUAD POWERQUAD
#define EXAMPLE_ASSERT_TRUE(x)            \
    if (!(x))                             \
    {                                     \
        PRINTF("%s error\r\n", __func__); \
        while (1)                         \
        {                                 \
        }                                 \
    }
#define EXAMPLE_FIR_DATA_LEN 256
#define EXAMPLE_FIR_TAP_LEN 256
/*
* Power Quad driver uses the first 4K private RAM, the RAM starts from 0xE0001000
* could be used for other purpose.
*/
#define EXAMPLE_PRIVATE_RAM_1 ((float *)0xE0001000)
#define EXAMPLE_PRIVATE_RAM_3 ((float *)0xE0003000)
/* Float FIR */
static void PQ_FIRFloatExample(void)
{
    uint32_t i;
    pq_config_t pqConfig;
    PQ_GetDefaultConfig(&pqConfig);
    PQ_SetConfig(DEMO_POWERQUAD, &pqConfig);
    // Copy the first table into private RAM starting at the 4 kB offset; it is read back and compared below
    PQ_MatrixScale(DEMO_POWERQUAD, POWERQUAD_MAKE_MATRIX_LEN(16, EXAMPLE_FIR_TAP_LEN / 16, 0), 1.0f, s_firTaps,
                   EXAMPLE_PRIVATE_RAM_1);
    PQ_WaitDone(DEMO_POWERQUAD);
    // Copy a second table into private RAM starting at the 12 kB offset
    // If this block is commented out, the example works and nothing is overwritten
    PQ_MatrixScale(DEMO_POWERQUAD, POWERQUAD_MAKE_MATRIX_LEN(16, EXAMPLE_FIR_TAP_LEN / 16, 0), 1.0f, s_firOutputRef,
                   EXAMPLE_PRIVATE_RAM_3);
    PQ_WaitDone(DEMO_POWERQUAD);
    // Copy the first table back from private RAM at the 4 kB offset
    PQ_MatrixScale(DEMO_POWERQUAD, POWERQUAD_MAKE_MATRIX_LEN(16, EXAMPLE_FIR_TAP_LEN / 16, 0), 1.0f, EXAMPLE_PRIVATE_RAM_1,
                   PowerQuadOutput);
    PQ_WaitDone(DEMO_POWERQUAD);
    // The data copied back to main memory comes from the second table, not the first, so the assert fails
    for (i = 0; i < EXAMPLE_FIR_DATA_LEN; i++)
    {
        EXAMPLE_ASSERT_TRUE(fabs(PowerQuadOutput[i] - s_firTaps[i]) < 0.00001);
    }
}
Could you please check this and help with the problem?
Best regards,
Mateusz
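For reference, the address layout discussed in this post can be sketched as a small host-side C helper. This is only an illustration: the window 0xE0000000 to 0xE0003FFF and the 4 kB block size are taken from this thread, and the names PQ_PRIVATE_RAM_BASE, PQ_PRIVATE_RAM_SIZE, PQ_PRIVATE_BLOCK_SIZE and pq_private_block are hypothetical, not part of the SDK.

```c
#include <stdint.h>

/* Values as quoted in this thread, not checked against the reference manual. */
#define PQ_PRIVATE_RAM_BASE   0xE0000000u
#define PQ_PRIVATE_RAM_SIZE   0x4000u /* 16 kB */
#define PQ_PRIVATE_BLOCK_SIZE 0x1000u /* 4 kB */

/* Hypothetical helper: returns the index (0..3) of the 4 kB block that
 * contains addr, or -1 if addr lies outside the private RAM window. */
static int pq_private_block(uint32_t addr)
{
    if (addr < PQ_PRIVATE_RAM_BASE || (addr - PQ_PRIVATE_RAM_BASE) >= PQ_PRIVATE_RAM_SIZE)
    {
        return -1;
    }
    return (int)((addr - PQ_PRIVATE_RAM_BASE) / PQ_PRIVATE_BLOCK_SIZE);
}
```

Under these assumptions, EXAMPLE_PRIVATE_RAM_1 (0xE0001000) lies in block 1, in the lower half, and EXAMPLE_PRIVATE_RAM_3 (0xE0003000) lies in block 3, in the upper half.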
Hi xiangjun_rong,
I modified the SDK example "lpcxpresso55s69_powerquad_fir_fast" to demonstrate the problem. I applied every suggestion that you provided:
1. TMPBASE is in shared memory:
pqConfig.tmpBase = (uint32_t *) PowerQuadTemp;
PQ_SetConfig(DEMO_POWERQUAD, &pqConfig);
2. I checked that SRAM4 is not used.
Linker output:
" SRAM_4: 0 GB 16 KB 0.00%"
Map file:
SRAM_4 0x20040000 0x00004000 xrw
3. Only user-accessible addresses are used: 0xE000_3000 and 0xE000_1000.
// The first 4K is not used because it is reserved for PowerQuad
#define EXAMPLE_PRIVATE_RAM_0 ((float *)0xE0000000)
// Memory that can be used by the user, divided into 4K blocks
#define EXAMPLE_PRIVATE_RAM_1 ((float *)0xE0001000)
#define EXAMPLE_PRIVATE_RAM_2 ((float *)0xE0002000)
#define EXAMPLE_PRIVATE_RAM_3 ((float *)0xE0003000)
// Copy the first data table into private RAM starting at the 4 kB offset; it is read back and compared later
PQ_MatrixScale(DEMO_POWERQUAD, POWERQUAD_MAKE_MATRIX_LEN(16, EXAMPLE_DATA_LEN / 16, 0), 1.0f, s_firstData,
               EXAMPLE_PRIVATE_RAM_1);
PQ_WaitDone(DEMO_POWERQUAD);
// Copy the second data table into private RAM starting at the 12 kB offset
// If this block is commented out, the example works and nothing is overwritten
PQ_MatrixScale(DEMO_POWERQUAD, POWERQUAD_MAKE_MATRIX_LEN(16, EXAMPLE_DATA_LEN / 16, 0), 1.0f, s_secondData,
               EXAMPLE_PRIVATE_RAM_3);
PQ_WaitDone(DEMO_POWERQUAD);
// Copy the first data table back from private RAM at the 4 kB offset
PQ_MatrixScale(DEMO_POWERQUAD, POWERQUAD_MAKE_MATRIX_LEN(16, EXAMPLE_DATA_LEN / 16, 0), 1.0f, EXAMPLE_PRIVATE_RAM_1,
               PowerQuadOutput);
PQ_WaitDone(DEMO_POWERQUAD);
The example moves two different tables into two separate regions of the PowerQuad private memory, EXAMPLE_PRIVATE_RAM_1 and EXAMPLE_PRIVATE_RAM_3. It then moves the first table back from EXAMPLE_PRIVATE_RAM_1 to another shared buffer and checks that it matches what was saved.
The example works if the use of "EXAMPLE_PRIVATE_RAM_3" is commented out: the tables match. If the second table is also moved to EXAMPLE_PRIVATE_RAM_3, the comparison fails.
If you have any more questions, please ask.
I have attached the project; maybe it will help to investigate the problem.
Best Regards,
Mateusz Litwin
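To narrow down what is read back, a comparison helper along these lines might be useful. It is a minimal host-side sketch, and first_mismatch is a hypothetical name, not an SDK function.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the first element where a and b
 * differ by tol or more (a NaN difference also counts as a mismatch),
 * or -1 if all len elements agree within tol. */
static long first_mismatch(const float *a, const float *b, size_t len, float tol)
{
    for (size_t i = 0; i < len; i++)
    {
        float diff = a[i] - b[i];
        if (diff < 0.0f)
        {
            diff = -diff;
        }
        if (!(diff < tol))
        {
            return (long)i;
        }
    }
    return -1;
}
```

Calling it once with PowerQuadOutput and s_firstData and once with PowerQuadOutput and s_secondData would show which of the two tables actually came back from EXAMPLE_PRIVATE_RAM_1.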
Hi,
I think there are four base addresses that define the memory allocation: OUTBASE, INABASE, INBBASE and TMPBASE. They are configured explicitly by the application code, and they must not overlap. In PQ_GetDefaultConfig(), TMPBASE is set to 0xE000_0000, so the user cannot use that space.
Hope this helps.
BR
Xiangjun Rong
void PQ_GetDefaultConfig(pq_config_t *config)
{
    config->inputAFormat = kPQ_Float;
    config->inputAPrescale = 0;
    config->inputBFormat = kPQ_Float;
    config->inputBPrescale = 0;
    config->outputFormat = kPQ_Float;
    config->outputPrescale = 0;
    config->tmpFormat = kPQ_Float;
    config->tmpPrescale = 0;
    config->machineFormat = kPQ_Float;
    config->tmpBase = (uint32_t *)0xE0000000U;
}
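The "must not overlap" rule can be checked on paper with a small helper like the one below. It is only a sketch: pq_region_t and pq_regions_overlap are hypothetical names, and the sizes you pass in are your own assumptions about how much memory each base address covers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one PowerQuad buffer: start address and size in bytes.
 * The size is assumed to be greater than zero. */
typedef struct
{
    uint32_t base;
    uint32_t size;
} pq_region_t;

/* Returns true if the half-open ranges [base, base + size) of a and b share any byte. */
static bool pq_regions_overlap(pq_region_t a, pq_region_t b)
{
    /* 64-bit end addresses avoid wrap-around near the top of the 32-bit address space. */
    uint64_t a_end = (uint64_t)a.base + a.size;
    uint64_t b_end = (uint64_t)b.base + b.size;
    return (a.base < b_end) && (b.base < a_end);
}
```

For example, if TMPBASE is assumed to cover 4 kB from 0xE0000000, and each table is 256 floats (1024 bytes) at 0xE0001000 and 0xE0003000, this check reports no overlap between any pair of them.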
Hi,
1. I checked the SDK example "lpcxpresso55s69_powerquad_fir_fast". It uses the memory at 0xE0001000 to accelerate computation while TMPBASE is set to 0xE0000000.
/*
* Power Quad driver uses the first 4K private RAM, the RAM starts from 0xE0001000
* could be used for other purpose.
*/
#define EXAMPLE_PRIVATE_RAM ((void *)0xE0001000)
...
/*
* Fast method
*
* The input data B is convert and saved to private RAM, thus the PQ could
* fetch data through two path. The input data B is converted to float format
* and saved to private ram.
*/
pqConfig.inputAFormat = kPQ_Float;
pqConfig.inputAPrescale = 0;
pqConfig.inputBFormat = kPQ_Float;
pqConfig.inputBPrescale = 0;
pqConfig.outputFormat = kPQ_Float;
pqConfig.outputPrescale = 0;
pqConfig.tmpFormat = kPQ_Float;
pqConfig.tmpPrescale = 0;
pqConfig.machineFormat = kPQ_Float;
pqConfig.tmpBase = (uint32_t *)0xE0000000;
PQ_SetConfig(DEMO_POWERQUAD, &pqConfig);
PQ_MatrixScale(DEMO_POWERQUAD, POWERQUAD_MAKE_MATRIX_LEN(16, EXAMPLE_FIR_TAP_LEN / 16, 0), 1.0, tap,
EXAMPLE_PRIVATE_RAM);
PQ_WaitDone(POWERQUAD);
2. The comment in this example also indicates that only the first 4K of private RAM is used by TMPBASE and the rest can be used by the user.
3. If I use only the private memory at 0xE0001000, the example works OK even though TMPBASE is set to 0xE0000000, but if I also use 0xE0003000 it breaks.
4. I also tested moving TMPBASE to a shared memory buffer, and this does not help.
Best Regards,
Mateusz Litwin
Hi, Mateo
The AE team said that the address space 0xE000_0000 to 0xE000_3FFF shares the same memory cells with SRAM4 at 0x2004_0000 to 0x2004_3FFF. Please check whether you use SRAM4.
BR
Xiangjun Rong
Hi,
I tried filling the private memory from 0xE000_0000 to 0xE000_3FFF with 0xFFFF_FFFF. After the operation finished, I read the private memory back and could not read the values correctly.
If I fill SRAM4, I can write and read correctly.
In conclusion, the private memory cannot be used to store application data.
In AN12383, the private memory from 0xE000_0000 to 0xE000_3FFF is used for temporary data only, and it is assigned to the PowerQuad->TMPBASE register.
BTW, the private memory is mentioned in neither the reference manual nor the data sheet. Please do not use this memory unless it is used as TMPBASE.
Hope it can help you
BR
XiangJun Rong
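The fill and read-back check described above could look roughly like the sketch below. fill_and_verify is a hypothetical helper name; the code itself is plain C and can be tried on an ordinary buffer first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes pattern to every 32-bit word in [base, base + words),
 * then reads the words back and returns how many did not match the pattern. */
static size_t fill_and_verify(volatile uint32_t *base, size_t words, uint32_t pattern)
{
    size_t mismatches = 0;

    for (size_t i = 0; i < words; i++)
    {
        base[i] = pattern;
    }
    for (size_t i = 0; i < words; i++)
    {
        if (base[i] != pattern)
        {
            mismatches++;
        }
    }
    return mismatches;
}
```

On the target, the SRAM4 case from this thread would correspond to passing (volatile uint32_t *)0x20040000 with 0x4000 / 4 words and the pattern 0xFFFFFFFF. Whether the same call is meaningful for 0xE0000000 from the CPU side is exactly the open question here, so treat that use as an assumption.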
1. PowerQuad maps the private memory address space 0xE000_0000 to 0xE000_3FFF onto the SRAM4 address space 0x2004_0000 to 0x2004_3FFF and uses the SRAM in an interleaved way. I checked that the data is saved correctly in SRAM4. I have added screenshots below.
If you try to access the 0xE000_0000 to 0xE000_3FFF address space from the ARM core, you are using a different peripheral.
2. I have added a project that shows what the problem is and works correctly.
3. https://community.nxp.com/t5/LPC-Microcontrollers/Use-Powerquad-Private-RAM/td-p/1069956 states: "We have powerquad_fir_fast SDK demo include powerquad private ram operation that we recommend. Please check it." So I can use this memory, as that other answer points out. I use exactly the matrix scale operation to access the private PowerQuad RAM.
4. What is the point of articles like https://community.nxp.com/pwmxy87654/attachments/pwmxy87654/tech-days/319/1/AMF-SMH-T3513_Hands-on_3... then? This is a lab from NXP showing that FIR with use of the private RAM is faster.
EDIT:
- This lab from NXP indicates that the private RAM can be used to improve FIR, convolve and correlate operations. It explicitly points to placing input data B there. The private RAM can be used for more than TMPBASE.
5. I can point to many application notes, such as https://www.nxp.com/docs/en/application-note/AN12383.pdf or https://www.nxp.com/docs/en/application-note/AN12282.pdf, which state that the private RAM of PowerQuad can be used by the user. What is the point of sharing this information if the memory should not be used by the user?
Hi xiangjun_rong,
As SRAM4 is explicitly reserved for PowerQuad, I don't use this memory.
I can prepare and provide a modified SDK example "lpcxpresso55s69_powerquad_fir_fast", which causes problems on my board, to fully investigate this if needed.
Best Regards,
Mateusz Litwin
Also, AN12282 and AN12383 state that TMPBASE is used only for FFT or matrix inversion.
I don't use either of these operations, so the TEMP area should not be used.
Hi,
Thanks for the response. I took this information from several places; examples are below:
- AN12383
- AN12282
- SDK example "lpcxpresso55s69_powerquad_fir_fast"
/*
* Power Quad driver uses the first 4K private RAM, the RAM starts from 0xE0001000
* could be used for other purpose.
*/
#define EXAMPLE_PRIVATE_RAM ((void *)0xE0001000)
I use this memory for performance reasons and it works very well up to the 8 KB boundary (0xE0002000), but these notes say that it should be possible to use more of the PowerQuad private memory.
If more information is needed to investigate this, please ask.
Best Regards,
Mateusz Litwin
Hi,
Unfortunately, I have not found where the memory at address 0xE000_3000 or 0xE000_1000 is documented. Can you show any documentation which tells us that the user can use this memory as private memory?
I have checked UM11126.pdf and have not seen it there.
If you declare an array, for example resultArray[], and call the scale function with it, what is the result?
float resultArray[N];
static void PQ_FIRFloatExample(void)
{
    uint32_t i;
    pq_config_t pqConfig;
    PQ_GetDefaultConfig(&pqConfig);
    PQ_SetConfig(DEMO_POWERQUAD, &pqConfig);
    // Write the scaled data to an ordinary array in main memory instead of private RAM
    PQ_MatrixScale(DEMO_POWERQUAD, POWERQUAD_MAKE_MATRIX_LEN(16, EXAMPLE_FIR_TAP_LEN / 16, 0), 1.0f, s_firTaps,
                   resultArray);
    PQ_WaitDone(DEMO_POWERQUAD);
}
Sorry if I misunderstood you.
BR
XiangJun Rong