Hi
I did some speed tests using the "FLEXSPI hyperflash example" on my EVK board.
At line 500 of the example I found the comment
/* flexspi clock 332M, DDR mode, internal clock 166M. */
and the following code line
memcpy(s_hyperflash_read_buffer, (void *)(FlexSPI_AMBA_BASE + EXAMPLE_SECTOR * SECTOR_SIZE), sizeof(s_hyperflash_read_buffer));
I put my LED_on() and LED_Off() macros before and after this memcpy call
and started my measurements.
This memcpy (512 bytes from QSPI) takes about 34.8 us, which is very slow for the "fast" 166 MHz QSPI.
I expected the following: 166 MHz for QSPI => 332 MHz effective data rate in DDR mode; 512 bytes of data plus some overhead (QSPI cmd, addr, etc.) => about 2 us!
How can I speed up the communication?
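The 2 us expectation can be checked with a small calculation. This is only a sketch: it assumes an 8-bit HyperBus data path (one byte per clock edge), and `overhead_clocks` is a hypothetical parameter standing in for command, address and latency cycles, whose real count depends on the flash configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Ideal time in nanoseconds for one continuous DDR burst.
 * bus_width_bytes: bytes moved per clock edge (assumed 1 for HyperBus).
 * overhead_clocks: hypothetical command/address/latency clocks. */
uint64_t ideal_transfer_ns(uint64_t bytes, uint64_t clock_hz,
                           uint64_t bus_width_bytes, uint64_t overhead_clocks)
{
    assert(clock_hz > 0 && bus_width_bytes > 0);

    /* DDR moves data on both clock edges. */
    uint64_t bytes_per_clock = 2u * bus_width_bytes;

    /* Round up to whole clocks. */
    uint64_t data_clocks = (bytes + bytes_per_clock - 1u) / bytes_per_clock;
    uint64_t total_clocks = data_clocks + overhead_clocks;

    return total_clocks * 1000000000ull / clock_hz;
}
```

Calling `ideal_transfer_ns(512, 166000000, 1, 0)` gives roughly 1.5 us for the data phase alone, so about 2 us is plausible only if the whole 512 bytes arrive in a single burst.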
When trying to understand the read behaviour from flash, please remember that this is a cached MCU, where a hit in the cache will of course not perform a flash access at all. Any read first checks the cache contents; only if a miss occurs will a read sequence begin on the flash interface.
I suggest taking a look at the User Manual for this part, specifically 'AHB read access to Flash', to better understand what is occurring.
Yours,
MCUXpresso IDE Support
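To make the cache point concrete, here is a rough sketch that counts how many cache lines a read range touches, which is the number of misses on a cold cache. It assumes the 32-byte data cache line of the Cortex-M7 and that every miss triggers its own line fill from flash; whether those fills are served from a prefetch buffer or each start a new flash read sequence is not modelled. The function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of cache lines covered by the byte range [addr, addr + len).
 * On a cold cache this equals the number of line fills needed. */
uint32_t lines_touched(uint32_t addr, uint32_t len, uint32_t line_size)
{
    /* line_size must be a non-zero power of two. */
    assert(line_size > 0 && (line_size & (line_size - 1u)) == 0);

    if (len == 0) {
        return 0;
    }

    uint32_t first = addr / line_size;
    uint32_t last = (addr + len - 1u) / line_size;
    return last - first + 1u;
}
```

For a line-aligned 512-byte read with 32-byte lines this gives 16 separate fills instead of one long burst; an unaligned start adds one more.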
Hello,
I hope someone can help me understand the following:
In the hyperflash examples, is it correct that reading is done by pointer dereference (like memcpy), but writing is done through a specific driver (dereference can't be used for writing)?
Thank you,
ranran
hi
I checked the clock settings:
CLOCK_InitUsb1Pfd(kCLOCK_Pfd0, 26); /* Set PLL3 PFD0 clock 332MHZ. */
CLOCK_DisableClock(EXAMPLE_FLEXSPI_CLOCK);
CLOCK_SetDiv(kCLOCK_FlexspiDiv, 0); /* flexspi clock 332M, DDR mode, internal clock 166M. */
CLOCK_EnableClock(EXAMPLE_FLEXSPI_CLOCK);
FLEXSPI_Enable(EXAMPLE_FLEXSPI, true);
i = CLOCK_GetFreq(kCLOCK_Usb1PllPfd0Clk);
PRINTF("PFD Clock=%lu \r\n", i);
i = CLOCK_GetFreq(kCLOCK_Usb1PllPfd0Clk) / (CLOCK_GetDiv(kCLOCK_FlexspiDiv) + 1U);
PRINTF("With Div=%lu \r\n", i);
=> Output from the example project
FLEXSPI hyperflash example started!
Entering the ASO mode
Found the HyperFlash by CFI
Erasing Serial NOR over FlexSPI...
Erase data - successfully.
PFD Clock=332307684
With Div=332307684
Program data - successfully.
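The printed value 332307684 can be reproduced by hand. The sketch below assumes PLL3 runs at 480 MHz and that the PFD output is computed as PLL / frac * 18 with integer division in that order; both are assumptions, chosen because they match the output above. The helper names are hypothetical and are not SDK functions.

```c
#include <assert.h>
#include <stdint.h>

/* PFD output frequency, assuming f = pll_hz / frac * 18 (integer math). */
uint32_t pfd_freq_hz(uint32_t pll_hz, uint32_t frac)
{
    assert(frac > 0);
    return pll_hz / frac * 18u;
}

/* FlexSPI root clock: the divider field holds (divisor - 1),
 * as in the "+ 1U" of the code above. */
uint32_t flexspi_root_hz(uint32_t pfd_hz, uint32_t div_field)
{
    return pfd_hz / (div_field + 1u);
}
```

With `frac = 26` and a divider field of 0 both functions return 332307684, which agrees with the "PFD Clock" and "With Div" lines, so the root clock is set as the comment claims.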
hi,
what do you mean by
1.) "additional delays happen"
--------------------------------------
The code is in ITCM. I also checked the code with IDA (Interactive Disassembler): there are no "delays". memcpy is in ITCM as well.
Here is the asm code:
MENK_Run:
ER_m_text:000034FE LDR R0, =0x401B8000
ER_m_text:00003500 LDR R0, [R0]
ER_m_text:00003502 BIC.W R0, R0, #0x400
ER_m_text:00003506 LDR R1, =0x401B8000
ER_m_text:00003508 STR R0, [R1]
ER_m_text:0000350A ASRS R2, R1, #0x15
ER_m_text:0000350C LDR R1, =0x61940000
ER_m_text:0000350E LDR R0, =menk_read_buffer
ER_m_text:00003510 BL __aeabi_memcpy8
ER_m_text:00003514 LDR R0, =0x401B8000
ER_m_text:00003516 LDR R0, [R0]
ER_m_text:00003518 ORR.W R0, R0, #0x400
ER_m_text:0000351C LDR R1, =0x401B8000
ER_m_text:0000351E STR R0, [R1]
__aeabi_memcpy8:
ER_m_text:00000694 EXPORT __aeabi_memcpy8
ER_m_text:00000694 __aeabi_memcpy8
ER_m_text:00000694 .............
There is no delay.
2.) "and used flexspi clock frequency"
------------------------------------------------
" Here the asm code
There is no delay, "
Actually, these asm instructions do consume some processor time and introduce delays;
they can be observed with an oscilloscope.
It may also be useful to check the clocks using sect. 2.3, "FlexRAM module-related clocks and clock gates", of
AN12077 Using the i.MX RT FlexRAM
hi,
The code (example code from NXP) is running in ITCM and needs 9.4 us for 512 bytes:
/* flexspi clock 332M, DDR mode, internal clock 166M. */
memcpy(s_hyperflash_read_buffer, (void *)(FlexSPI_AMBA_BASE + EXAMPLE_SECTOR * SECTOR_SIZE), sizeof(s_hyperflash_read_buffer));
memcpy uses the ldmia and stmia asm instructions => ca. 128 x (load + store + sub + cmp + branch) => 128 * 5 instructions => 640 instructions.
The CPU is running at 600 MHz.
So you think these 640 instructions need more than 7 us?
Not really: 640 instructions => ~1 us, not 7 us.
I am using the example project that says 332M DDR mode.
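The estimate above can be written out as a short calculation. This is a sketch under the simplifying assumption of one instruction per CPU cycle with no stalls, which a real Cortex-M7 will not hit exactly; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Time for a number of instructions, assuming one instruction per cycle. */
uint64_t instr_time_ns(uint64_t instructions, uint64_t cpu_hz)
{
    assert(cpu_hz > 0);
    return instructions * 1000000000ull / cpu_hz;
}

/* Effective throughput in MB/s (1 MB = 10^6 bytes), rounded down. */
uint64_t effective_mbytes_per_s(uint64_t bytes, uint64_t time_ns)
{
    assert(time_ns > 0);
    return bytes * 1000ull / time_ns;
}
```

For 640 instructions at 600 MHz this gives about 1.07 us, so the CPU work alone does not account for 9.4 us. The measured 9.4 us corresponds to roughly 54 MB/s and the earlier 34.8 us to roughly 14 MB/s, both well below what a 166 MHz DDR bus could deliver.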
Hi Christian
For better results it is recommended not to use debug mode and to run the image
from ITCM.
Best regards
igor
Hi
I tried this already: I moved the important copy code to a separate C file
void Testcopy(void)
{
    __disable_irq();
    DEBUGLED_On();
    memcpy(menk_read_buffer, (void *)(FlexSPI_AMBA_BASE + 101 * SECTOR_SIZE), sizeof(menk_read_buffer));
    DEBUGLED_Off();
    __enable_irq();
}
and made changes to the scatter file.
The Testcopy function is in ITCM, my test buffer is in on-chip RAM, etc.
I checked the map file.
So the question is: has anyone checked the speed at 332 MHz?
What are the right settings?
Do you have example code that copies 512 bytes in 2 us, as expected?
You can check with an oscilloscope where the additional delays occur
and which FlexSPI clock frequency is actually used.
Best regards
igor