What is the correct QSPI Endianness for K8x?

deniscollis
Contributor V

The MCU reference manual states that the default endianness for the QSPI peripheral is 64-bit little-endian.   However, with the QSPI peripheral configured for 64-bit little-endian, a core hard-fault condition ensues when attempting to execute code in sections located in external QSPI flash.  But the code runs perfectly with 32-bit little-endian configured. 

I doubt that this is the correct behavior.  Any ideas?

PS. I need to add that there is no bootloader.  The flash is programmed directly, using J-Link.

1 Solution
deniscollis
Contributor V

Segger have responded with the following:

The algorithm when programming the QSPI flash sets the MCR as follows:

QSPI->MCR = 0
 | (1uL << 0)    // Reset flash controller
 | (1uL << 1)    // Reset AHB controller
 | (1uL << 2)    // 32 bit LE
 | (1uL << 7)    // 2x and 4x clocks are enabled; supports both SDR and DDR instructions
 | (0uL << 14)   // Do not allow external logic to disable QSPI
 | (1uL << 10)   // Clear RX FIFO
 | (1uL << 11)   // Clear TX FIFO
 | (0xFuL << 16)
 ;

This is only done when programming the flash.
This will not survive a reset.

Note that the QSPI is set to 32-bit Little-endian (line 4).

This is not a problem if external flash is used exclusively for XiP, where the QSPI should be initialized as 32-bit LE.  It is also not a problem if the flash is used exclusively for data storage, where the QSPI should be initialized as 64-bit LE.  It is a problem when the flash is shared for XiP and data storage:  the fsl_qspi driver will not work correctly at 32-bit LE.  

Here is the solution:

First, conditionally swap word order on data writes.  

/*!
 * @brief Local 64/32 endian-aware version of QSPI_WriteBlocking (in fsl_qspi driver)
 * @note Swaps word order according to QSPI endianness.
 * For 64LE word order is not swapped.
 * For 32LE word order is swapped.
 */
void QSPI_My_WriteBlocking(QuadSPI_Type *base, uint32_t *buffer, size_t size)
{
    assert(size >= 16U);
    assert(size % 8U == 0U);

    uint32_t i = 0;

    uint8_t word_0 = 0, word_1 = 1;
    // XXX: word swap for 32-Bit Little-Endian
    if ((base->MCR & QuadSPI_MCR_END_CFG_MASK) ==
        QuadSPI_MCR_END_CFG(kQSPI_32LittleEndian))
    {
        word_0 = 1, word_1 = 0;
    }

    // XXX: write 64 bits (2 words) at a time
    for (i = 0; i < size / 8U; i++)
    {
        while (QSPI_GetStatusFlags(base) & kQSPI_TxBufferFull)
        {
            // Wait if buffer full
        }
        QSPI_WriteData(base, *(buffer + word_0)); // Add first word to the QSPI TX FIFO
        while (QSPI_GetStatusFlags(base) & kQSPI_TxBufferFull)
        {
            // Wait if buffer full
        }
        QSPI_WriteData(base, *(buffer + word_1)); // Add second word to the QSPI TX FIFO
        buffer += 2;
    }
}

Second, ensure that every LUT sequence that executes a READ instruction (0x07) reads a multiple of 8 bytes from the flash device.

Last, if necessary, ensure that results when reading flash device registers are word-swapped.  This is usually not necessary as most register reads return 1, 2, or 4 byte values, and so if the IP command in the LUT reads 8-bytes, then the register value is repeatedly sent by the device until the read cycles are complete – so the first 32-bits in the RX FIFO will be identical to the second 32-bits.  Be careful when reading results of some chip-specific Read-ID commands, as they return large amounts of data (consult the specific datasheet).
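The second and third steps can be sketched as two small helpers. This is only an illustration under stated assumptions: `qspi_round_up_to_8` and `qspi_swap_word_pairs` are hypothetical names, not functions from the fsl_qspi driver, and the sketch assumes the data has already been copied out of the RX FIFO into an ordinary buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of fsl_qspi): round a byte count up to the
 * next multiple of 8, so a LUT READ always transfers whole 64-bit units. */
static size_t qspi_round_up_to_8(size_t nbytes)
{
    return (nbytes + 7U) & ~(size_t)7U;
}

/* Hypothetical helper: swap each pair of 32-bit words in place.
 * nwords must be even, i.e. the buffer holds whole 64-bit units. */
static void qspi_swap_word_pairs(uint32_t *buf, size_t nwords)
{
    assert(nwords % 2U == 0U);
    for (size_t i = 0; i < nwords; i += 2U)
    {
        uint32_t tmp = buf[i];
        buf[i] = buf[i + 1U];
        buf[i + 1U] = tmp;
    }
}
```

A caller would apply `qspi_swap_word_pairs` to a register-read result only when the QSPI is configured for 32LE, mirroring the condition used in QSPI_My_WriteBlocking above.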

NOTE: Segger have identified this issue and will address the endianness/width setting in a future J-Link software version. 

20 Replies

kerryzhou
NXP TechSupport

Hi Denis,

    Thank you very much for the conclusion.

    If you have any other questions in the future, please let us know.


Have a great day,
Kerry

-----------------------------------------------------------------------------------------------------------------------
Note: If this post answers your question, please click the Correct Answer button. Thank you!
-----------------------------------------------------------------------------------------------------------------------

kerryzhou
NXP TechSupport

Hi Denis,

   This week, my colleague and I worked together on your issue.

   We found the following information, which may be useful to you:

1. Why 64LE does not work but 32LE does.

   If you erase the QSPI flash first, you will find that with 32LE, Flash_FunctionB is programmed correctly into the QSPI flash with J-Link and MCUXpresso IDE, but with 64LE that function is not programmed into your external flash. When testing this, please make sure you erase the relevant flash sector first.

  My colleague looked at the QuadSPI registers at the start of main. At this point they have been configured by the J-Link flashloader, and he found that the END_CFG value is set to 0x1 (32LE). This means the J-Link driver uses 32LE, not 64LE, which is why the SDK code configured for 64LE cannot program the function into the external QSPI flash directly. The symptom is a hard fault. If you erased the flash first, the code was never written correctly to the external flash. If you did not erase it, the code already in the external flash was written in 32LE format, so fetching it with 64LE returns code the core cannot recognize, and it enters the hard fault.

The mismatch in endianness configuration between the flash driver and the SDK example is the root cause of the problem.

2. The 32LE swap problem.

The simplest fix is probably to swap the buffer data before programming it in the sample code. Handling this in the SDK/application buffer is easier to implement because the source code for the SDK is available.

We think the Segger driver also needs to change.

My colleague will contact Segger about this problem, but to get it prioritized, you may also want to contact Segger.

That is our summary for this week.

Thanks a lot for your patience.


Have a great day,
Kerry

deniscollis
Contributor V

Hi Kerry,

How does the J-Link set the QSPI configuration?  Is there anything we can do so that the endianness is set correctly before the J-Link tries to program the QSPI Flash?  (Currently the state of BCA and QCB is factory-default.)

Thank you for sticking with the problem, and finding the cause.  This is greatly appreciated!  I will also contact Segger.

Best,

Denis

deniscollis
Contributor V

There is a scripting mechanism for J-Link that is supported by Eclipse/MCUXpresso.  Is it possible that all that is needed is a J-Link script that sets the QSPI Endian bits in the QuadSPI0->MCR register?

Debug_JLink_Script.jpg

deniscollis
Contributor V

I wrote a JLink Script, which I triggered from the debugger, as above, which sets the endian bits in the MCR register directly before programming the flash.  However, this has no effect because, when I check the MCR register beforehand, it is set to 0xFFFFFFFF.  In other words, directly after the J-Link resets the target, the endian bits are already set to 64LE.

But, if I dump the MCR register as execution starts -- a breakpoint at the first statement in main() -- I see the MCR register has now been changed to 0x000F0084 (32LE).  This is before QSPI has been initialized.

Question: What happens between MCU Reset and QSPI_Init() that changes the MCR register? 
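To make the two register dumps above easier to compare, here is a minimal sketch that extracts the END_CFG field (MCR bits 3:2) from a raw MCR value. The helper names are hypothetical. The encodings 1 = 32LE and 3 = 64LE follow from this thread; the names for 0 and 2 are an assumption taken from the SDK's qspi_endianness_t enum.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MCR_END_CFG_SHIFT 2U
#define MCR_END_CFG_MASK  (0x3UL << MCR_END_CFG_SHIFT)

/* Hypothetical helper: extract END_CFG (bits 3:2) from a raw MCR value. */
static unsigned mcr_end_cfg(uint32_t mcr)
{
    return (unsigned)((mcr & MCR_END_CFG_MASK) >> MCR_END_CFG_SHIFT);
}

/* Hypothetical helper: name the END_CFG encoding.
 * 1 and 3 are confirmed in this thread; 0 and 2 are assumed. */
static const char *end_cfg_name(unsigned cfg)
{
    switch (cfg)
    {
        case 0U: return "64BE"; /* assumption */
        case 1U: return "32LE";
        case 2U: return "32BE"; /* assumption */
        case 3U: return "64LE";
        default: return "invalid";
    }
}
```

With these, `end_cfg_name(mcr_end_cfg(0x000F0084))` gives "32LE" and `end_cfg_name(mcr_end_cfg(0xFFFFFFFF))` gives "64LE", matching the two observations above.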

kerryzhou
NXP TechSupport

Hi Denis Collis,

   About the J-Link script: I also searched for it in the IDE, but I didn't find it.

   I think this is already contained in the J-Link driver. For example, if you download the code using only the J-Link driver software rather than the IDE, you also cannot find the low-level J-Link driver code or script.

  About the relationship between MCU reset and QSPI_Init: have you noticed this behavior? If you download the code, then the first time you enter main, you can see it is 32LE:

pastedImage_1.png

But if you use the software to reset the chip in the IDE:

pastedImage_2.png

This time, you can't see the register data,

pastedImage_3.png

and even if you check the MCR address, you can't see it.

pastedImage_5.png

But if you run the initialization code:

pastedImage_6.png

pastedImage_7.png

So, after reset, the MCR is:

pastedImage_8.png

So, the first time you enter debug mode, the QSPI has been initialized by the J-Link driver, which is why you read it as 32LE in main. But if you reset again, the registers return to their reset state, and you need to initialize the QSPI in code again.

Hope this helps!

Have you contacted Segger? If you get a reply, please let me know.


Have a great day,
Kerry

deniscollis
Contributor V

Hi Kerry,

Here's how to insert a J-Link Script:

Launch_config.jpg 

mod_config.jpg

/*********************************************************************
----------------------------------------------------------------------
File    : NXP_MK8x_QSPI.JLinkScript
Purpose : J-Link target setup file for NXP K81 and K82 QSPI XiP
Author  : Denis Collis, IPS Group Inc.
---------------------------END-OF-HEADER------------------------------
*/

/*********************************************************************
*       AfterResetTarget
*/
//int HandleBeforeFlashProg(void) {
int AfterResetTarget(void) {

    U32 U32_DATA;
    int r;

    JLINK_SYS_Report("J-Link script: NXP_MK8x_QSPI.JLinkScript");
    JLINK_SYS_Report("J-Link script: Halting target");
    JLINK_TARGET_Halt();

    U32_DATA = JLINK_MEM_ReadU32(0x400DA000); // 0x400DA000 = Address of QuadSPI0_MCR Register on K8x MCU
    JLINK_SYS_Report1("J-Link script: Reading QSPI->MCR: ", U32_DATA);

    U32_DATA |= 0x0000000C; // 0x0000000C is 64LE mask
    JLINK_SYS_Report1("J-Link script: Writing QSPI->MCR: ", U32_DATA);
    r = JLINK_MEM_WriteU32(0x400DA000, U32_DATA);
    if (r < 0) {
        JLINK_SYS_Report("J-Link script: !!Unable to set QuadSPI_MCR to 64bit Little-endian!!");
    }
    else {
        JLINK_SYS_Report("J-Link script: QuadSPI_MCR set to 64bit Little-endian");
    }
    JLINK_SYS_Report("J-Link script: Resetting target");
    JLINK_TIF_ActivateTargetReset();
    JLINK_TIF_ReleaseTargetReset();
    return r;
}
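
The script sets 64LE by OR-ing in 0x0000000C, which works because 64LE has both END_CFG bits set. A more general field update would clear the field first; a plain OR cannot, for example, go from 64LE back to 32LE. The sketch below shows that read-modify-write in C. `mcr_with_end_cfg` is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MCR_END_CFG_SHIFT 2U
#define MCR_END_CFG_MASK  (0x3UL << MCR_END_CFG_SHIFT)

/* Hypothetical helper: return mcr with END_CFG (bits 3:2) replaced by cfg,
 * leaving every other bit untouched. */
static uint32_t mcr_with_end_cfg(uint32_t mcr, uint32_t cfg)
{
    mcr &= ~(uint32_t)MCR_END_CFG_MASK;                           /* clear the field */
    mcr |= (uint32_t)((cfg << MCR_END_CFG_SHIFT) & MCR_END_CFG_MASK); /* set new value */
    return mcr;
}
```

For the value seen at main(), `mcr_with_end_cfg(0x000F0084, 3)` yields 0x000F008C (64LE), and `mcr_with_end_cfg(0x000F008C, 1)` yields 0x000F0084 again.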

kerryzhou
NXP TechSupport

Hi Denis,

    Thanks for sharing; now I know where to add the script.

   But from our test results, the 32LE setting seems to be applied inside the J-Link driver rather than by the MCUXpresso IDE, which is why we can't find it. We don't have the Segger J-Link driver code, so this needs to be checked on the Segger side. Have you had any feedback from Segger?


Have a great day,
Kerry

deniscollis
Contributor V

Hi Kerry,

You are right. The script has no effect, and so I think the JLink firmware needs to be fixed.   I have not yet had any feedback from Segger.  In the meantime I have ordered the P&E Micro USB Multilink Debug Probe, which I think is more commonly used with Kinetis MCUs.  This should verify the problem.

Thanks,

Denis

kerryzhou
NXP TechSupport

Hi Denis,

    Thanks for the update.

   We also haven't had any feedback from Segger. If I get any updates on my side, I will let you know ASAP.

  If you get any information, please let me know as well!


Have a great day,
Kerry

deniscollis
Contributor V

Hi Kerry,

I have had a response from Segger, who have escalated the support issue.  However, I tried some new scenarios that indicate that this is not a J-Link problem, and have notified Segger.

I loaded the DAPLink debug application onto the OpenSDA circuit of a FRDM-K82F.  Now, when I download and debug using MCUXpresso's LinkServer, I get the same word-swap problem that I had before with the J-Link.

This issue is delaying my project.  It may be a while until we know what's really going on, so in the meantime I will write a work-around.

Have a great weekend,

Denis

kerryzhou
NXP TechSupport

Hi Denis,

   Today I tested your project and our own SDK project on the FRDM-K82 board, and I see the same problem as you.

   1. kQSPI_64LittleEndian

   I defined a user function in the QSPI flash area, or in the corresponding aliased area from 0x0400_0000 to 0x07FF_FFFF.

   If the function is not called in debug mode, there is no problem, and the corresponding address does contain the function data. But if I call it in main, the chip resets. I relocated the function to the aliased area just to make sure the code can be debugged, because debug access is available only in the aliased QuadSPI memory region.

  64-bit LE writes have no problem; there is no transposition of 32-bit word pairs.

pastedImage_1.png

pastedImage_1.png

pastedImage_1.png

The printf output also shows that the data read back is the same as the data written.

But after calling the function located at 0x68020000:

__TEXT(Flash3) //Flash3 //Flash5
int Flash_FunctionB(int myvar)
{
    PRINTF("output: %d\n", myvar);
    myvar++;
    PRINTF("output: %d\n", myvar);
    return myvar;
}

 Flash_FunctionB(0x1);

The code resets. This is the printf result:

pastedImage_1.png

   2. kQSPI_32LittleEndian

    The same result as yours.

    Calling the function works whether it is located in the QSPI area or the aliased area; there is no reset or hard fault.

   But after the write, the result really does have the 32-bit word-pair transposition problem.

pastedImage_2.png

pastedImage_1.png

This is the printf result with 32-bit LE:

pastedImage_2.png

Function Flash_FunctionB, located at 0x68020000, can be called and works OK.

But the data read out is not the same as the data written. It really does have the 32-bit swap problem.

So I can now reproduce your problem on my side, but I still need more time to find the details.

Please give me more time; I will let you know of any updates on my side.

Thanks a lot for your understanding.


Have a great day,
Kerry

deniscollis
Contributor V

Hi Kerry,

Thanks for confirming the issue.  On my side I have changed the way I initialize the QSPI. I now use the fsl_qspi driver functions – QSPI_Init() and QSPI_SetFlashConfig() – instead of directly setting the registers.  However, the problem remains.

I then started a new MCUXpresso project using a different NXP source:   demo_apps_and_examples/driver_examples/qspi/polling_transfer.   As before, reading/writing data from/to flash worked perfectly with kQSPI_64LittleEndian, but XiP execution failed with a core hard-fault.  And XiP worked perfectly with  kQSPI_32LittleEndian, but reading back previously written data had 32-bit word pairs swapped.

If someone can confirm that this is indeed a known issue, then I am happy to implement a work-around.

Thanks,

Denis

kerryzhou
NXP TechSupport

Hi Denis,

Have you tried our SDK sample code for K81 first?
Our sample code also uses 64-bit little-endian,
so I strongly suggest you try our official SDK code for K81 first.
The SDK code can be downloaded from this link:
https://mcuxpresso.nxp.com
If you still have questions about it, please let me know.


Have a great day,
Kerry

deniscollis
Contributor V

Hi Kerry,

I am using MCUXpresso with the official SDK.  I am also using the QSPI flash module  found in K81POSCR_MR2_20180427/pos/modules/common/

qspi_flash_module.c

qspi_flash_module.h

qspi_instructions.h

deniscollis
Contributor V

Just adding this from my tech support request, which may help others understand the problem:

  1. Hardware: K81 MCU part, and QSPI Flash part (either Micron MT25QL128 or Cypress S25FL128S)
  2. External flash is used for both XiP code, and raw data (file system).
  3. Parts are programmed from MCUXpresso via Segger J-Link.
  4. Data writes to flash are via Peripheral Bus Tx FIFO
  5. Data reads from flash are memory-mapped via AHBus
  6. 64-bit Little-Endian is default for QSPI peripheral on K81

With QSPI0->MCR |= kQSPI_64LittleEndian << 2 :

Data writes and reads match, but code execution ends in core hard fault.


With QSPI0->MCR |= kQSPI_32LittleEndian << 2 :

Code execution succeeds, but data writes and reads have 32bit word-pairs transposed.
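
One way to tell the two symptoms apart in a test is to classify a read-back buffer against what was written. This is a minimal sketch under stated assumptions: `compare_readback` and the `cmp_result_t` values are hypothetical names, and both buffers are assumed to already be in RAM (for example, copied from the memory-mapped AHB region).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum
{
    CMP_IDENTICAL,          /* read-back matches what was written */
    CMP_WORD_PAIRS_SWAPPED, /* each pair of 32-bit words is transposed */
    CMP_DIFFERENT           /* neither of the above */
} cmp_result_t;

/* Hypothetical helper: compare written data against read-back data. */
static cmp_result_t compare_readback(const uint32_t *written,
                                     const uint32_t *read_back,
                                     size_t nwords)
{
    bool identical = true;
    bool swapped = (nwords % 2U == 0U); /* swap check needs whole pairs */

    for (size_t i = 0; i < nwords; i++)
    {
        if (written[i] != read_back[i])
        {
            identical = false;
        }
        /* i ^ 1 is the index of the other word in the same 64-bit pair. */
        if (swapped && written[i] != read_back[i ^ (size_t)1U])
        {
            swapped = false;
        }
    }

    if (identical)
    {
        return CMP_IDENTICAL;
    }
    return swapped ? CMP_WORD_PAIRS_SWAPPED : CMP_DIFFERENT;
}
```

In the scenario described above, a data round trip under 64LE would be expected to return CMP_IDENTICAL, and under 32LE CMP_WORD_PAIRS_SWAPPED.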

kerryzhou
NXP TechSupport

Hi Denis,

   Could you also share some test results and debug screenshots for both the 64-bit and 32-bit cases? I will check them on my side later.


Have a great day,
kerry

deniscollis
Contributor V

I have just seen that, even if execution succeeds, you could encounter unexpected results with QSPI0->MCR |= kQSPI_32LittleEndian << 2  --- For example, a register read from the external part returns a word-swapped result.

Clearly, 64-Bit Little-Endian is correct for this MCU.  I need to figure out why the executable code is word-swapped, or perhaps the word-order was swapped when the J-Link wrote the code to the external flash?

kerryzhou
NXP TechSupport

Hi Denis,

    Please send me some examples of the swapped data on your side, along with screenshots of the J-Link write and read results.

   Tomorrow I will test it on my side; please be patient.


Have a great day,
Kerry
