How to use FLEXSPI_SoftwareReset in my own __ramfunc without a hard fault?

mitterha · 04-01-2020

Hello,

I am using IAR with optimization disabled on an RT1020 evaluation board. With XIP enabled, I want to call functions such as FLEXSPI_SoftwareReset from fsl_flexspi in my own code, and I am having trouble with it.

In the example project flexspi_nor_polling_transfer, the complete fsl_flexspi.o and flexspi_nor_flash_ops.o objects are copied to RAM by these .icf file settings:

initialize by copy {
readwrite,
/* Place in RAM flash and performance dependent functions */
object flexspi_nor_flash_ops.o,
object fsl_flexspi.o,
section .textrw
};

I do not want to place my complete module in RAM that way because it would waste too much RAM. Therefore I use the keyword __ramfunc to mark only the functions I want to keep in RAM. If I now call FLEXSPI_SoftwareReset (which the example places in RAM through "initialize by copy") from my own __ramfunc, it generates a hard fault.

I think it has something to do with the definition of the FLEXSPI_SoftwareReset function:

static inline void FLEXSPI_SoftwareReset(FLEXSPI_Type *base)
{
    base->MCR0 |= FLEXSPI_MCR0_SWRESET_MASK;
    while (base->MCR0 & FLEXSPI_MCR0_SWRESET_MASK)
    {
    }
}

With optimization disabled in IAR, the function does not get inlined, and I think the linker resolves the call to a copy of the function placed in flash. If I copy the body of the function directly into my __ramfunc, everything works fine.
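A minimal sketch of that workaround, assuming the register layout visible in the listings in this post (MCR0 at offset 0 of the base, SWRESET in bit 0). The names APP_RAMFUNC, APP_MCR0_SWRESET_MASK and App_FlexSpiSoftwareReset are hypothetical and not part of the SDK, and the poll limit is an addition that the SDK function does not have:

```c
#include <stdbool.h>
#include <stdint.h>

/* __ramfunc is an IAR extension; compile it away elsewhere so the sketch also builds on a host. */
#if defined(__ICCARM__)
#define APP_RAMFUNC __ramfunc
#else
#define APP_RAMFUNC
#endif

/* Bit 0 of MCR0 (FLEXSPI_MCR0_SWRESET_MASK in the SDK), assumed from ORR R1,R1,#0x1 in the listing. */
#define APP_MCR0_SWRESET_MASK 0x1u

/* Hypothetical helper: takes the address of MCR0, so it calls nothing that might live in flash. */
APP_RAMFUNC static bool App_FlexSpiSoftwareReset(volatile uint32_t *mcr0, uint32_t maxPolls)
{
    *mcr0 |= APP_MCR0_SWRESET_MASK;                /* request the software reset */
    while ((*mcr0 & APP_MCR0_SWRESET_MASK) != 0u)  /* hardware clears the bit when done */
    {
        if (maxPolls-- == 0u)
        {
            return false;                          /* bit never cleared within the limit */
        }
    }
    return true;
}
```

It would be called with the address of the register, for example `App_FlexSpiSoftwareReset(&base->MCR0, 100000u)`, where the limit is an arbitrary value. Because the helper does the register access itself, it no longer depends on whether the compiler inlines the SDK function.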

The relevant part of the compiler list file without optimization looks like this:

    845                FLEXSPI_SoftwareReset(psInterfaceData->peripheralBase);
   \       0x70   0x6820             LDR      R0,[R4, #+0]
   \       0x72   0x....'....        BL       FLEXSPI_SoftwareReset

This results in the hard fault.

With optimization it looks like this:

    845                FLEXSPI_SoftwareReset(psInterfaceData->peripheralBase);
   \       0x3E   0x9800             LDR      R0,[SP, #+0]
   \       0x40   0x6801             LDR      R1,[R0, #+0]
   \       0x42   0xF041 0x0101      ORR      R1,R1,#0x1
   \       0x46   0x6001             STR      R1,[R0, #+0]
   \                     ??FlexSpiFlashInit_2: (+1)
   \       0x48   0x6801             LDR      R1,[R0, #+0]
   \       0x4A   0x07C9             LSLS     R1,R1,#+31
   \       0x4C   0xD4FC             BMI.N    ??FlexSpiFlashInit_2

This is the inlined FLEXSPI_SoftwareReset function, and no hard fault occurs.

How does NXP suggest using the function without getting a hard fault and without placing the complete module in RAM via the linker file?

Kind regards,

Stefan

mjbcswitzerland · 04-10-2020

Stefan

Since you only need a small piece of code in RAM, a simple method (portable across IDEs and requiring no linker script changes) is to step through the code as it runs from flash and look at its disassembly, or to enable the IAR assembler mnemonics listing output to get it in a file. Make sure that the code doesn't use any PC-relative addressing. This should be OK for the code that you show, since it addresses relative to the pointer passed to the function, and that pointer is passed in r0, which is standardised.

Then write the small routine as const assembler code - here is an example of the flashing routine I use in SRAM for Kinetis code:

static const unsigned short fnFlashRoutine[] = {                         // to avoid potential compiler in-lining of the routine (removing position independency) the machine code is used directly
    0x2180,    // MOVS   r1,#0x80                                           load the value 0x80 (command complete interrupt flag) to register r1
    0x7001,    // STRB   r1,[r0,#0x00]                                      write r1 (0x80) to the passed pointer location (r0)
    0x7801,    // LDRB   r1,[r0,#0x00]                                      read back from the same location to r1
    0x0609,    // LSLS   r1,r1,#24                                          shift the register content by 24 bits to the left so that the command complete interrupt flag is at bit 31
    0xd5fc,    // BPL    -4                                                 if the command complete interrupt flag bit is '0' (register content is not negative value) branch back to read its value again
    0x4770     // BX     lr                                                 return from sub-routine
};

When you initialise, you can copy this to RAM, form a Thumb-2 call address for it (the RAM address with bit 0 set), and then execute it at any time by calling through that pointer. Here is how I load the above flashing code. You only need to change the actual machine code values, and you have a multi-IDE solution with no maintenance work even if you move to different development tools.

static void (*fnRAM_code)(volatile unsigned char *) = 0;

if (fnRAM_code == 0) {                                               // the first time this is used it will load the program to SRAM
    #define PROG_WORD_SIZE 30                                        // adequate space for the small program
    unsigned int i = 0;
    unsigned char *ptrThumb2 = (unsigned char *)fnFlashRoutine;
    static unsigned short usProgSpace[PROG_WORD_SIZE] = {0};         // make space for the routine in static memory (this will have an even boundary)
    ptrThumb2 =  (unsigned char *)(((CAST_POINTER_ARITHMETIC)ptrThumb2) & ~0x1); // thumb 2 address
    while (i < (sizeof(fnFlashRoutine) / sizeof(fnFlashRoutine[0]))) { // copy only the routine itself to SRAM (it must not exceed PROG_WORD_SIZE half-words)
        usProgSpace[i++] = *(unsigned short *)ptrThumb2;
        ptrThumb2 += sizeof (unsigned short);
    }
    ptrThumb2 = (unsigned char *)usProgSpace;
    ptrThumb2++;                                                     // create a thumb 2 call
    fnRAM_code = (void(*)(volatile unsigned char *))(ptrThumb2);
}

Call it with
fnRAM_code((volatile unsigned char *)FLEXSPI_BASE);
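
The load step can also be wrapped in a small reusable function. This is only a sketch: loadThumbRoutine and ram_routine_t are hypothetical names, the caller supplies the RAM buffer, and the copy uses a plain loop so that it does not rely on any library routine. On a core with caches enabled the buffer may additionally need cache maintenance before the first call, which is outside this sketch.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*ram_routine_t)(volatile unsigned char *);

/* Hypothetical helper: copies `count` half-words of Thumb machine code into `ram`
   and returns a callable pointer, or NULL if the code does not fit. */
static ram_routine_t loadThumbRoutine(const uint16_t *code, size_t count,
                                      uint16_t *ram, size_t capacity)
{
    size_t i;

    if (count > capacity)
    {
        return NULL;                      /* refuse to overrun the RAM buffer */
    }
    for (i = 0; i < count; i++)
    {
        ram[i] = code[i];                 /* copy exactly the routine, nothing more */
    }
    /* Bit 0 set tells the core that the target is Thumb code; the code itself stays half-word aligned. */
    return (ram_routine_t)((uintptr_t)ram | 1u);
}
```

The returned pointer is used exactly like fnRAM_code above. It should only be called on the target, never on a host build.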

The use of machine code completely removes any risk of code changing with optimisation, in-lining or other such things that can potentially cause such routines to fail with compiler version and settings.
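
As a sketch of what changing the machine code values could look like for the FLEXSPI reset: the half-words below are copied from the optimized listing in the question, without the initial load of the base pointer from the stack (r0 already holds the base address when the routine is called this way), and with BX lr appended using the encoding from the Kinetis routine. The array name and the helper thumbCondBranchTarget are hypothetical. The helper only checks, on any host, that the backward branch still lands on the polling load after the code has been extracted. This has not been run on hardware.

```c
#include <stdint.h>

/* Thumb encodings taken from the optimized listing; r0 = FLEXSPI base, MCR0 at offset 0. */
static const uint16_t flexSpiSwResetCode[] = {
    0x6801,         /* LDR   r1,[r0,#0]   read MCR0                               */
    0xF041, 0x0101, /* ORR   r1,r1,#0x1   set SWRESET (32-bit instruction)        */
    0x6001,         /* STR   r1,[r0,#0]   write MCR0                              */
    0x6801,         /* LDR   r1,[r0,#0]   loop: read MCR0 again                   */
    0x07C9,         /* LSLS  r1,r1,#31    move bit 0 to the sign bit              */
    0xD4FC,         /* BMI   loop         bit still set: keep polling             */
    0x4770          /* BX    lr           return to the caller                    */
};

/* Half-word index targeted by a 16-bit conditional branch at half-word index `at`. */
static int thumbCondBranchTarget(const uint16_t *code, int at)
{
    int imm8 = code[at] & 0xFF;   /* low byte is a signed offset in half-words */
    if (imm8 & 0x80)
    {
        imm8 -= 0x100;            /* sign-extend */
    }
    return at + 2 + imm8;         /* PC reads as the branch address plus 4 bytes */
}
```

For the array above, the branch at index 6 resolves to index 4, which is the second LDR, so the loop is the same as in the listing. The branch is relative to its own position and therefore survives the copy to RAM.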

Regards

Mark

[uTasker project developer for Kinetis and i.MX RT]


FelipeGarcia · 04-06-2020

Hello Stefan,

I have tried to replicate your issue by simply adding __ramfunc to FLEXSPI_SoftwareReset and it worked correctly with no hard fault. Please check my description below.

The only modification I made to the flexspi_nor_polling_transfer example was to modify FLEXSPI_SoftwareReset function as follows.

__ramfunc static inline void FLEXSPI_SoftwareReset(FLEXSPI_Type *base)
{
    base->MCR0 |= FLEXSPI_MCR0_SWRESET_MASK;
    while (0U != (base->MCR0 & FLEXSPI_MCR0_SWRESET_MASK))
    {
    }
}

I checked the map file and the function was relocated correctly to DTCM.

I debugged the example and there was no hard fault. Could you please try this simple test to see if it works the same for you?

Best regards,

Felipe


mitterha · 04-07-2020

Hello Felipe,

thank you for your answer.

Yes, it works the same for me. If I add __ramfunc to the NXP driver function, it is placed in internal RAM instead of at a flash address.

I don't think it is a good idea to change the NXP-provided driver on my side, because I would have to redo the changes after every SDK update. On the other hand, I don't think it is a good idea to copy the whole fsl_flexspi.o object to RAM either, because that wastes RAM.

Do you have another idea for copying only the essential functions and data of fsl_flexspi.o to RAM instead of the whole object?

Kind regards,

Stefan

FelipeGarcia · 04-10-2020

Hello Stefan,

Unfortunately, I do not have any other input from my end. Maybe you could contact IAR Technical Support directly to see if they have any more suggestions.

Best regards,

Felipe

mitterha · 04-13-2020

Thank you Felipe and Mark!