Benedek Kupper

How to remap FlexRAM into a single data RAM area

Discussion created by Benedek Kupper on Jul 10, 2020
Latest reply on Jul 24, 2020 by Benedek Kupper

I'd like to share my struggles and eventual success in merging the fragmented default FlexRAM layout of the i.MX RT1010 into a single block.


The reason for doing this is that some large third-party libraries require large chunks of RAM, so we need to use all of the available RAM as data RAM.


So I went about reading the FlexRAM application notes, and tried to set the registers based on that description. That didn't work, so I checked the FlexRAM driver, and there's actually the FLEXRAM_AllocateRam() function in fsl_flexram.c, which does exactly what I need. What remains to be determined is when to call this function. Since the application data already extends beyond the default DTCM size, the RAM needs to be remapped before the RAM variables are initialized at startup (otherwise the initial values would be written to nonexistent addresses). Therefore this call should be added to a strong SystemInitHook() definition, which is called through SystemInit() from ResetISR(), right before the RAM data is initialized.
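To make the register setting more concrete, here is a minimal host-side sketch of how a bank assignment could be packed into a FlexRAM bank configuration word. It assumes the two-bit per-bank encoding described in the FlexRAM application notes (0 = unused, 1 = OCRAM, 2 = DTCM, 3 = ITCM). The helper name flexram_bank_cfg() and the bank ordering (OCRAM first, then DTCM, then ITCM) are assumptions for illustration and not necessarily what the SDK driver does.

```c
#include <assert.h>
#include <stdint.h>

/* Two-bit per-bank encodings (assumed from the FlexRAM application notes). */
enum { BANK_UNUSED = 0u, BANK_OCRAM = 1u, BANK_DTCM = 2u, BANK_ITCM = 3u };

#define TOTAL_BANKS 4u /* i.MX RT1010: 4 banks, 128 KB in total */

/* Hypothetical helper: pack a bank assignment into a configuration word.
 * Assumption: banks are handed out in the order OCRAM, DTCM, ITCM, starting
 * at bank 0. The real driver may use a different order. */
static uint32_t flexram_bank_cfg(uint32_t ocram, uint32_t dtcm, uint32_t itcm)
{
    assert(ocram + dtcm + itcm <= TOTAL_BANKS);

    uint32_t cfg = 0u;
    for (uint32_t bank = 0u; bank < ocram + dtcm + itcm; bank++) {
        uint32_t type = (bank < ocram)        ? BANK_OCRAM
                      : (bank < ocram + dtcm) ? BANK_DTCM
                                              : BANK_ITCM;
        cfg |= type << (bank * 2u); /* two bits per bank */
    }
    return cfg;
}
```

With all four banks assigned to DTCM, every two-bit field holds the DTCM code, so the order assumption does not affect the result in that case.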


So I've added the FLEXRAM_AllocateRam() call to remap all 4 RAM banks to DTCM, and I've modified the linker memory layout to have all of the 128K RAM at SRAM_DTCM. At this point the project built, but I was unable to even step into ResetISR(). It took me a while to realize the only reason that could lead to this: before executing ResetISR(), the CPU loads the stack pointer from the word that precedes the reset vector in memory (the first word of the interrupt vector table). As the stack is placed at the very end of the RAM, it doesn't exist at reset, since the FlexRAM isn't remapped yet. The best thing I could do was to introduce a separate stack for startup only, and then copy and relocate the stack after the FlexRAM is remapped:


These are my changes in startup_<chipname>.c:

// TODO: make sure this is sized properly (no overflow until __set_MSP() call)
__attribute__((used, section(".StartupStack")))
void* startupStack[64];
void* const startupStackEnd = &startupStack[(sizeof(startupStack)/sizeof(startupStack[0]))];

__attribute__ ((used, section(".isr_vector")))
void (* const g_pfnVectors[])(void) = {
    // Core Level - CM7
    startupStackEnd,                       // The initial stack pointer
    ...

[inside ResetISR()]
    // relocate stack to official position as it's now mapped
        unsigned int msp;
        unsigned int* currentStack = (unsigned int*)startupStackEnd, *newStack = (unsigned int*)&_vStackTop;
        __asm volatile ("MRS %0, msp" : "=r" (msp) );
        // copy the used part of the startup stack, from the top down
        while (currentStack > (unsigned int*)msp)
            *(--newStack) = *(--currentStack);
        __asm volatile ("MSR msp, %0" : : "r" (newStack) : );
    // Copy the data sections from flash to SRAM.
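
The relocation step can be tried out on a host machine with plain arrays standing in for the two stacks. This is only a sketch under assumptions: relocate_stack() is a hypothetical helper, and it models a full-descending stack where the used part lies between the current stack pointer and the top of the startup stack. Note that a plain copy does not fix up any pointers to stack locals that were saved in the old stack, so it is safest to do this while the stack holds only return addresses and plain values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy the live part of a full-descending stack,
 * [sp, old_top), so that it ends at new_top.
 * Returns the relocated stack pointer. */
static uint32_t *relocate_stack(uint32_t *sp, uint32_t *old_top, uint32_t *new_top)
{
    assert(sp <= old_top);

    uint32_t *src = old_top;
    uint32_t *dst = new_top;
    while (src > sp) {
        *(--dst) = *(--src); /* walk both stacks from the top down */
    }
    return dst;
}
```

Both pointers have to be decremented on every iteration; otherwise the loop never terminates.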

By adding the ".StartupStack" section to .data in the managed linker script, the startup stack ends up at the very beginning of the DTCM (assuming there are no other custom sections preceding it), which exists at startup as DTCM has 1 bank allocated by default.
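
The difference between the two stack locations can be checked with a little address arithmetic. The sketch below assumes that the DTCM starts at 0x20000000 and that each FlexRAM bank is 32 KB (128 KB divided by 4 banks); dtcm_range_is_mapped() is a hypothetical helper, not an SDK function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DTCM_BASE 0x20000000u     /* assumed DTCM base address */
#define BANK_SIZE (32u * 1024u)   /* assumed: 128 KB / 4 banks */
#define MAX_BANKS 4u

/* Hypothetical helper: is [addr, addr + len) inside the DTCM when
 * dtcm_banks banks are assigned to it? */
static bool dtcm_range_is_mapped(uint32_t addr, uint32_t len, uint32_t dtcm_banks)
{
    assert(dtcm_banks <= MAX_BANKS);

    uint32_t end = DTCM_BASE + dtcm_banks * BANK_SIZE; /* one past the last byte */
    if (addr < DTCM_BASE || addr > end) {
        return false;
    }
    return len <= end - addr;
}
```

With one bank, a 256-byte startup stack at the DTCM base is mapped, while the first word pushed below a stack top of 0x20020000 is not; it only becomes valid once all four banks belong to the DTCM.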


At this point, the project would run. Sometimes. It definitely didn't run with optimizations disabled, and with maximum optimizations it would stop working after being recompiled for some trivial reason. When it failed, it ended up in HardFault due to memory access exception(s). When I tried to debug through FLEXRAM_AllocateRam(), the strangest thing happened: the same image wouldn't fault when stepping through it. With no optimizations it would always fault, so I stepped through the disassembly and found the error in this code:


The register is first cleared, then there's a function call, the result of which is then written to the register. Clearing these registers and then writing them in a separate access leaves an execution window open in which the RAM is in an undefined state. By changing this and similar lines to perform the clear and write in a single register access, the faults no longer occur:
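
As a host-side illustration of that difference (not the actual driver patch), the sketch below models a register as a struct that records every value stored to it. The field mask, the mock register type and the function names are all assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define BANK_CFG_MASK 0xFFu /* hypothetical field mask, for illustration only */

/* Stand-in for a memory-mapped register that logs every store. */
typedef struct {
    uint32_t value;
    uint32_t history[4];
    unsigned writes;
} mock_reg_t;

static void reg_write(mock_reg_t *r, uint32_t v)
{
    assert(r->writes < 4u);
    r->value = v;
    r->history[r->writes++] = v;
}

/* Two-step pattern: after the first store the field is zero, so the
 * hardware briefly sees a configuration nobody asked for. */
static void update_two_step(mock_reg_t *r, uint32_t cfg)
{
    reg_write(r, r->value & ~BANK_CFG_MASK);
    reg_write(r, r->value | (cfg & BANK_CFG_MASK));
}

/* Single-access pattern: build the new value in a local, store it once. */
static void update_single(mock_reg_t *r, uint32_t cfg)
{
    uint32_t v = r->value;
    v = (v & ~BANK_CFG_MASK) | (cfg & BANK_CFG_MASK);
    reg_write(r, v);
}
```

In the two-step version the logged intermediate value has an all-zero field, which is the window described above; the single-access version never exposes it.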


I've attached the entire fsl_flexram patch.


Edit: One more thing, the MPU configuration has to be adjusted to the new RAM space as well, to allow unaligned accesses throughout the whole RAM region.
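
For the MPU region itself, the ARMv7-M MPU encodes a region of 2^(SIZE+1) bytes in its SIZE field and requires the base address to be aligned to the region size. The sketch below only does that arithmetic on a host; mpu_size_field() and mpu_base_aligned() are hypothetical helpers, and the DTCM base of 0x20000000 is an assumption. The region also needs a Normal memory type, since unaligned accesses to Device or Strongly-ordered memory fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: SIZE field for a region of the given byte count.
 * ARMv7-M: region size = 2^(SIZE + 1) bytes, minimum 32 bytes.
 * Returns 0 if the size is not a valid power of two. */
static uint32_t mpu_size_field(uint32_t bytes)
{
    if (bytes < 32u || (bytes & (bytes - 1u)) != 0u) {
        return 0u;
    }
    uint32_t log2 = 0u;
    while ((bytes >> log2) != 1u) {
        log2++;
    }
    return log2 - 1u;
}

/* Hypothetical helper: the region base must be a multiple of its size. */
static bool mpu_base_aligned(uint32_t base, uint32_t bytes)
{
    assert(bytes != 0u && (bytes & (bytes - 1u)) == 0u);
    return (base & (bytes - 1u)) == 0u;
}
```

For the remapped 128 KB DTCM this gives a SIZE field of 16, compared with 14 for a single 32 KB bank.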


This is all it takes to achieve something that really should be made simpler: map all onboard RAM for data use.


I would have greatly appreciated it if such an example were already available in the NXP SDK, as this was a long and painful process. We also contacted an NXP FAE but got no follow-up on our initial questions. So my conclusion is: don't expect more than what you pay for with this chip.