How to manage the MMU on Vybrid using DS-5

Contributor III

I have developed an MQX-based audio application on the Vybrid tower system, using DS-5 for development. I loaded the application onto the Vybrid using the ddr.scf scatter file from the MQX toolkit, and it works fine.

I have flashed the MQX-based audio application to QuadSPI serial NAND flash using the quadspi_load application available in the Vybrid sample code. Our target platform, which is based on the Vybrid tower, has no DDR3 RAM, so I want to run my application entirely from internal SRAM. When I booted my audio application from QuadSPI NAND flash, its performance degraded compared to the DDR version of the application (which uses ddr.scf from the MQX toolkit).

I want to use the complete 1.5 MB of internal SRAM, including OCRAM-gfxRAM, for my application.

How do I add OCRAM-gfxRAM to the scatter file so that my application can use it?

Where should I make changes to enable and configure the MMU?

Do I need any configuration to increase the performance of my application?

How do I make SRAM cacheable?

Do I need to add any extra code to my application to enable and disable the D-cache and I-cache?

BR/-

Nihad

13 Replies

NXP Employee

You can start by moving some of the load addresses for your data sections to 0x3f4XXXXX and keeping others in the 0x3f0XXXXX range, for example:

#define DATA_BASE_ADDR_START        0x3f470000
#define DATA_BASE_ADDR_END          0x3f47ebb0
#define DATA_SIZE                   (DATA_BASE_ADDR_END - DATA_BASE_ADDR_START)

#define DATA_SHARED_START           0x3f440000
#define DATA_SHARED_END             0x3f44fff0

#define RESERVED_BASE_ADDR_START    0x3f07ebf0
#define RESERVED_BASE_ADDR_END      0x3f07eff0
#define RESERVED_SIZE               (RESERVED_BASE_ADDR_END - RESERVED_BASE_ADDR_START)

Take a look at the ARM linker guide: http://infocenter.arm.com/help/topic/com.arm.doc.dui0474c/DUI0474C_using_the_arm_linker.pdf

Chapter 8, "Using scatter files", has good explanations; see in particular the section "Images with a complex memory map".

The trick is to distinguish between the load region, which in your case will be the QSPI address range, and the execution regions, which will be split between sysRAM and gfxRAM.

Contributor III

Hi,

#define CODE_BASE_ADDR_START       0x20000800

#define CODE_BASE_ADDR_END          0x2002FFF0

#define CODE_SIZE                   (CODE_BASE_ADDR_END - CODE_BASE_ADDR_START)

;gfxRAM space

#define DATA_BASE_ADDR_START    0x3f400000

#define DATA_BASE_ADDR_END       0x3f47ebb0

#define DATA_SIZE               (DATA_BASE_ADDR_END - DATA_BASE_ADDR_START)

;sysRAM1 shared memory

#define DATA_SHARED_START       0x3f040000

#define DATA_SHARED_END          0x3f07FFF0

My application is written for the A5 core only. I tried to move my complete data section to gfxRAM and to use sysRAM1 for shared memory, as shown above, but the system does not boot. With the settings shown below the system boots, but performance is bad and I do not get proper audio playback. I am using the DS-5 starter kit for Vybrid. Does it limit any configuration settings?

Also, someone in the Freescale community pointed out that MQX disables the MMU and data cache by default, and that the performance degradation is due to this. How do I enable the MMU and D-cache?

#define CODE_BASE_ADDR_START       0x20000800

#define CODE_BASE_ADDR_END          0x2002FFF0

#define CODE_SIZE                   (CODE_BASE_ADDR_END - CODE_BASE_ADDR_START)

;sysRAM0 shared memory

#define DATA_BASE_ADDR_START    0x3F000000

#define DATA_BASE_ADDR_END       0x3F03FFF0

#define DATA_SIZE               (DATA_BASE_ADDR_END - DATA_BASE_ADDR_START)

;sysRAM1 shared memory

#define DATA_SHARED_START       0x3F040000

#define DATA_SHARED_END          0x3F07FFF0

I am attaching the memory map file and the scatter file for your reference. I am using a message queue for my audio buffers.

msg_pool = _msgpool_create(13385, 12, 1, 0);

BR-

Nihad

Senior Contributor IV

Abdul,

This is an excerpt from the MQX 4.0.1 init_bsp.c file for the twrvf65gs10_a5 BSP:

    if (_mqx_monitor_type == MQX_MONITOR_TYPE_NONE)
    {
        /* Enable MMU and L1 cache */
        /* alloc L1 mmu table */
        L1PageTable = _mem_alloc_align(MMU_TRANSLATION_TABLE_SIZE, MMU_TRANSLATION_TABLE_ALIGN);
        /* None cacheable is comon with strongly ordered. MMU doesnt work with another init configuration */
        _mmu_vinit(PSP_PAGE_TABLE_SECTION_SIZE(PSP_PAGE_TABLE_SECTION_SIZE_1MB) | PSP_PAGE_DESCR(PSP_PAGE_DESCR_ACCESS_RW_ALL) | PSP_PAGE_TYPE(PSP_PAGE_TYPE_STRONG_ORDER), (pointer)L1PageTable);
        /* add region in sram area */
        _mmu_add_vregion((pointer)__INTERNAL_SRAM_BASE, (pointer)__INTERNAL_SRAM_BASE, (_mem_size) 0x00100000, PSP_PAGE_TABLE_SECTION_SIZE(PSP_PAGE_TABLE_SECTION_SIZE_1MB) | PSP_PAGE_TYPE(PSP_PAGE_TYPE_CACHE_NON)   | PSP_PAGE_DESCR(PSP_PAGE_DESCR_ACCESS_RW_ALL));
        /* add cached region in ddr area */
        _mmu_add_vregion((pointer)__EXTERNAL_DDRAM_BASE, (pointer)__EXTERNAL_DDRAM_BASE, __EXTERNAL_DDRAM_SIZE, PSP_PAGE_TABLE_SECTION_SIZE(PSP_PAGE_TABLE_SECTION_SIZE_1MB) | PSP_PAGE_TYPE(PSP_PAGE_TYPE_CACHE_WBNWA) | PSP_PAGE_DESCR(PSP_PAGE_DESCR_ACCESS_RW_ALL));
         _mmu_venable();

        _DCACHE_ENABLE();
        _ICACHE_ENABLE();
    }

There are two _mmu_add_vregion() calls. The first makes internal sysRAM non-cacheable (PSP_PAGE_TYPE_CACHE_NON) and the second makes DDRAM cacheable (PSP_PAGE_TYPE_CACHE_WBNWA). You can either use the MQX BSP cloning wizard to create a new BSP and edit its init_bsp.c file, or modify the existing BSP and rebuild it.

It looks like you are executing code from QSPI. MQX does not include QSPI in the D-cacheable part of the address map. As I told you previously, the I-cache caches all instructions, but it does not cache constants that are stored in code memory and read with CPU load instructions (those go through the D-cache). DS-5 places and uses a lot of constants in code memory. You could try including the QSPI code area in the list of D-cacheable regions and compare execution speeds.

Contributor III

Hi,

I tried to edit the init_bsp.c file as shown below:

/* add region in sram area */

        _mmu_add_vregion((pointer)__INTERNAL_SRAM_BASE, (pointer)__INTERNAL_SRAM_BASE, (_mem_size) 0x00100000, PSP_PAGE_TABLE_SECTION_SIZE(PSP_PAGE_TABLE_SECTION_SIZE_1MB) | PSP_PAGE_TYPE(PSP_PAGE_TYPE_CACHE_WBNWA)   | PSP_PAGE_DESCR(PSP_PAGE_DESCR_ACCESS_RW_ALL));

But my application hangs after booting. I also tried adding a cached region from my main function using _mmu_add_vregion, but the application hangs there too. Also, if I move DATA_BASE_ADDR from sysRAM0 to gfxRAM, the application does not boot.

About my application:

It has to receive and execute commands sent over Ethernet by a client application, and stream audio from the Vybrid to that Ethernet client running on another device. I am sending PCM samples, 13372 bytes at a time. So I created a message pool into which my capture task pushes audio data. Another task pulls audio data of size 13372 and sends it over Ethernet in a loop. Below are snippets of my code that create a message queue for audio:

#define AUDIO_PACKET_SZ 13372

typedef struct

{

    /* Message header */

    MESSAGE_HEADER_STRUCT HEADER;

    /* Data length */

    uint_32 LENGTH;

    /* Data */

    char DATA[AUDIO_PACKET_SZ];

   // uint_16 channels;        /* number of channels */

    //int_32 DATA[2][REC_BLOCK_SIZE];        /* PCM output samples [ch][sample] */

} REC_MESSAGE, _PTR_ REC_MESSAGE_PTR;

msg_pool = _msgpool_create(sizeof(REC_MESSAGE),8, 8, 0);

msg_ptr = (REC_MESSAGE_PTR) _msg_alloc(msg_pool);

//Here routine to fill msg_ptr with audio data

_msgq_send_queue(msg_ptr, _msgq_get_id(0, WRITE_QUEUE));

When I boot from QSPI and run from internal SRAM, the problem I face is that _msg_alloc(msg_pool) frequently returns NULL. Sometimes it allocates and returns a message pointer, but most of the time it fails, so I do not get continuous audio. That is what I meant by bad performance. When I run the same application from DDR it works fine. Could you point out how to fix this memory allocation problem? Is there any way to create msg_pool in gfxRAM? Nearly 512 KB of memory is available there, but I am not able to use it.
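For a rough sense of how much memory this pool needs, its footprint can be estimated on a host machine. The following is only a sketch: the 32-byte header size is an assumption (the real value should be taken from sizeof(MESSAGE_HEADER_STRUCT) in the MQX version in use), the function names are hypothetical and not part of MQX, and the region size in the usage note is simply the sysRAM0 data range quoted earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AUDIO_PACKET_SZ    13372u
#define ASSUMED_HEADER_SZ  32u    /* assumption: stand-in for sizeof(MESSAGE_HEADER_STRUCT) */

/* Estimated size of one REC_MESSAGE: header + LENGTH field + DATA array. */
size_t rec_message_size(void)
{
    return ASSUMED_HEADER_SZ + sizeof(uint32_t) + AUDIO_PACKET_SZ;
}

/* Bytes consumed by `count` messages of `msg_size` bytes (pool overhead ignored). */
size_t pool_bytes(size_t msg_size, size_t count)
{
    return msg_size * count;
}

/* How many whole messages fit in a region of `region_bytes` bytes. */
size_t max_messages(size_t region_bytes, size_t msg_size)
{
    assert(msg_size > 0);
    return region_bytes / msg_size;
}
```

Under these assumptions, the initial 8 messages take roughly 105 KB, and every growth step of 8 takes the same again, while a region of 0x3FFF0 bytes could hold at most 19 such messages even if nothing else were allocated from it.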

BR/-

Nihad

Senior Contributor IV


There may be reasons why the MQX <= 4.0.1 BSP does not make SRAM cacheable. For example, USB structs and buffers in RAM have to be flushed from the cache after the CPU initializes them, and invalidated after a USB transfer completes, before the CPU reads them again. Otherwise something will most likely hang. Ethernet may have similar structs and buffers in RAM. DMA transfers are yet another reason to flush and invalidate.
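Cache flush and invalidate operations act on whole cache lines, so a buffer shared with USB, Ethernet, or DMA hardware is usually expanded to cache-line boundaries before maintenance. A minimal host-side sketch of that arithmetic follows. The 32-byte line size is an assumption to be checked against the reference manual, the helper names are hypothetical, and the actual MQX cache maintenance calls are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 32u   /* assumption: L1 data cache line size in bytes */

/* Round an address down to the start of its cache line. */
uint32_t line_floor(uint32_t addr)
{
    return addr & ~(CACHE_LINE - 1u);
}

/* Round an address up to the next cache-line boundary. */
uint32_t line_ceil(uint32_t addr)
{
    return (addr + CACHE_LINE - 1u) & ~(CACHE_LINE - 1u);
}

/* Number of cache lines touched by a buffer of `len` bytes at `addr`. */
uint32_t lines_covered(uint32_t addr, uint32_t len)
{
    assert(len > 0);
    return (line_ceil(addr + len) - line_floor(addr)) / CACHE_LINE;
}
```

A buffer that does not start and end on a line boundary shares its first or last line with neighbouring data, which is one reason hardware buffers are often given their own aligned, uncached area instead.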

It is good that MQX 4.0.2 provides finer granularity of MMU pages. The availability of 4 KB pages (compared to 1 MB sections in earlier versions) makes it possible to dedicate part of the 1 MB (or 1.5 MB) internal SRAM to uncached hardware buffers.
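To illustrate the granularity point with numbers from this thread: the 512 KB gfxRAM block at 0x3F400000 is smaller than one 1 MB section, so a 1 MB mapping necessarily covers addresses beyond it, whereas 4 KB pages can cover it exactly. The helpers below are a hypothetical host-side check, not MQX code.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_1MB 0x00100000u
#define PAGE_4KB    0x00001000u

/* Number of mapping units (pages or sections) needed to cover `size` bytes. */
uint32_t units_needed(uint32_t size, uint32_t unit)
{
    assert(unit != 0u);
    return (size + unit - 1u) / unit;
}

/* Bytes mapped beyond the end of the region because of unit granularity. */
uint32_t overmapped_bytes(uint32_t size, uint32_t unit)
{
    return units_needed(size, unit) * unit - size;
}

/* Non-zero if `addr` is a multiple of `unit`. */
int is_aligned(uint32_t addr, uint32_t unit)
{
    return (addr % unit) == 0u;
}
```

For a 0x80000-byte region this gives 128 pages of 4 KB with nothing over-mapped, or a single 1 MB section that maps an extra 0x80000 bytes with the same attributes.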

Contributor III

Hi all,

I fixed the issue. I appreciate all the support and input I got while debugging it.

Two things need to be taken care of to use gfx-SRAM (0x3F400000 - 0x3F47FFF0) in an application.

1) Add the gfx-SRAM region to the scatter file and move some portion of your application into it. I moved all of KERNEL DATA.

#define RESERVED_BASE_ADDR_START 0x3F400000
#define RESERVED_BASE_ADDR_END   0x3F47FFF0
#define RESERVED_SIZE            (RESERVED_BASE_ADDR_END - RESERVED_BASE_ADDR_START)

RESERVED_START RESERVED_BASE_ADDR_START ALIGN 16

    {

        * (KERNEL_DATA_START)     ; start of kernel data

    }

  

     RESERVED_END RESERVED_BASE_ADDR_END

    {

        * (KERNEL_DATA_END)     ; end of kernel data      

    } 

2) Add the gfx-SRAM region to the MMU table. You can also make gfx-SRAM cacheable. If you forget to add it to the MMU table, the application hangs when it accesses the gfx-SRAM space.

_mmu_add_vregion((pointer) 0x3f400000, (pointer) 0x3f400000, (_mem_size) 0x0007ffff, PSP_PAGE_TABLE_SECTION_SIZE(PSP_PAGE_TABLE_SECTION_SIZE_4KB) | PSP_PAGE_TYPE(PSP_PAGE_TYPE_CACHE_WBNWA)   | PSP_PAGE_DESCR(PSP_PAGE_DESCR_ACCESS_RW_ALL));

    _mmu_venable();

    _DCACHE_ENABLE();

    _ICACHE_ENABLE();

BR/-

Nihad

Contributor III

Hi,

I have one question about QSPI flash. Please see my post here:

How to use QSPI-NAND flash for Non Volatile memory to store backup data while system configured for ...

BR/-

Nihad

NXP Employee

Hi Nihad

What happens if you run your application from SRAM directly?

I mean loaded directly into SRAM, with no QSPI involved. Does it work fine?

One thing I can think of (when relocating code from QSPI to SRAM) is that some sections, such as .bss or another data section, are not initialized correctly.

Maybe you can add some code to your system initialization to zero out these sections.
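A minimal sketch of such an initialization step, written as plain C so it can be tried on a host. On the target, the region bounds would come from linker-generated symbols; the function and parameter names here are hypothetical. With the ARM toolchain the C library startup normally performs the copy and zero-fill for regions described in the scatter file, so this would only be a fallback for sections it does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero-fill a word-aligned region [start, end), as is done for .bss. */
void zero_region(volatile uint32_t *start, volatile uint32_t *end)
{
    assert(start <= end);
    while (start < end) {
        *start++ = 0u;
    }
}

/* Copy `words` 32-bit words from the load address (e.g. flash)
 * to the execution address (e.g. SRAM), as is done for initialized data. */
void copy_region(const uint32_t *load, volatile uint32_t *exec, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        exec[i] = load[i];
    }
}
```

Calling these very early, before any code reads the affected variables, matters more than the routines themselves; if the MMU and caches are already enabled at that point, the written data may also need to be flushed.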

I think it would be worth trying first to load your app directly into SRAM.

This way you can debug step by step with DS-5 and check where the application hangs when it is split between sysRAM and gfxRAM.

It is hard to tell what is happening. If you could share an application with which I can reproduce the issue, I will be happy to help you debug it.

I mean you can strip all the audio code from your application, or replace it with dummy code, and just leave the _msg_alloc calls so the issue is reproducible.

Another way to debug this when booting from QSPI is to place an infinite loop at some early point in your code; then you can connect with JTAG. (I am not sure whether it is possible to connect with DS-5 without loading an image, just loading symbols.) If it is, you can single-step or place breakpoints to see where the hang happens after manually breaking out of the infinite loop.

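volatile int mydebug = 1;   /* set to 0 from the debugger to leave the loop */

...

while (mydebug) { }         /* spin here until the debugger clears mydebug */

...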

Contributor III

When I run my application from SRAM, initially msg_ptr = (REC_MESSAGE_PTR) _msg_alloc(msg_pool); returns NULL, but not continuously. Audio packets are dropped intermittently, and after some time messages are allocated again, so I get proper audio playback after a while. The number of allocation failures is not constant across debug cycles: it may fail about 400 times if I power cycle the tower board and run, and about 16000 times if I reload and debug without a power cycle.

According to my scatter file:

#define DATA_BASE_ADDR_START 0x3f040000
#define DATA_BASE_ADDR_END   0x3f07fff0

KERNEL_DATA_START DATA_BASE_ADDR_START ALIGN 16

    ;KERNEL_DATA_START +0 ALIGN 16

    {

        * (KERNEL_DATA_START)     ; start of kernel data

        * (SRAM_POOL_START)

        * (UNCACHED_DATA_START)

    }

msg_pool = _msgpool_create(sizeof(REC_MESSAGE),8, 8, 0); creates the message pool in the DATA_BASE_ADDR range. Messages are always allocated from the system memory pool, which lies in the DATA_BASE_ADDR range. So I thought my application was unable to allocate memory from the system memory pool, and that this is why audio is skipped. I therefore tried to move KERNEL_DATA to another region in gfx-SRAM:

#define RESERVED_BASE_ADDR_START 0x3F400000
#define RESERVED_BASE_ADDR_END   0x3F47FFF0
#define RESERVED_SIZE            (RESERVED_BASE_ADDR_END - RESERVED_BASE_ADDR_START)

RESERVED_START RESERVED_BASE_ADDR_START ALIGN 16

    {

        * (KERNEL_DATA_START)     ; start of kernel data

    }

   

     RESERVED_END RESERVED_BASE_ADDR_END

    {

        * (KERNEL_DATA_END)     ; end of kernel data       

    }  

But during boot the application hangs in rtcs_init();

In my main() function I call rtcs_init() to initialize the Ethernet stack used by my application. This hang happens only when I move KERNEL_DATA to gfxSRAM.

When I debugged, I found that control goes to dispatch_gic.S.

This happens when rtcs_init calls RTCS_msgq_send_blocked(message, RTCS_data_ptr->TCPIP_msg_pool) from RTCS_cmd_issue().

Any idea why rtcs_init hangs when I move KERNEL_DATA to gfxSRAM?

Below are the snippets of code that execute before the rtcs_init call in my main function:

_mmu_add_vregion((pointer) 0x3f400000, (pointer) 0x3f400000, (_mem_size) 0x0007ffff, PSP_PAGE_TABLE_SECTION_SIZE(PSP_PAGE_TABLE_SECTION_SIZE_1MB) | PSP_PAGE_TYPE(PSP_PAGE_TYPE_CACHE_WBNWA)   | PSP_PAGE_DESCR(PSP_PAGE_DESCR_ACCESS_RW_ALL));

    _mmu_venable();

    _DCACHE_ENABLE();

    _ICACHE_ENABLE();

    _DCACHE_FLUSH();

    /* Initialise RTCS */

    rtcs_init();

//Below this other application code follows

BR/-

Nihad

Contributor III

Hi,

This post is a continuation of my previous post.

I have attached a sample project that uses gfx-SRAM for KERNEL_DATA. The application hangs when I call

ipcfg_bind_staticip (BSP_DEFAULT_ENET_DEVICE, &ip_data);

from the rtcs_init() function.

The application works when I move KERNEL_DATA to the sysRAM1 space (#define DATA_BASE_ADDR_START 0x3f040000, #define DATA_BASE_ADDR_END 0x3f07fff0).

Please let me know why it hangs when I use gfxRAM, and suggest how to fix it.

BR/-

Nihad

NXP Employee

Thanks for the sample code. I tried to set up the project, but there is no workspace file. If you can share it, that would be great. In any case, I am a little busy with other things right now, so I will come back to you as soon as I have more bandwidth to focus on this.

Senior Contributor IV

Abdul,

It seems there is more to change. I don't remember whether you mentioned the MQX version you are using, but in 4.0.1, just below the place in init_bsp.c where the calls to _mmu_add_vregion() are made, there is another piece of code that checks the kernel data allocation address. For proper operation, I think kernel data should be allocated in an uncached area somewhere at or above __INTERNAL_SRAM_BASE, which is defined as 0x3F000000 in twrvf65gs10_a5.h in the MQX BSP sources. To get the same performance as when executing with data placed in cached DDR, I think you should call _mmu_add_vregion() to make gfxRAM cached, keep sysRAM uncached, and allocate kernel data in sysRAM.
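The placement rule suggested here can be expressed as a small address classifier. This is a host-side sketch only: the address ranges are the ones quoted in this thread, the enum and function names are hypothetical, and it is not the check that init_bsp.c actually performs.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    REGION_SYSRAM0,   /* 0x3F000000 - 0x3F03FFFF */
    REGION_SYSRAM1,   /* 0x3F040000 - 0x3F07FFFF */
    REGION_GFXRAM,    /* 0x3F400000 - 0x3F47FFFF */
    REGION_OTHER
} region_t;

/* Map an address to the internal RAM block it falls in. */
region_t classify(uint32_t addr)
{
    if (addr >= 0x3F000000u && addr <= 0x3F03FFFFu) return REGION_SYSRAM0;
    if (addr >= 0x3F040000u && addr <= 0x3F07FFFFu) return REGION_SYSRAM1;
    if (addr >= 0x3F400000u && addr <= 0x3F47FFFFu) return REGION_GFXRAM;
    return REGION_OTHER;
}

/* Non-zero if both ends of the kernel data area lie in sysRAM,
 * i.e. in the area this reply suggests keeping uncached. */
int kernel_data_in_sysram(uint32_t start, uint32_t end)
{
    assert(start <= end);
    region_t a = classify(start);
    region_t b = classify(end);
    return (a == REGION_SYSRAM0 || a == REGION_SYSRAM1) &&
           (b == REGION_SYSRAM0 || b == REGION_SYSRAM1);
}
```

Run against the two scatter-file variants discussed in this thread, the sysRAM1 placement (0x3f040000 to 0x3f07fff0) passes and the gfxRAM placement (0x3F400000 to 0x3F47FFF0) does not.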

NXP Employee

To enable the cache in MQX, see this thread:

https://community.freescale.com/message/335259#335259

Do you have a similar project you can share in which the issue of moving to gfxSRAM is reproducible, so that I can try it on my side?
