Question Regarding SRAM and Instruction Cache MPC5643L


tektronix
Contributor II

I have a question regarding the MPC5643L architecture. I have gone through the reference manuals and the e200z4 data sheet but could not find this information. There is very little information regarding the SRAM.

I enabled both cores of the MPC5643L, with each core accessing the SRAM in its own area. According to the architecture of the chip, the processor cores share the SRAM bus, so there should be a performance degradation. That does not seem to be the case: the time to read, for example, a certain number of 32-bit words is the same whether one core or both cores access the RAM. This suggests that something in between is taking care of it.

(Note: I am not running the code from flash; I am running the code from SRAM directly. We are using the TRK-USB-MPC5643L.)

My other question is regarding the instruction cache. I tried enabling the instruction cache by writing 3 to SPR 1011. Then, in order to evaluate the performance of the cache, I called a few instructions in the code, but I don't see any difference in the timings with the instruction cache enabled or disabled.

(Note: for this test I also download the code directly to SRAM and execute it from there.)

If I am missing anything, please let me know.

trytohelp
NXP Employee

Hi,

First of all, we're very sorry for the delay in responding.

It seems this thread fell outside our scope.

We're working to improve the community.

The development team has been contacted, and below is the information I found, associated with SR# 1-1250781390.

+++++++++++++++++++

First of all, I need to know your core configuration. Normally Leopard runs in lock-step mode (LSM), so both cores execute the same instructions unless the user changes the mode from LSM to DPM. I hope that you are using LSM. Please let me know which mode you are running in. Check the system status register in the SSCM module; its LSM bit shows the mode.

  1. In Leopard, two 64 KB SRAM arrays make up the 128 KB SRAM. There are two address decoders, and one gives access to the lower half of the SRAM. So when both cores read a location in SRAM, only one address decoder is selected and it allows the read. A read multiplexer then duplicates the read data and sends it to both cores. So there is no penalty and no performance degradation.

  2. Regarding the I-cache: if you run from RAM, you may not see a difference. I hope that you are running the system at 120 MHz? Normally the SRAM wait states are defined in the MUDCR register. If the system frequency is <= 80 MHz, 0 wait states are used; otherwise it needs to be set to 1 wait state. If you run the code from flash, you will see that enabling the I-cache gives better performance than leaving it disabled: flash needs more wait states, so execution speeds up when the instructions are in the cache. The performance also depends on the code. Automotive code contains many branch instructions that change the flow, which impacts performance.

What type of application are you working on?

To enable the I-cache, the user needs to invalidate it before use. Example:

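# invalidate and enable the instruction cache
__icache_cfg:
  e_li    r5, 0x2             # ICINV: request invalidation
  mtspr   1011, r5            # SPR 1011 is L1CSR1
  e_li    r7, 0x4             # mask for the abort flag (ICABT)
  e_li    r8, 0x2             # mask for ICINV
  e_lwi   r11, 0xFFFFFFFB     # mask that clears ICABT
__icache_inv:
  mfspr   r9, 1011
  and.    r10, r7, r9         # was the invalidation aborted?
  e_beq   __icache_no_abort
  and.    r10, r11, r9        # yes: clear the abort flag
  mtspr   1011, r10
  e_b     __icache_cfg        # and restart the invalidation
__icache_no_abort:
  and.    r10, r8, r9         # still invalidating?
  e_bne   __icache_inv
  mfspr   r5, 1011
  e_ori   r5, r5, 0x0001      # set the enable bit (ICE)
  se_isync
  msync
  mtspr   1011, r5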
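For readers who find the control flow easier to follow in C, here is a rough sketch of the same invalidate, poll, retry-on-abort, then enable sequence. It is only an illustration under assumptions: the `l1csr1_ops` accessors and the `sim_*` simulated register are hypothetical names invented here so that the logic can be exercised on a host PC. On the target the accessors would have to wrap `mfspr`/`mtspr` 1011, and the simulated register says nothing about real hardware timing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit masks taken from the constants used in the assembly above. */
#define L1CSR1_ICE   0x1u  /* enable bit, OR-ed in at the end */
#define L1CSR1_ICINV 0x2u  /* invalidation requested / in progress */
#define L1CSR1_ICABT 0x4u  /* invalidation aborted */

/* Hypothetical accessors; on target they would wrap mfspr/mtspr 1011. */
typedef struct {
    uint32_t (*read)(void);
    void (*write)(uint32_t value);
} l1csr1_ops;

/* Mirrors the assembly; returns how many times invalidation was started. */
static unsigned icache_invalidate_and_enable(const l1csr1_ops *ops)
{
    unsigned starts = 0;
    int aborted;

    assert(ops != NULL && ops->read != NULL && ops->write != NULL);
    do {
        uint32_t v;

        ops->write(L1CSR1_ICINV);              /* __icache_cfg */
        starts++;
        aborted = 0;
        do {                                   /* __icache_inv */
            v = ops->read();
            if (v & L1CSR1_ICABT) {
                ops->write(v & ~L1CSR1_ICABT); /* clear abort, restart */
                aborted = 1;
                break;
            }
        } while (v & L1CSR1_ICINV);            /* wait until done */
    } while (aborted);

    /* On target, the isync/msync pair from the assembly goes here. */
    ops->write(ops->read() | L1CSR1_ICE);
    return starts;
}

/* Host-only simulated register so the control flow can be exercised. */
static uint32_t sim_reg;
static unsigned sim_busy_reads;  /* polls before ICINV self-clears */
static int sim_abort_once;       /* nonzero: first invalidation aborts */

static uint32_t sim_read(void)
{
    if (sim_reg & L1CSR1_ICINV) {
        if (sim_abort_once) {
            sim_abort_once = 0;
            sim_reg |= L1CSR1_ICABT;
        } else if (sim_busy_reads > 0) {
            sim_busy_reads--;
        } else {
            sim_reg &= ~L1CSR1_ICINV;
        }
    }
    return sim_reg;
}

static void sim_write(uint32_t value) { sim_reg = value; }

/* Resets the simulated register and runs the sequence against it. */
static unsigned sim_run(int abort_once, unsigned busy_reads)
{
    const l1csr1_ops ops = { sim_read, sim_write };

    sim_reg = 0;
    sim_abort_once = abort_once;
    sim_busy_reads = busy_reads;
    return icache_invalidate_and_enable(&ops);
}
```

Calling `sim_run(1, 2)` exercises the abort path once before the invalidation completes. Note that the original post wrote 3 to SPR 1011 in a single step, which sets the invalidate and enable bits together instead of waiting for the invalidation to finish first.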

Flash wait states need to be set in PFCRx register.

Frequency    Flash Wait State    B02_APC    B02_WWSC    B02_RWSC
<=120 MHz    3 cycles            3          3           3
<=80 MHz     2 cycles            2          2           2
<=60 MHz     1 cycle             1          1           1
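As a small illustration, the wait-state rules quoted in this reply can be written as two lookup helpers in C. This is only a sketch: the function names are made up for this example, the values are copied from the table and from the SRAM rule above (0 wait states up to 80 MHz, otherwise 1), and writing the result into PFCRx or MUDCR is deliberately left out because the register layouts are not given in this thread.

```c
#include <assert.h>

/* Flash wait states per the table above; -1 if the frequency is not listed. */
static int flash_wait_states(unsigned sysclk_mhz)
{
    if (sysclk_mhz <= 60u)  return 1;
    if (sysclk_mhz <= 80u)  return 2;
    if (sysclk_mhz <= 120u) return 3;
    return -1;  /* above the highest row of the table */
}

/* SRAM wait states per the rule quoted in the reply. */
static int sram_wait_states(unsigned sysclk_mhz)
{
    return (sysclk_mhz <= 80u) ? 0 : 1;
}

/* Boundary checks that mirror the table rows. */
static void check_wait_state_tables(void)
{
    assert(flash_wait_states(60u) == 1);
    assert(flash_wait_states(80u) == 2);
    assert(flash_wait_states(120u) == 3);
    assert(sram_wait_states(80u) == 0);
    assert(sram_wait_states(120u) == 1);
}
```

The same value would be used for B02_APC, B02_WWSC and B02_RWSC, since the table lists identical numbers in all three columns.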

+++++++++++++++++++

Hope this will help you.

Pascal

tektronix
Contributor II

Hello Pascal


>>>> Thanks for the reply. I was busy with some more experiments. Below are the answers to your questions. Please have a look and let me know.

First of all, I need to know your core configuration. Normally Leopard runs in lock-step mode (LSM), so both cores execute the same instructions unless the user changes the mode from LSM to DPM. I hope that you are using LSM. Please let me know which mode you are running in.


>>>> I am running the processor in Decoupled Parallel Mode (DPM). One more thing I want to ask: in DPM, access to each RAM controller goes through its respective XBAR, so if both CPUs try to access the same area, we should see some degradation, right?


Check the system status register in the SSCM module; its LSM bit shows the mode.

  1. In Leopard, two 64 KB SRAM arrays make up the 128 KB SRAM. There are two address decoders, and one gives access to the lower half of the SRAM. So when both cores read a location in SRAM, only one address decoder is selected and it allows the read. A read multiplexer then duplicates the read data and sends it to both cores. So there is no penalty and no performance degradation.

>>>> In DPM, when both cores try to access the same RAM controller, we must see degradation, assuming fixed-priority or round-robin mode on the slave port.
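To make the expectation in the last remark concrete, here is a toy model in C of two masters issuing back-to-back accesses, either to one shared slave port with round-robin arbitration or to two separate ports. Everything in it is an assumption for illustration: one access per port per cycle, no wait states, no pipelining, and the function name is invented here. It is not a model of the actual MPC5643L crossbar.

```c
#include <assert.h>

/*
 * Toy model: two masters each issue n back-to-back accesses.
 * A slave port serves at most one access per cycle.
 * Returns the cycle count after which both masters are done.
 */
static unsigned cycles_to_finish(unsigned n, int same_slave)
{
    unsigned remaining[2];
    unsigned cycle = 0;
    unsigned next = 0;  /* round-robin pointer: master to prefer next */

    assert(n <= 0x7FFFFFFFu);  /* keep 2 * n within range */
    remaining[0] = n;
    remaining[1] = n;

    while (remaining[0] > 0 || remaining[1] > 0) {
        cycle++;
        if (same_slave) {
            /* One grant per cycle; prefer `next` if it still has work. */
            unsigned granted = (remaining[next] > 0) ? next : 1u - next;
            remaining[granted]--;
            next = 1u - granted;
        } else {
            /* Separate ports: both masters progress in the same cycle. */
            unsigned m;
            for (m = 0; m < 2; m++) {
                if (remaining[m] > 0) {
                    remaining[m]--;
                }
            }
        }
    }
    return cycle;
}
```

Under these assumptions the shared-port case takes twice as many cycles as the separate-port case, which is the kind of degradation a timing measurement would be looking for.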



  2. Regarding the I-cache: if you run from RAM, you may not see a difference. I hope that you are running the system at 120 MHz? Normally the SRAM wait states are defined in the MUDCR register. If the system frequency is <= 80 MHz, 0 wait states are used; otherwise it needs to be set to 1 wait state. If you run the code from flash, you will see that enabling the I-cache gives better performance than leaving it disabled: flash needs more wait states, so execution speeds up when the instructions are in the cache. The performance also depends on the code. Automotive code contains many branch instructions that change the flow, which impacts performance.

What type of application are you working on?


>>>> I want to run the chip in DPM and measure bottlenecks.

To enable the I-cache, the user needs to invalidate it before use. Example:


>>>> For the I-cache I have already done what you described. By the way, the same SPR exists for both cores, right? I mean, each core has its own instruction cache, so if I want to enable the instruction cache of each core, I have to run the same piece of code on both, right?


# invalidate and enable the instruction cache
__icache_cfg:
  e_li    r5, 0x2
  mtspr   1011, r5
  e_li    r7, 0x4
  e_li    r8, 0x2
  e_lwi   r11, 0xFFFFFFFB
__icache_inv:
  mfspr   r9, 1011
  and.    r10, r7, r9
  e_beq   __icache_no_abort
  and.    r10, r11, r9
  mtspr   1011, r10
  e_b     __icache_cfg
__icache_no_abort:
  and.    r10, r8, r9
  e_bne   __icache_inv
  mfspr   r5, 1011
  e_ori   r5, r5, 0x0001
  se_isync
  msync
  mtspr   1011, r5
