Re: EIM Bus Performance Limit due to Internal Bus Latency


aivchenko
Contributor II

I see a very similar issue on the i.MX6SoloX. The EIM bus is configured for 32-bit non-multiplexed asynchronous mode (ARD = SRD = 0).

When I program the registers according to the manual, I get around 160 ns between successive assertions of the /CS line.

That is, I set RCSA/RCSN, RADVA/RADVN and OEA/OEN to zero, along with the functionally identical fields for the write cycle. With RWSC/WWSC set to 10 I can see a correct read/write cycle. CSREC (CS cooldown) is also zero.
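For illustration, the read-side fields named above could be packed into one configuration word roughly as follows. This is only a sketch: the helper name `eim_pack_rcr1`, the struct, and the bit positions are assumptions based on the usual i.MX6 EIM_CSnRCR1 layout and should be verified against the EIM chapter of the reference manual.

```c
#include <stdint.h>

/* Read-timing fields for one EIM chip select (hypothetical container). */
struct eim_read_timing {
    uint8_t rwsc;   /* read wait-state control */
    uint8_t radva;  /* ADV assertion */
    uint8_t radvn;  /* ADV negation */
    uint8_t oea;    /* OE assertion */
    uint8_t oen;    /* OE negation */
    uint8_t rcsa;   /* read CS assertion */
    uint8_t rcsn;   /* read CS negation */
};

/*
 * ASSUMED bit positions (check the reference manual):
 * RWSC[29:24], RADVA[22:20], RADVN[18:16], OEA[14:12],
 * OEN[10:8], RCSA[6:4], RCSN[2:0].
 */
uint32_t eim_pack_rcr1(const struct eim_read_timing *t)
{
    return ((uint32_t)(t->rwsc  & 0x3Fu) << 24) |
           ((uint32_t)(t->radva & 0x07u) << 20) |
           ((uint32_t)(t->radvn & 0x07u) << 16) |
           ((uint32_t)(t->oea   & 0x07u) << 12) |
           ((uint32_t)(t->oen   & 0x07u) << 8)  |
           ((uint32_t)(t->rcsa  & 0x07u) << 4)  |
           ((uint32_t)(t->rcsn  & 0x07u));
}
```

Under these assumptions, RWSC = 10 with every other field at zero gives the word 0x0A000000.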

Under U-Boot, the delay between successive reads or read/write pairs is ~160 ns.

Under Linux (i.e. with the MMU enabled) it grows to 250 ns or so.

Each read or write cycle is simply a set of consecutive LDR instructions from a bus address to a register, so I can exclude cache/memory access from the timing.

I also tried changing the aforementioned parameters to other values, with the cycle extended to match.

The only time I was able to get back-to-back CS assertions was when I programmed RWSC to be less than OEA+OEN or RCSA+RCSN, for example RWSC=5 with OEA=OEN=RCSA=RCSN=3, which obviously violates the timing diagrams provided.
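To put the measured /CS periods in perspective, a small sketch that converts a period into effective throughput for a 32-bit bus. The function name `eim_throughput_mb_s` is hypothetical, and 1 MB is taken as 10^6 bytes.

```c
#include <stdint.h>

/*
 * Effective throughput in MB/s (1 MB = 10^6 bytes) when one access of
 * `bytes_per_access` bytes completes every `period_ns` nanoseconds.
 * bytes/ns * 1e9 ns/s / 1e6 bytes/MB simplifies to bytes * 1000 / ns.
 */
double eim_throughput_mb_s(uint32_t bytes_per_access, double period_ns)
{
    return (double)bytes_per_access * 1000.0 / period_ns;
}
```

With 4 bytes per access, the 160 ns period seen under U-Boot works out to about 25 MB/s, and the 250 ns period under Linux to about 16 MB/s.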

6 Replies

aivchenko
Contributor II

I read chapter 43, and it's rather confusing. I see that, for example, EIM and QSPI are connected via the PER_S switch, and PCIe and Display are connected via the Display switch. However, I cannot map that onto Table 43-6 to find which parameters I should try to change. The names there do not match the names in the other tables.

Also please look at Figure 43-1 in NIC-301 Bus System.

1. Shouldn't there be an A9 (dark blue) box? Is it connected directly to the Main switch, or via an A9 switch as the M4 is?

2. Why are two ENET peripherals connected to the Main switch?

3. Are the M4 and Wakeup switches part of the Main switch, or are they separate (i.e. connected via arrows)?

Could you please provide example code that changes the QoS and tidemark parameters for the NIC-301?

I only need to change QoS between QSPI and WEIM, and between Display and PCIe.

Thanks,

--Alex

aivchenko
Contributor II

OK, I understand that the core and the EIM are connected via an AMBA/AXI interface, which is the L2 memory interface and has limited capability to transfer consecutive reads in one transaction. AXI then needs to reconnect, as described in

Intro to AXI Protocol: Understanding the AXI interface - SoC Design blog - SoC Design - Arm Communit... 

So it cannot transfer more than 16 back-to-back reads from the EIM through AXI, or 2 back-to-back writes.

It then takes 16 clock cycles to re-establish the connection on the AXI bus.

Is my understanding correct, and is this the maximum we can get from the EIM bus?
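As a rough way to reason about these numbers, the fraction of time the bus actually moves data can be sketched as below. The function name `burst_efficiency` is hypothetical, and the clocks-per-beat value is an assumption, since the thread does not state how many clocks each back-to-back beat takes.

```c
/*
 * Fraction of bus time spent transferring data when `beats` back-to-back
 * beats of `cycles_per_beat` clocks each are followed by `gap_cycles`
 * idle clocks before the next burst can start.
 */
double burst_efficiency(unsigned beats, unsigned cycles_per_beat,
                        unsigned gap_cycles)
{
    unsigned busy = beats * cycles_per_beat;
    unsigned total = busy + gap_cycles;

    /* Guard against a zero-length pattern. */
    return total ? (double)busy / (double)total : 0.0;
}
```

For example, if each of the 16 reads took a single clock (an assumption) and the gap were 16 clocks, `burst_efficiency(16, 1, 16)` would give 0.5, i.e. half of the bus time lost to the gap. Longer beats reduce the relative cost of the gap.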

Thanks,

--Alex

Yuri
NXP Employee

Hello,

   Yes, your understanding is correct and, I am afraid, it is hardly possible to achieve more throughput with the EIM.

Regards,

Yuri.  

Yuri
NXP Employee

Hello,

  the following may be helpful:

EIM performance issue on iMX6 

https://community.nxp.com/docs/DOC-106467 


Have a great day,
Yuri

aivchenko
Contributor II

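I added code to access the EIM bus:

                // copy bytes
                asm (
                    "PUSH {r4-r10}\n\t"          // save r4-r10, which the block read overwrites
                    "MOV r2, #256-32\n\t"        // byte counter: the loop body runs 8 times, 32 bytes each
                "loopme:\n\t"
                    "MOV r1, #0x54000000\n\t"    // reload the source address on every pass
                    "LDMIA r1!, {r3 - r10}\n\t"  // block read of eight 32-bit words
                    "SUBS r2, r2, #32\n\t"
                    "BGE loopme\n\t"
                    "POP {r4-r10}"
                );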

As you can see, I'm using the LDMIA instruction to do a block read from the bus into R3 through R10.

In this case the timing on the bus corresponds to the correct CSREC (set to 1).

However, in the "loopme" loop I can see gaps between the blocks of bus accesses. They are 167 ns, similar to what I see when I access the EIM bus in a loop using LDR. To me it looks like the CPU renegotiates the crossbar configuration every time an access to the bus happens. (Attachment: LeCroy3.jpg)

My question is: how can I avoid these gaps? Does the SoloX have control over the internal crossbar configuration (assuming this is the source of the issue we are discussing), so that it can be held for the full cycle of reading 512 uint32s from the EIM bus? Would that affect other subsystems of the MPU?

Also, does NXP have sample code for SDMA or NEON programming on the i.MX6SoloX for use in the U-Boot environment?
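For reference, the access pattern of the LDMIA loop above can be written in portable C as a minimal sketch. The function name `eim_read_blocks` is hypothetical. Whether the compiler turns the inner loop into a single LDM or into individual LDR instructions is not guaranteed, so this is not a drop-in replacement for the assembly when bus timing matters.

```c
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_BLOCK 8u  /* matches the eight registers r3-r10 */

/*
 * Read `blocks` blocks of eight 32-bit words, restarting at `src` for
 * every block (as the assembly reloads r1 on each pass). The words are
 * summed and returned so the result of the reads is actually used.
 */
uint32_t eim_read_blocks(const volatile uint32_t *src, size_t blocks)
{
    uint32_t sum = 0;

    for (size_t b = 0; b < blocks; b++) {
        for (size_t i = 0; i < WORDS_PER_BLOCK; i++) {
            sum += src[i];  /* volatile: each read is performed */
        }
    }
    return sum;
}
```

On the target, `src` would point at the mapped EIM window (0x54000000 in the assembly above) and `blocks` would be 8 to match the loop count. On a host it can be exercised against an ordinary array.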

Thx

Yuri
NXP Employee

Hello,

  According to the ARM documentation (section 8.1.2, Supported AXI transfers), mentioned in

 i.MX6 maximum EIM burst length and performance 

the core can issue only a limited number of bytes per burst in back-to-back mode, without pauses. This applies to NEON load/store instructions too.

http://infocenter.arm.com/help/topic/com.arm.doc.100511_0401_10_en/arm_cortexa9_trm_100511_0401_10_e... 

 

  Customers can try to modify the default NIC settings (the NIC is described in Chapter 43, Network Interconnect Bus System (NIC-301), of the i.MX 6SX Reference Manual), but this is strongly discouraged.

  As for i.MX SDMA development, we do not have such tools. Sorry.

Regards,

Yuri.
