Not getting the Max DMA Transfer Rate in Vybrid processor


1,364 views
sathishkumarsig
Contributor II

I have a Faraday EVB, on which I am trying to measure the eDMA peak transfer rate (MB/s).

I ran the test by transferring 256 KB of data (i.e. Minor Loop Count = 262144, Major Loop Count = 1) from one location to another.

DMA Channel used = 0th Channel

SADDR(source address) = 0x3f000000(IRAM)

DADDR(destination address) = 0x3f040000(IRAM)

SOFF = 32

DOFF =32

MinorLoopCount = 262144

MajorLoopCount =1

LastSourceAdjustment =0;

LastDestinationAdjustment=0;

MinorLoopOffsetDest =0;

MinorLoopOffsetSrc=0;

MinorLoopOffset=0;

SSIZE =5

DSIZE =5

BWC = 3

SYSCLK is sourced from PLL1 PFD3 at 396 MHz, with ARM_DIV = 1 and BUS_DIV = 3.

According to Table 22-1636 of the reference manual "Vybrid_Reference_Manual_F_Series_-_Rev_3.pdf", an SRAM-to-SRAM transfer at 133.3 MHz should reach 533.3 MB/s. However, I am getting only around 150 MB/s.

Can anyone help identify what went wrong? Thanks in advance.
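
To put the gap in numbers, here is a small Python sketch that compares the quoted and measured rates for the 256 KB transfer. It assumes 1 MB = 10^6 bytes; the constant and function names are illustrative and not taken from any NXP tool.

```python
# Figures quoted in the post. MB is taken as 10**6 bytes (an assumption).
TRANSFER_BYTES = 262144   # 256 KB moved in a single minor loop
EXPECTED_MB_S = 533.3     # SRAM-to-SRAM figure quoted from the reference manual
MEASURED_MB_S = 150.0     # approximate rate observed on the board


def transfer_time_us(n_bytes, rate_mb_s):
    """Microseconds needed to move n_bytes at rate_mb_s.

    1 MB/s equals 1 byte per microsecond, so the division gives us directly.
    """
    return n_bytes / rate_mb_s


expected_us = transfer_time_us(TRANSFER_BYTES, EXPECTED_MB_S)
measured_us = transfer_time_us(TRANSFER_BYTES, MEASURED_MB_S)
efficiency = MEASURED_MB_S / EXPECTED_MB_S

print(f"expected: {expected_us:.0f} us, measured: {measured_us:.0f} us")
print(f"fraction of quoted peak: {efficiency:.0%}")
```

Under these assumptions the transfer takes roughly 3.5 times longer than the quoted peak would suggest.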

5 replies

970 views
ioseph_martinez
NXP Employee

Hello Satishkumar,

The reason you can't reach the theoretical performance is that the system adds some latency to each memory access. So 533 MB/s (64-bit at 133 MHz) would only be possible on a system with no latency. If the DMA used AXI, it would perform better thanks to the ability to post outstanding transactions. But since the DMA uses AHB to meet real-time/predictability requirements, it waits until a transaction is done before posting any other read or write.

A DMA transfer consists of two transactions, a read and a write, and each is subject to some latency. There is latency from passing through the NIC and latency from the slave peripheral itself (in this case, internal RAM). Generally speaking, it would be:

Latency: Master NIC lat + Slave NIC lat + Buffering NIC lat + Slave lat

NIC eDMA max latencies are: read = 2, write = 4

NIC SRAM max Latencies are: read = 4, write = 4

There is some additional latency due to buffering, because of the AHB-to-AXI conversion.

Slave latency for SRAM is zero.
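
As a rough illustration of how these latencies bound throughput, the sketch below adds up the quoted maximum NIC latencies for one read plus one write and converts that into an estimated rate per access size. It assumes the latencies simply add, ignores the unquantified buffering latency, uses the 133.3 MHz bus clock from the original post, and takes 1 MB = 10^6 bytes. All names are illustrative.

```python
BUS_HZ = 133.3e6  # bus clock from the original post

# Maximum NIC latencies quoted above, in bus clock cycles.
EDMA_NIC = {"read": 2, "write": 4}
SRAM_NIC = {"read": 4, "write": 4}
SLAVE_LATENCY = 0  # SRAM slave latency, as stated above


def cycles_per_read_write_pair():
    """Cycles for one read plus one write, assuming the latencies just add.

    Buffering latency is not quantified in the post, so it is left out.
    """
    nic = sum(EDMA_NIC.values()) + sum(SRAM_NIC.values())
    return nic + 2 * SLAVE_LATENCY


def estimated_mb_s(bytes_per_access):
    """Estimated rate in MB/s (10**6 bytes/s) for a given access size."""
    return bytes_per_access * BUS_HZ / cycles_per_read_write_pair() / 1e6


for size in (1, 2, 4, 8):
    print(f"{size} byte(s) per access: ~{estimated_mb_s(size):.1f} MB/s")
```

With these assumptions a 64-bit access comes out at about 76 MB/s, far below the zero-latency figure.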

And these are the measurements:

| MajorLoop | MinorLoop | SOFF/DOFF | SSIZE/DSIZE | BWC | Time (us) | MB/s | R/W transactions | Clk ticks @ 133 MHz | Clks/trans |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 16384 (16 KB) | 32 | 5 (32 bytes) | 0 | 101 | 162 | 512 | 13436 | 26.2 |
| 1 | 16384 (16 KB) | 8 | 3 (64-bit) | 0 | 218 | 75 | 2048 | 28784 | 14 |
| 1 | 16384 (16 KB) | 4 | 2 (32-bit) | 0 | 435 | 37 | 4096 | 57448 | 14 |
| 1 | 16384 (16 KB) | 2 | 1 (16-bit) | 0 | 869 | 19 | 8192 | 114800 | 14 |
| 1 | 16384 (16 KB) | 1 | 0 (8-bit) | 0 | 1738 | 9.4 | 16384 | 229480 | 14 |

If you look at the measurements, they are in line with the latencies: each read/write transaction takes 14 cycles, which roughly matches the read and write latencies from the NIC, (2 + 4) + (4 + 4).

All transactions required the same number of cycles regardless of the SSIZE/DSIZE value (except for 32 bytes).

I believe that in the 32-byte case we pay a penalty on the buffering, and that is why we get additional cycles per read/write.

The other possibility is that the NIC latency is a bit lower, but buffering latency is added to the transaction anyway because of the AHB-to-AXI translation.
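
For anyone who wants to re-check the derived columns, the sketch below recomputes the transaction count, the rate, and the clocks per transaction from the measured values. The times and tick counts are copied from the measurements above; 1 MB = 10^6 bytes is an assumption, and the names are illustrative.

```python
TOTAL_BYTES = 16384  # minor loop size used for every measurement

# (bytes per access, time in us, clock ticks @ 133 MHz), copied from the table
ROWS = [
    (32, 101, 13436),
    (8, 218, 28784),
    (4, 435, 57448),
    (2, 869, 114800),
    (1, 1738, 229480),
]


def derive(bytes_per_access, time_us, clk_ticks):
    """Return (read/write transactions, MB/s, clocks per transaction)."""
    transactions = TOTAL_BYTES // bytes_per_access
    mb_per_s = TOTAL_BYTES / time_us  # bytes per microsecond equals MB/s
    clks_per_trans = clk_ticks / transactions
    return transactions, mb_per_s, clks_per_trans


RESULTS = [derive(*row) for row in ROWS]

for (size, _, _), (count, rate, clks) in zip(ROWS, RESULTS):
    print(f"{size:>2} B/access: {count:>5} transactions, "
          f"{rate:6.1f} MB/s, {clks:4.1f} clks/trans")
```

The recomputed rates can differ from the posted ones by a fraction of a MB/s, presumably because of rounding in the original figures.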

The conclusion is that it is not possible to achieve 533 MB/s with the eDMA, and the results make sense given the expected latency of such a system. If you need to increase performance, you can use another DMA (eDMA1, for example) in parallel, or you can use the CPU or GPU, which have far more efficient accesses thanks to the AXI protocol.

970 views
alejandrolozan1
NXP Employee

Hi,

Would you be kind enough to share the entire project?

I will be glad to check it out and see what I can do.

Best Regards,

Alejandro


970 views
karina_valencia
NXP Apps Support

Hi Sigamani,

Can you provide the information requested previously?


970 views
sathishkumarsig
Contributor II

Hi, I am communicating with Ioseph regarding this. Thanks for your replies.


970 views
ioseph_martinez
NXP Employee

Hi, I am just trying to confirm the calculated latency numbers with the design team. They were out for a couple of weeks. Hopefully I will get an answer this week.
