How to access External memory using WEIM Burst Mode?

DeepakKukreja
Contributor III

Hi,

Current design

  • My prototype has an iMX51 processor interfaced to an FPGA.
  • For now, the two are connected through the External Memory Interface (WEIM).
  • WEIM is configured as a 16-bit multiplexed, asynchronous memory bus.
  • The OS is Windows Embedded Compact 7.

I have an application and a driver that read and write 16-bit words to the FPGA. The implementation is simple: I write to addresses within the WEIM CS0 region that have a specific meaning for both the iMX and the FPGA.
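The word-by-word access described above can be sketched roughly as follows. This is a minimal illustration, not the driver from this thread: `fpga_write_words` and `fpga_window` are hypothetical names, and the pointer is assumed to be already mapped onto the WEIM CS0 address range by a platform-specific call that is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Write `count` 16-bit words to consecutive addresses of a memory-mapped window.
 * ASSUMPTION: `fpga_window` already points at the WEIM CS0 region (hypothetical
 * name; the OS-specific mapping step is deliberately left out). */
void fpga_write_words(volatile uint16_t *fpga_window,
                      const uint16_t *src, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        /* volatile forces one 16-bit store per word, in program order */
        fpga_window[i] = src[i];
    }
}
```

The `volatile` qualifier keeps the compiler from merging or dropping the stores, so each iteration issues its own 16-bit write.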

Everything works fine except the throughput. I looked through a few previous discussions, which attribute the low throughput to bus latency within the SoC.

It has also been suggested to me that Synchronous Burst mode would yield better throughput.

But I have no idea how to use Synchronous Burst Mode from a software programmer's perspective. I can work out the CS0 configuration registers; that part is known to me.

What I do not know is how to write C code that triggers synchronous burst accesses.

Any clues, guidance, suggestions, pointers are very welcome.

Thanks in advance.

PS: (I know the iMX51 is an old device to use in new designs, but we already have working boards with it, and once this works in concept we will upgrade to iMX7-based boards.)

5 Replies

DeepakKukreja
Contributor III

Just for the sake of anyone trying to do this in the future.

The bus throughput increased significantly when I switched to the memcpy() C library function (I am using Windows CE 7).

Here are the details of my setup:

  • Win CE 7
  • iMX51
  • Asynchronous mode on WEIM CS0
  • WWSC set to 32 WEIM Clock cycles.

  1. When I was using a for loop to send out 16 bit words on the WEIM Bus:
    Async_for_loop.png
    When I captured this waveform, I was sending out 4 words (16-bit, DSZ).
    The transfer takes around 2 microseconds, a throughput of 32 Mbps.
    There is a large dead time of around 180 nanoseconds between two successive writes. This dead time was a pain because I could not find any setting in the WEIM configuration registers that changes it. I tried reducing WWSC from 32 to 8; this reduced the CS assertion time, but the CS dead time stayed at 180 nanoseconds.

    Because of this I was considering using assembly STM instructions to increase the throughput. In the meantime I came across a thread on the Windows Embedded Compact forum, Alternate-of-memcpy, in which Bruce Eitman advises that memcpy is already implemented to copy data in the fastest possible way. I guessed that memcpy would already be doing what I was planning to do in assembly, so I changed my implementation to use memcpy instead of the for() loop. The results have been very positive.

  2. WEIM bus timings after changing the implementation to use memcpy()
    Async_memcpy_4_words_wwsc_32.png
    The 4-word transfer takes 1 microsecond, so the throughput is about 64 Mbps (doubled, but still not good enough).

    On changing WWSC to 16 WEIM clock cycles (from 32), the throughput increased further:
    Async_memcpy_4_words_wwsc_16.png
    32 words (16-bit) transferred in 4.9 microseconds, a throughput of about 104 Mbps.
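The throughput figures quoted in this post follow from dividing the number of bits transferred by the transfer time. A small sketch of that arithmetic, using a hypothetical helper name `weim_throughput_mbps` (not part of any library):

```c
#include <stdint.h>

/* Throughput in Mbps for `words` transfers of `bits_per_word` bits each,
 * completed in `micros` microseconds.
 * Bits per microsecond is numerically equal to megabits per second. */
double weim_throughput_mbps(uint32_t words, uint32_t bits_per_word, double micros)
{
    return ((double)words * (double)bits_per_word) / micros;
}
```

With the measurements above, 4 words of 16 bits in 2 microseconds gives 32 Mbps, 4 words in 1 microsecond gives 64 Mbps, and 32 words in 4.9 microseconds gives roughly 104 Mbps.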

The throughput can be increased further by reducing the WWSC value. 

We are working on Burst mode writes but are not done yet. Based on Yuri's and Igor's inputs, we expect a much larger increase in throughput from them. We are trying to achieve a throughput of around 350 to 400 Mbps.

In conclusion: on Windows CE 7, memcpy can help increase your throughput on the WEIM bus, so it may be worth trying before going into assembly.
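For comparison with the per-word for loop, the memcpy() variant can be sketched as below. The names `fpga_write_block` and `fpga_window` are hypothetical, and the destination is again assumed to be a pointer already mapped onto WEIM CS0. How memcpy() groups its stores depends on the C library implementation, so the bus behaviour should be checked on a logic analyzer rather than taken from this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `count` 16-bit words to the mapped window in a single memcpy() call.
 * ASSUMPTION: `fpga_window` points at the WEIM CS0 region (hypothetical name).
 * The access width and grouping used by memcpy() are implementation-defined. */
void fpga_write_block(void *fpga_window, const uint16_t *src, size_t count)
{
    memcpy(fpga_window, src, count * sizeof src[0]);
}
```

Unlike a loop over a `volatile` pointer, this leaves the choice of store instructions to the library, which is what allowed the faster transfers reported above.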

igorpadykov
NXP Employee

Hi Deepak

please look at example provided on

https://community.nxp.com/message/426437?commentID=426437#comment-426437 

Best regards
igor

DeepakKukreja
Contributor III

Hi Igor, 

Thanks for directing us to the relevant discussion.

Assuming that I have configured the WEIM CS0 registers properly for Burst Mode access, can you please help with the following questions?

  1. Is it correct to conclude that I will have to use inline assembly in my code, specifically the LDM and STM instructions, to initiate the burst transfers?
  2. Is there some way to generate these burst transactions on the WEIM bus from C?
  3. One way to do write transfers to the external FPGA is to load the data from the processor's memory (cache or DDR) into ARM registers and then store it to the WEIM CS0 addresses.
    Is there some way to have the SDMA do these transfers instead, freeing up the processor?

Thanks


igorpadykov
NXP Employee

Hi Deepak

Yes, to generate bursts, either SDMA or ARM code using the LDM and STM instructions should be used.

NEON may also be used:

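; void transfer_eight_words_vld(int *dst, const int *src)
; r0 = dst, r1 = src (first two arguments under the ARM calling convention)
                    vld1.64 {d0,d1,d2,d3}, [r1]    ; load 32 bytes (eight 32-bit words) from src
                    vst1.64 {d0,d1,d2,d3}, [r0]    ; store the same 32 bytes to dst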
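For readers who would rather stay in C, a rough equivalent of the NEON snippet can be written with the intrinsics from `<arm_neon.h>`. This is only a sketch under assumptions: the function name `transfer_eight_words` is hypothetical, whether the Windows CE toolchain ships this header needs to be verified, and the compiler may emit two separate 16-byte loads and stores rather than the single four-register `vld1.64`/`vst1.64` pair shown above. A memcpy() fallback is included so the code also builds on targets without NEON.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Copy eight 32-bit words (32 bytes) from src to dst.
 * ASSUMPTION: on the real target, dst would point into the mapped WEIM CS0 window. */
void transfer_eight_words(uint32_t *dst, const uint32_t *src)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint32x4_t lo = vld1q_u32(src);       /* words 0..3 */
    uint32x4_t hi = vld1q_u32(src + 4);   /* words 4..7 */
    vst1q_u32(dst, lo);
    vst1q_u32(dst + 4, hi);
#else
    /* Fallback for builds without NEON support */
    memcpy(dst, src, 8 * sizeof *src);
#endif
}
```

If the exact instruction sequence matters for burst generation, the hand-written assembly above remains the safer choice, since it does not depend on compiler code generation.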

Best regards
igor

DeepakKukreja
Contributor III

Thanks a lot, Igor, you have been a great help. Regards
