Hi! I am testing our i.MX51-based device, which has an FPGA connected to the EIM CS1 signal. We need to measure the maximum read and write performance to and from the FPGA.
Right now we are testing reads from the FPGA. Writes will come next.
Our FPGA bus is 16 bits wide and multiplexed. To try synchronous burst mode on reads, I use U-Boot code and a standalone test application.
My code is very simple: I just try to load eight 32-bit ARM registers using one instruction, ldm _address,{8 registers}.
This works, and with an EIM clock divider of 2 or 3 (the WEIM clock is about 100 MHz) we can correctly read data from the FPGA. But we see that
this read consists of 4 bursts, each with the sequence {address, pulse, pulse, pulse, pulse}. So every burst reads four 16-bit words from our 16-bit bus, then there is a pause of approximately 5-6 clocks, then another burst of four 16-bit words...
So where I hoped to see one long 16-pulse burst, I see four short four-word bursts with quite long pauses.
Question: is it possible to get a 16-pulse burst for such an instruction in our FPGA bus configuration?
We have played with the page size, burst length, etc. fields in the chip select configuration registers, but cannot make the bursts longer.
I'm quite a novice in this tech, so it looks like I'm in some trap... Help please.
This is the code that reads the data from the FPGA and dumps it:
##################
asm volatile(
"push {r0,r1,r4-r11};" //save regs
//
"ldr r0,=0xB8000000;" //load FPGA start address (CS1) - source
"ldr r1,=DataArray;" //load destination address, my static data array
//
"ldm r0, {r4-r11};" //load regs from FPGA -HERE WE HAVE 4 BURSTS, 4*16 bit words each
"stm r1, {r4-r11};" //store regs to mem
"pop {r0,r1,r4-r11};" //restore regs
);
dumpArray(DataArray,16); // here we are dumping array from memory to console
#################
Alex.
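To put numbers on the pattern described above, here is a minimal sketch of a clock count for the observed read (four 4-beat bursts separated by 5-6 idle clocks) next to the hoped-for single 16-beat burst. The function name read_clocks and the model itself are illustrative assumptions, not taken from the i.MX51 documentation; the model ignores wait states and chip-select setup/hold timing.

```c
#include <assert.h>

/* Hypothetical helper, simplified model only.
 * A read is made of `bursts` bursts. Each burst costs one address
 * phase plus `beats` data clocks, and consecutive bursts are
 * separated by `gap` idle clocks. Wait states and chip-select
 * setup/hold times are NOT modeled. */
unsigned read_clocks(unsigned bursts, unsigned beats, unsigned gap)
{
    assert(bursts > 0);
    return bursts * (1u + beats) + (bursts - 1u) * gap;
}
```

Under this model, read_clocks(4, 4, 5) gives 35 clocks and read_clocks(4, 4, 6) gives 38, while a single burst, read_clocks(1, 16, 0), would need 17. So the split into four bursts roughly doubles the time spent on the bus for the same 16 data words.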
As for the LDM / STM instructions: according to section 16.2.9 (Load multiple and store
multiple instructions) of the Cortex-A8 Technical Reference Manual:
"The processor can load or store two 32-bit registers in each cycle."
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344h/ch16s02s09.html
Other possible ways to get (back-to-back) burst accesses:
- enable the ARM data cache (this feature has some specifics);
- NEON instructions;
- DMA via the i.MX51 SDMA.
Have a great day,
Yuri
1. The data cache was off, but turning it on does not help.
2. DMA is the last resort, of course.
Thanks for mentioning that the Cortex-A8 loads 2 registers per clock... maybe this is the reason.
It is as if the core tells the EIM "give me 2 registers!" and the EIM makes a 4-beat burst...
Then it seems we cannot achieve the expected 16-beat burst with such an instruction...?
Hmm.
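The arithmetic behind this guess can be sketched as below. It assumes, as hypothesized in this thread and not confirmed by the documentation, that the core asks the EIM for two 32-bit registers (8 bytes) per request. The helper name beats_for is made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper: number of data beats needed to move `bytes`
 * over a bus that is `bus_bits` wide, rounded up. */
unsigned beats_for(unsigned bytes, unsigned bus_bits)
{
    /* Only whole-byte bus widths (8, 16, 32, ...) are handled. */
    assert(bus_bits >= 8 && bus_bits % 8 == 0);
    unsigned bus_bytes = bus_bits / 8;
    return (bytes + bus_bytes - 1) / bus_bytes;
}
```

With a 16-bit bus, beats_for(8, 16) is 4, which matches the 4-beat bursts seen on the bus, and eight registers would then need four such requests. A single 32-byte request, beats_for(32, 16), would be the 16-beat burst that was expected.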
Hi Alex,
In short, the memory address should be aligned to the burst size. That is,
for an 8-beat burst:
the address starts at 0xxxx00, 0xxxx10, 0xxxx20, 0xxxx50, etc., according to the PSZ setting.
Best regards
chip
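The alignment rule above can be checked with a few lines of C. This is a minimal sketch that assumes byte addressing and a burst length in bytes equal to the number of beats times the bus width (8 beats on a 16-bit bus = 16 bytes = 0x10). The name burst_aligned is hypothetical, and the exact meaning of the PSZ field should be taken from the reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `addr` lies on the boundary of a
 * burst of `beats` beats on a `bus_bits`-wide bus, 0 otherwise.
 * Assumes the burst length in bytes is a power of two. */
int burst_aligned(uint32_t addr, unsigned beats, unsigned bus_bits)
{
    uint32_t burst_bytes = beats * (bus_bits / 8);
    /* Reject zero and non-power-of-two burst lengths. */
    assert(burst_bytes != 0 && (burst_bytes & (burst_bytes - 1)) == 0);
    return (addr & (burst_bytes - 1)) == 0;
}
```

For example, burst_aligned(0xB8000010, 8, 16) returns 1, while burst_aligned(0xB8000008, 8, 16) returns 0.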
Hi, Igor!
But the address here (from my code):
"ldr r0,=0xB8000000;" //load FPGA start address - source
is aligned to any reasonable burst size! It is the start address of chip select 1 and of our FPGA.
I also set the page size and burst size to 32 (we tried 16 as well).
With my version of the code I expect to see a single {one address, 16 beats} burst, matching the 8 destination 32-bit registers and our 16-bit FPGA bus.