iMXRT1060 and HyperRAM transaction length

g_volokh · ‎04-10-2020

Hi All,

We are working with iMXRT1060 and connecting HyperRAM to the FlexSPI. Everything works properly.

We are using HyperRAM access via AHB.

As I understand it, a HyperRAM transaction must not be too long, because the HyperRAM has to perform its internal refresh between transactions (or at the beginning of one). According to the HyperRAM datasheet, a transaction should not last longer than 4 us. At 166 MHz, that means the transaction length should be about 1 KB at most.
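As a rough check of that figure, here is a minimal sketch of the arithmetic. It assumes an 8-bit DDR HyperBus, so two bytes move per clock. The function name `max_burst_bytes` and the `overhead_clks` parameter are illustrative only. The overhead (command/address phase plus initial latency) depends on how the device is configured, so it is passed in rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the payload bytes that fit in one chip-select window.
 * Assumption: 8-bit DDR bus, i.e. 2 bytes transferred per clock.
 * overhead_clks covers command/address and initial latency clocks. */
static uint32_t max_burst_bytes(uint32_t clk_hz, uint32_t window_ns,
                                uint32_t overhead_clks)
{
    assert(clk_hz > 0U);

    /* Whole clocks available inside the window (rounded down). */
    uint64_t total_clks = ((uint64_t)clk_hz * window_ns) / 1000000000ULL;

    if (total_clks <= overhead_clks) {
        return 0U; /* window too short to carry any data */
    }
    return (uint32_t)((total_clks - overhead_clks) * 2U);
}
```

With no overhead, `max_burst_bytes(166000000U, 4000U, 0U)` gives 1328 bytes, so once the command and latency clocks are subtracted, "about 1 KB" is a reasonable working limit.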

The question: can we vary the transaction length for AHB accesses? If so, how?

The idea is to make each transaction as long as possible (to increase speed), but no longer than 4 us.

Thanks in advance.

Regards,

George Volokh.

melissa_hunter · ‎04-28-2020

Hi George,

I think I found the problem. Your AHB RX buffer allocation doesn't leave any buffer for the DMA. You're assigning everything to the core instead. The DMA uses MSTRID 0x1 on this device, so it needs its own allocation in the AHB buffer (alternatively, you can leave the earlier buffers unallocated and use buffer 3 as a single buffer shared between all the masters).

Here's the setup in the configuration descriptor in order to split the AHB Rx buffer between the core and DMA:

config.ahbConfig.buffer[0].enablePrefetch = true;
config.ahbConfig.buffer[0].masterIndex = 0;
config.ahbConfig.buffer[0].bufferSize = 0x200;
config.ahbConfig.buffer[1].enablePrefetch = true;
config.ahbConfig.buffer[1].masterIndex = 1;
config.ahbConfig.buffer[1].bufferSize = 0x200;

Give this a try, and it should allow the DMA to run correctly.

Regards,

Melissa

g_volokh · ‎04-29-2020

Dear Melissa,

Yes, you are absolutely correct. Everything works now.

First of all, thank you very much.

You are the only person in the NXP support team who can really help.

All the others can only say "read the documentation" and everything will be okay. We (they and I) only waste time without any success.

Sorry, that's just emotion.

Regarding the AHB RX buffer (the AHB RX Buffer Control register, AHBRXBUFxCR0): I really didn't understand what the MSTRID field in this register is (and there is no detailed information about it in the RM).

Now I guess that each bus master works with its corresponding buffer, doesn't it?

If yes, what does the MSTRID field mean?

I guess that:

- MSTRID=0 - MCU core,
- MSTRID=1 - DMA,
- MSTRID=2 - DCP,
- MSTRID=3 - others.

Am I correct?

The idea behind increasing the buffer size for a particular master is to increase the read speed.

For example, for DMA reads to OCRAM I get these benchmarks:

- AHB buffer = 256 bytes, read speed = ~215 MB/s,
- AHB buffer = 1024 bytes, read speed = ~265 MB/s.

The speed gain is about 23%.

Please clarify the MSTRID field, and I believe the question will be closed.

But we are also working on increasing the write speed and have run into some strange behavior.

I am sure that I am doing something wrong, but I can't understand where.

Can I ask you this question too?

And a few words about HyperRAM.

I see that for iMXRT MCUs NXP puts the emphasis on SDRAM as external RAM. In my opinion, that is an outdated view. SDRAM runs at 166 MHz, needs more than 35 interface pins, and requires a fairly large external RAM chip. The potential transfer rate may be around 300 MB/s.

By contrast, modern HyperRAM has a 200 MHz clock rate, needs only 12 pins, offers a potential transfer rate of up to 400 MB/s, and comes as a fairly small external RAM chip.

I believe it's a very good solution for embedded designs.

But... in today's iMXRT implementation the write speed is not good enough. The MCU needs a very small addition: a bigger write buffer for the AHB bus, and that's all.

We are now working hard on a new device of ours that is very powerful and small.

The iMXRT family looks very powerful, but some strange implementation choices do not allow us to fully use all the features of the processor.

Again, thanks a lot for your support.

Best regards,

George Volokh.

melissa_hunter · ‎04-29-2020

Hi George,

You've got the MSTRIDs right. For the core, DMA, and DCP you can specifically assign a portion of the prefetch buffer for that master. So you can split the prefetch buffer up into up to 4 chunks. Another option is to set the size to 0 for buffers 0-2. Buffer 3 ends up being used for any masters that were not explicitly assigned a buffer, and the buffer 3 size is any of the prefetch buffer that is left. So if you set the size to 0 for the other buffers, then buffer 3 gets used for any master (MSTRID value is ignored) and the size will also be the full 1KB (the size setting for buffer 3 is also ignored).
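The allocation rules described above can be sketched as a small host-side model. This is only an illustration under stated assumptions: `ahb_rx_buf_cfg_t` is a hypothetical stand-in that models just the three fields used earlier in this thread, the helper names are made up, and the 1 KB total is the figure quoted above. It is not SDK code and does not touch any registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AHB_RX_BUF_TOTAL_BYTES 1024U /* "the full 1KB" quoted in the thread */
#define AHB_RX_BUF_COUNT       4U

/* Hypothetical stand-in for the per-buffer settings; not the SDK type. */
typedef struct {
    bool     enablePrefetch;
    uint8_t  masterIndex;
    uint16_t bufferSize; /* bytes */
} ahb_rx_buf_cfg_t;

/* Buffer 3 receives whatever buffers 0-2 leave unallocated. */
static uint32_t buffer3_bytes(const ahb_rx_buf_cfg_t buf[AHB_RX_BUF_COUNT])
{
    uint32_t used = 0U;
    for (uint32_t i = 0U; i < AHB_RX_BUF_COUNT - 1U; i++) {
        used += buf[i].bufferSize;
    }
    assert(used <= AHB_RX_BUF_TOTAL_BYTES);
    return AHB_RX_BUF_TOTAL_BYTES - used;
}

/* A master uses an explicitly assigned buffer (non-zero size, matching
 * MSTRID) if one exists; every other master falls through to buffer 3. */
static uint32_t buffer_for_master(const ahb_rx_buf_cfg_t buf[AHB_RX_BUF_COUNT],
                                  uint8_t mstrid)
{
    for (uint32_t i = 0U; i < AHB_RX_BUF_COUNT - 1U; i++) {
        if (buf[i].bufferSize != 0U && buf[i].masterIndex == mstrid) {
            return i;
        }
    }
    return AHB_RX_BUF_COUNT - 1U;
}
```

With the 0x200/0x200 split between masters 0 and 1, this model leaves nothing for buffer 3. With all three sizes set to 0, every master lands in buffer 3 with the full 1 KB.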

As you've already seen there is a performance boost associated with having a larger prefetch buffer (if you are doing mostly sequential accesses). If the core and DMA are mostly taking turns accessing the HyperRAM, then using buffer3 for both might be the best option. If the core and DMA are going to be accessing HyperRAM at the same time (they have to take turns so this would create interleaved accesses), then assigning specific amounts of the prefetch buffer to each might give you the best overall system performance even if the performance for each master is not maximized.

Yes, you can ask me your question about the write speed. You can add it to this thread. If you want to start it as a new topic, then that is an option too. I won't see it automatically in that case, but you can directly email me a link to the new post.

We haven't seen many customers using HyperRAM with RT1xxx devices...yet. On several of the devices there is only one FlexSPI instantiation, which typically gets used for boot, so on a lot of parts there isn't usually a FlexSPI available for HyperRAM. The SDRAM has been pretty popular, although over time I do think more customers who need external RAM might go to HyperRAM for the reasons you stated. We can't do anything about increasing the write buffer size for the devices we already have, but that can be considered as a feature for future devices.

Regards,

Melissa

melissa_hunter · ‎04-27-2020

Hi George,

I set up a quick test here. I don't have HyperRAM, but I'm using an RT1050EVK, which is set up to use HyperFlash by default. So it is at least the same HyperBus protocol, and as the problem now is with reads, I can do those using the AHB interface with the flash device. I am using FlexSPI1 instead of FlexSPI2 (there is only one instantiation on the RT1050).

So there are differences, but my setup should be pretty close to yours. In my quick test, I've been able to read from the HyperFlash using the DMA without errors. I'm going to compare register settings looking for any other differences that I can adjust to get closer to what you are doing.

Regards,

Melissa

g_volokh · ‎04-20-2020

Dear Mike,

As I wrote, I tried switching the cache off entirely before making any transactions, using the function

SCB_DisableDCache();

The result is exactly the same: the read data is incorrect.

Moreover, I don't use the DMA. Only the MCU core is accessing the memory.

The only difference is in the source code.

This doesn't work:

memcpy(fptr, buffer, txWatermark << 3);

This works fine:

for (i = 0U; i < (txWatermark << 1); i++)
{
    fptr[i] = *buffer++; // New code, fptr=0x70000000 for FlexSPI2
    // base->TFDR[i] = *buffer++; // Old code
}

What's the reason?
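The thread does not settle why `memcpy` fails here. One difference worth noting is that `memcpy` may use whatever access widths and ordering the library implementation chooses, whereas the loop issues exactly one 32-bit store per word in ascending order. A minimal sketch that makes that access pattern explicit is below. The name `copy_words32` is illustrative, and whether access width is the actual cause is an assumption, not something confirmed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy n_words 32-bit words using one aligned 32-bit store per word.
 * The volatile qualifier on dst stops the compiler from merging,
 * splitting, or replacing the stores with a library memcpy call. */
static void copy_words32(volatile uint32_t *dst, const uint32_t *src,
                         size_t n_words)
{
    /* Both pointers are expected to be 4-byte aligned. */
    assert(((uintptr_t)dst & 3U) == 0U);
    assert(((uintptr_t)src & 3U) == 0U);

    for (size_t i = 0U; i < n_words; i++) {
        dst[i] = src[i];
    }
}
```

Because the length is given in words, the equivalent of the byte count `txWatermark << 3` would be a word count of `txWatermark << 1`, matching the loop bound in the working code.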

Regards,

George Volokh.
