Questions about DMA controller on Kinetis L Series

jrychter · ‎11-17-2013

Hi, I have two questions about the DMA engine on the KL25 — I'm hoping someone can help:

1. I would like to perform transfers from an array containing bytes to a peripheral register (TPM1_C0V). I thought this is exactly what SSIZE and DSIZE are for, so I set SSIZE to 8 bits and DSIZE to 32 bits:

DMA_DCR0 = DMA_DCR_EINT_MASK | DMA_DCR_ERQ_MASK | DMA_DCR_CS_MASK | DMA_DCR_SSIZE(0b01) | DMA_DCR_SINC_MASK | DMA_DCR_DSIZE(0x00);

I expected to be able to access single bytes from the source array, get them 0-padded to 32 bits by the DMA engine and get 32-bit writes to the target register. But what I actually get are 32-bit reads and 32-bit writes, i.e. on the first DMA request 4 bytes are copied from source to destination instead of one. The timer then truncates the write to 16 bits.

I've read and re-read the DMA section in the reference manual dozens of times and I can't see how this could be the intended behavior. If it is the expected behavior, how do I convert from source bytes to word writes?

2. I tried to implement a DMA ping-pong scheme, which proved unexpectedly hard. There is no half-way interrupt on the L series DMA engine. I wanted to use the other DMA channels in linked mode, to reconfigure things so that channels are switched. But then I found out that DMAMUX documentation says "Setting multiple CHCFG registers with the same Source value will result in unpredictable behavior". That is exactly what I need to do: I have one peripheral issuing requests, and wanted to have two channels, only one of which would have the ERQ bit set.

Having read that, I then tried to use *two* DMA channels for DMAMUX configuration changes. So the procedure would be roughly:

* DMA0 runs, upon completion links to DMA3,

* DMA3 writes a 0 to DMAMUX channel 0 and links to DMA4,

* DMA4 enables DMAMUX channel 1 with the appropriate source.

* DMA1 runs, upon completion links to DMA3 (which has hopefully been reconfigured in the meantime).

I tried this and it works. But I am still unclear on how the DMAMUX registers need to be accessed. Do I need to separately write the source, and then the enable bit, or is just a single write enough? Similarly for switching off: is a single 0-write enough?

Also, I'm wondering about the "unpredictable behavior" mentioned above. Is that really the case when only one of the channels is enabled for peripheral requests?

More generally, is there something I'm missing? A better way to have a ping-pong DMA scheme where one buffer can be prepared while the other gets sent?
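The hand-over sequence described above can be modelled on a host machine to check the order of the two DMAMUX writes. This is only a sketch under assumptions: it treats each CHCFG register as a plain byte with ENBL in bit 7 and SOURCE in bits 5..0 (verify against the DMAMUX chapter of the reference manual), `EXAMPLE_SOURCE` is a made-up source number, and `mux_model_t`, `mux_conflict` and `mux_handover` are hypothetical names. It models the single-write variant (source and enable in one byte); whether the hardware accepts that is exactly the open question in this post, and the model does not answer it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed CHCFG layout: bit 7 = ENBL, bits 5..0 = SOURCE.
 * Check the DMAMUX chapter of the reference manual before relying on it. */
#define CHCFG_ENBL        0x80u
#define CHCFG_SOURCE_MASK 0x3Fu

/* Hypothetical request source number, for illustration only. */
#define EXAMPLE_SOURCE    32u

/* Plain bytes standing in for the CHCFG registers of the two data channels. */
typedef struct {
    uint8_t chcfg[2];
} mux_model_t;

/* True if both channels are enabled and routed to the same source,
 * which is the state the DMAMUX documentation warns about. */
static bool mux_conflict(const mux_model_t *m)
{
    bool both_enabled = (m->chcfg[0] & CHCFG_ENBL) && (m->chcfg[1] & CHCFG_ENBL);
    bool same_source = (m->chcfg[0] & CHCFG_SOURCE_MASK) ==
                       (m->chcfg[1] & CHCFG_SOURCE_MASK);
    return both_enabled && same_source;
}

/* Move the request source from channel `from` to channel `to` using the
 * two writes that the linked helper channels would perform.
 * Returns true if the conflict state was never observed. */
static bool mux_handover(mux_model_t *m, unsigned from, unsigned to)
{
    assert(from < 2 && to < 2 && from != to);

    m->chcfg[from] = 0;                 /* first helper: disable the old channel */
    bool ok = !mux_conflict(m);

    /* second helper: source and enable bit written as one byte */
    m->chcfg[to] = (uint8_t)(CHCFG_ENBL | EXAMPLE_SOURCE);
    return ok && !mux_conflict(m);
}
```

Disabling the old channel before enabling the new one means the model never passes through a state with both channels enabled on the same source. Reversing the two writes would pass through that state briefly.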

jeremyzhou · ‎11-17-2013

Hi Jan,

For your first question, I suggest you refer to LQRUG_tpm_ex2 in the FRDMKL25 demo code, which you can download through the link below.

As for your second question, I think that if you can guarantee that only one of the channels is enabled for a particular peripheral request, then it will not cause unpredictable behavior.

Best regards,

Ping

KL25_SC.exe:http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=FRDM-KL25Z&fpsp=1&tab=Design_Tools_Ta...

jrychter · ‎11-18-2013

Thank you for the example. But if I understand it correctly, it issues 16-bit to 16-bit requests (SSIZE=DSIZE=0x02) from a circular buffer of 64 16-bit values. What I was trying to do was read 8-bit values and store them to a 32-bit register (I see now that it can also take 16-bit values). Is this possible with DMA? If not, what are the SSIZE/DSIZE settings really for?

In my case the problem is that I need to load the values very quickly (no later than every 1.3µs, that's 26 cycles at 20MHz!) and cannot afford to waste memory, so I have to generate data on the fly. Every bit of input data requires a timer write, so even for 1kB of data, expanding every bit to 16 bits is not manageable. An 8-bit buffer would be OK if I could use a ping-pong scheme.
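The timing and memory figures above can be reproduced with a few lines of arithmetic. This is a rough sketch, assuming "1kB" means 1024 bytes and that each input bit becomes exactly one timer value; the helper names `cycles_per_period` and `expanded_bytes` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clock cycles that fit into one update period, rounded down.
 * period_ns is in nanoseconds, clock_hz in hertz. */
static uint32_t cycles_per_period(uint32_t period_ns, uint32_t clock_hz)
{
    return (uint32_t)(((uint64_t)period_ns * clock_hz) / 1000000000u);
}

/* Bytes of DMA buffer needed when every input bit is expanded into
 * one buffer entry of bytes_per_entry bytes. */
static size_t expanded_bytes(size_t input_bytes, size_t bytes_per_entry)
{
    assert(bytes_per_entry == 1 || bytes_per_entry == 2 || bytes_per_entry == 4);
    return input_bytes * 8u * bytes_per_entry;
}
```

With these helpers, `cycles_per_period(1300, 20000000)` gives the 26 cycles quoted above, and `expanded_bytes(1024, 2)` gives 16384 bytes for 16-bit entries against 8192 bytes for 8-bit entries.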

jeremyzhou · ‎11-19-2013

Hi Jan,

It's OK for the source and destination sizes to be different. You will know how to configure the DMA registers after you go through section 23.4.4, Advanced Data Transfer Controls: Auto-Alignment, in the KL25's datasheet.

As for your problem, what do you think about increasing the system clock frequency to reduce the time cost?

Best regards,

Ping

jrychter · ‎11-19-2013

Ping Zhou wrote:

It's OK for the source and destination sizes to be different. You will know how to configure the DMA registers after you go through section 23.4.4, Advanced Data Transfer Controls: Auto-Alignment, in the KL25's datasheet.

Quoting from section 23.4.4: "Typically, auto-alignment for DMA transfers applies for transfers of large blocks of data. As a result, it does not apply for peripheral-initiated cycle-steal transfers." Well, in my case what I do is peripheral-initiated cycle-steal transfers.

Reading from page 358: "If auto-alignment is enabled, the appropriate address register increments, regardless of DINC or SINC". In my case DINC is false, the destination is a fixed TPM peripheral register.

From what I understand about auto-alignment, it is used to optimize transfer sizes. Are you sure it is relevant in my case?

My question #1 was about doing this procedure with DMA:

1) read a byte from memory

2) zero-extend that byte to 32 bits (a word)

3) write the word to the TPM peripheral C0V register, which only accepts word writes.

I thought the combination of flags that I used was exactly for this purpose, but it seems I am wrong. Is it possible to perform the procedure above using DMA?
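For reference, the three steps above look like this when the CPU does them, for example from a timer interrupt handler. This is not a DMA configuration and does not answer whether the DMA engine can do the same; it is only a sketch of the intended semantics. `feed_one` is a hypothetical helper, and `reg` stands in for the address of the C0V register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Perform steps 1-3 for one request and return the next read index. */
static size_t feed_one(volatile uint32_t *reg, const uint8_t *src,
                       size_t len, size_t idx)
{
    assert(reg != NULL && src != NULL && idx < len);

    uint8_t  b = src[idx];      /* 1) read a byte from memory       */
    uint32_t w = (uint32_t)b;   /* 2) zero-extend it to 32 bits     */
    *reg = w;                   /* 3) one 32-bit write to the register */

    return idx + 1;
}
```

Each call consumes exactly one source byte, which is the behavior expected from SSIZE = 8 bits with DSIZE = 32 bits in the question.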

As to clock frequency, yes, of course I will increase it, but it will not help with peripheral access times and I don't know how much it helps with interrupt latency.

martynhunt · ‎11-19-2013

Hi Jan,

You are correct. Auto-Align does not apply to peripheral-initiated cycle-steal transfers. A potential solution to your problem would be 16-bit SSIZE & DSIZE. The TPM0_CnV registers will take a 16-bit write. You could pad the first byte of each buffer entry with zeros, and write the 16-bit value to the C0V register.
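A minimal sketch of preparing such a buffer, assuming the byte values are widened to 16-bit entries in RAM before the DMA channel (configured with 16-bit SSIZE and DSIZE) reads them. `expand_u8_to_u16` is a hypothetical helper, not part of any Freescale library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Widen each source byte to one zero-extended 16-bit entry.
 * Returns the number of entries written (limited by dst_len). */
static size_t expand_u8_to_u16(uint16_t *dst, size_t dst_len,
                               const uint8_t *src, size_t src_len)
{
    assert(dst != NULL && src != NULL);

    size_t n = (src_len < dst_len) ? src_len : dst_len;
    for (size_t i = 0; i < n; i++) {
        dst[i] = (uint16_t)src[i];   /* upper 8 bits become zero */
    }
    return n;
}
```

On a little-endian core such as the Cortex-M0+, the zero padding ends up in the byte at the higher address of each entry. The buffer takes twice the memory of the original byte array.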

Please let me know if this solution works for you.

Best regards,

Martyn

jrychter · ‎11-25-2013

It might work, but it is wasteful — it means I have to prepare twice as much DMA data. I'm not sure I'll have enough time to do that. And I definitely need the ping-pong scheme to work, then, as there is no way I will be able to store my entire data in RAM (with a single bit expanding into two bytes).

I'm still curious: what is SSIZE really for? If it's about alignment only, shouldn't it be called SALIGN?

Alice_Yang · ‎09-29-2014

Hello,

I think that although SSIZE and DSIZE are the basis for auto-alignment, they still really specify the data size of the source and destination bus cycles for the DMA controller.

If both are equal, or auto-align is disabled (sw-triggered DMA), there is no auto-alignment.

Best Regards

Alice
