I am using the EFCOP in the DSP56371 for FIR filter processing. I cannot find any mention in the user manuals for the EFCOP of its processing speed other than one MAC/cycle. There is no indication of setup times etc.
Can anyone help with this information?
Many thanks, John.
Hi John,
I'm afraid there is no simple answer.
While the EFCOP does execute 1 MAC per cycle when it is running (for a max of 180 megaMACs on the 671), the issue is what it takes to keep it running. That depends on what you are trying to do.
If you only need to run one filter, then the setup is done once and you only need to shovel data in and out. Even the time to shovel can vary, depending on whether you are using DMA, interrupts or polling, and on where the data is coming from and going to.
But if you need to run multiple filters, as I tried to do, then the setup and re-setup can consume a lot of cycles, and the EFCOP does not run often enough to make it worthwhile. Also, if your filters are not long you can spend more time reconfiguring the EFCOP than it takes to do the filter in the ALU. For those reasons, I used a dual core DSP, and I have the other core do the filtering. The EFCOP is most efficient with one long filter.
There are example code listings for setting up and running the EFCOP for polling, DMA and interrupts in the application note "AN2691, Applied Matrix Multiplication With the DSP563xx Enhanced Filter Coprocessor (EFCOP).pdf". Although it is intended for matrix math, it did help me figure out what code was required and how long it would take to set up the EFCOP.
http://www.freescale.com/files/dsp/doc/app_note/AN2691.pdf
mark
Hi Mark,
Many thanks for your reply.
I have been getting good results with the EFCOP and my application requires no reconfiguration - it just runs it flat out.
Essentially I am running a four-channel filter with new data every 4096 cycles (fed from a timer interrupt routine). Since the input FIFO is 4 words deep I just write that (no polling) and the output is handled by DMA. The question then is how long can each filter be?
Well, 1016 taps does not quite work but 1014 taps does, suggesting an overhead of about 8 or 9 cycles per data word.
What would be nice to know is the safety margin in spare cycles so that issues such as interrupt latency can be accounted for and continued correct operation assured.
Many thanks, John