Help for coding an equation on DSP

mr_max · ‎10-23-2012

Hello ! I'm new here.

I am trying to implement this equation on my DSP: Y(n) = 1/2 (X(n) + X(n-1)). After assembling the code and loading it into the processor, there is no signal at the output of my DAC.

Below is part of the source code:

Code language : Assembler

Microprocessor : DSP 56374

with :

Y(n) : Output signal (TxBuffBase)

1/2 : Coefb0

X(n) : current sample (Xn)

...

move #$0,b ; Init X(n-1) for the first loop

AudioLoop

jclr #RightRx,x:LRFlag,*

bclr #RightRx,x:LRFlag

move x:RxBuffBase,a ; <- ADC left input

; move x:RxBuffBase+2,b ; <- ADC right input (not used for the moment ...)

move a,x:(r1) ;saving Xn for the next loop

move r1,x:Xn

move #coefb0,r0 ; coefb0 is declared as a constant

move x:(r0),x1

add b,a ; a=X(n)+X(n-1)

move a,x0

mpy x0,x1,a ; a=1/2((X(n)+X(n-1))

move x:(Xn),b ; register b becomes X(n-1)

move a,x:TxBuffBase ; -> DAC left output

; move b,x:TxBuffBase+1 ; -> DAC right output (not used)

jmp AudioLoop

...

So what is wrong with my code? Can someone help me, please?

note 1 : I'm a beginner.

note 2 : The complete source code is in the attachment.

Original Attachment has been moved to: LPFn1_cs.asm.zip

rocco · ‎10-24-2012

Hi Maxime,

I can see that your program has two components: the I/O and the math. I did not analyze your code to determine where your problem is, as I thought you needed to rewrite the math portion. As far as the I/O portion, I can address that later.

I can see that you are trying to implement an "exponential averager" with a coefficient of a=.5

Here is some code that I have for an exponential averager:

;

; Use the TachVoltage from the ESAI and filter it into x:TachoVelocity.

; We use the exponential-averager formula: Y(new) = a*X + (1-a)*Y(old)

; Where: X is the new sample, x:ESAI_TachVoltage

; Y(old) is the previous average, x:TachoVelocity

; and Y(new) is the new average.

;

TachFilter_a: equ .5 ;'a' modified for Maxime to .5

;

move x:ESAI_TachVoltage,X0 ;get tacho voltage read from the ESAI DAC

move x:TachoVelocity,Y0 ;get the previous value of the velocity

mpy #TachFilter_a,X0,A ; A = (a) * X

mac #1-TachFilter_a,Y0,A ; + (1-a) * Y(old)

nop ;pipeline stall

move A,x:TachoVelocity ;save filtered tacho voltage as motor velocity

;

A little explanation:

The actual averaging is done in only two instructions: a 'mpy' and a 'mac'.

Where you need to add two values, you can often use the "multiply-and-accumulate" instruction instead of an add. Much of the power of DSPs comes from the 'mac' instruction, as many algorithms are based on adding successive products together.

Notice that in my above code, you can change the time-constant of the filter by changing the "TachFilter_a: equ .5" line. The smaller 'a' is, the more filtering you have. Since the sum of 'a' and '1-a' is 1, the filter has unity gain.
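The unity-gain remark can be checked with a constant input. As a small worked example, assume X stays at a constant value c and the average has settled, so that Y(new) = Y(old) = Y. The averager formula then gives:

```latex
Y = a\,c + (1-a)\,Y
\;\Longrightarrow\; a\,Y = a\,c
\;\Longrightarrow\; Y = c \qquad (a \neq 0)
```

So a steady input comes out at the same level for any non-zero 'a'; only the time it takes to settle depends on 'a'.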

You should only use the address registers R0-R7, N0-N7 and M0-M7 when working with addresses. Remember that the ALU and the AGU are two independent computational units, with very different purposes.

The I/O is an independent issue.

mr_max · ‎10-25-2012

Great! It works. I tested it by sweeping a sine wave, and the output signal was filtered. It isn't the best filter I've seen, but it's a good start for me! By the way, your equation is not exactly the same as mine, because X(n-1) isn't the previous average (like Y(old)) but the previous sample from the ADC. But thank you anyway. I now understand a little better what the MAC operation is for.

I have a question: can I use MAC operations one after the other?

For example, if I want to compute Y(n) = b0.X(n) + b1.X(n-1) + b2.X(n-2)

The code would be:

coefb0: equ 0.5

coefb1: equ 0.4

coefb2: equ 0.3

;These coefficient values are just for the example

move x:RxBuffBase,X0 ;Take sample from ADC to X0 register

move X0,x:Xn1 ;X(n) -> X(n-1) ; I'm not sure this is a valid data transfer. Is it?

move Xn1,x:Xn2 ;X(n-1)-> X(n-2); Same thing, not sure ...

mpy #coefb0,x0,a ;a=b0*X(n)

mac #coefb1,Xn1,a ; + b1*X(n-1)

mac #coefb2,Xn2,a ; + b2*X(n-2)

nop ; Why do we need a no-operation here? What is a pipeline stall?

move a,x:TxBuffBase ; TxBuffBase = input of the DAC.

; Note: Xn, Xn1 and Xn2 have been declared in the X data memory.

I haven't tested the code yet. I will do that later today. Do you think it is good like this?

rocco · ‎10-25-2012

Hi Maxime,

You're right, I misread your code. It looked so much like the exponential averager that my brain told me it was one. In reality, it was a two-tap FIR filter. Now I understand why the coefficients were .5
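For reference, the original equation Y(n) = 1/2 (X(n) + X(n-1)) can be written in the same mpy/mac style as the averager. This is only a minimal sketch, not tested on hardware. It assumes Xn1 is a word in X data memory (as declared in the post above) that is cleared before the loop starts, that RxBuffBase and TxBuffBase are the I/O buffers from the first post, and that the wait for a new sample is handled elsewhere.

```asm
; Two-tap FIR sketch: Y(n) = 1/2*X(n) + 1/2*X(n-1)
; Assumption: Xn1 is one word of X data memory, cleared before the loop.
coefb0: equ     0.5                     ; b0 = b1 = 1/2

        move    x:RxBuffBase,x0         ; x0 = X(n), new sample from the ADC
        move    x:Xn1,y0                ; y0 = X(n-1), previous ADC sample
        mpy     #coefb0,x0,a            ; a  = 1/2 * X(n)
        mac     #coefb0,y0,a            ; a += 1/2 * X(n-1)
        move    x0,x:Xn1                ; keep X(n) as X(n-1) for the next pass
        nop                             ; same one-cycle delay as in the averager
        move    a,x:TxBuffBase          ; Y(n) -> DAC
```

Scaling each sample by 1/2 before summing keeps the result inside the fractional range. Adding first and then moving the sum into x0, as in the first post, can saturate when both samples are large.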

Your last question first: What is a pipeline stall?

Typically, the DSP starts a new instruction every cycle, but the execution is pipelined, so each instruction typically takes 7 cycles to complete. In effect, 7 instructions are at different stages of execution at any point in time. The actual multiply in the ALU takes 2 cycles, so the result cannot be read out of the accumulator until one cycle after the multiply is executed. I use a NOP for that one-cycle delay, but if the NOP is not there, the DSP is supposed to insert a "pipeline-stall" auto-magically. However, my assembler flags an error when it detects a pipeline-stall, so I need to explicitly delay that one cycle.

As for the MAC instruction:

I see that you are expanding your FIR filter to three taps, and you are heading in the correct direction. But if you should expand to, say, 100 taps, you can see how the code can get tedious.
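A sketch of the three-tap version with the details filled in follows. It is untested and rests on two points about the posted code: the operands of 'mpy' and 'mac' have to be data registers (X0, X1, Y0, Y1), not memory labels, and a plain 'move' cannot copy one memory location to another, so the samples pass through registers. It reuses coefb0, coefb1, coefb2, Xn1 and Xn2 from the post above.

```asm
; Three-tap FIR sketch: Y(n) = b0*X(n) + b1*X(n-1) + b2*X(n-2)
; Assumption: Xn1 and Xn2 are words in X data memory, cleared at start-up.

        move    x:RxBuffBase,x0         ; x0 = X(n), new sample from the ADC
        move    x:Xn1,y0                ; y0 = X(n-1), read BEFORE it is overwritten
        move    x:Xn2,y1                ; y1 = X(n-2)
        mpy     #coefb0,x0,a            ; a  = b0 * X(n)
        mac     #coefb1,y0,a            ; a += b1 * X(n-1)
        mac     #coefb2,y1,a            ; a += b2 * X(n-2)
        move    y0,x:Xn2                ; shift the delay line: X(n-1) -> X(n-2)
        move    x0,x:Xn1                ;                       X(n)   -> X(n-1)
        move    a,x:TxBuffBase          ; Y(n) -> DAC
```

The two stores sit between the last 'mac' and the read of the accumulator, so they should take the place of the NOP; if your assembler still reports a stall, put the NOP back. Note that the example coefficients sum to 1.2, so a full-scale input can saturate the output.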

The "repeat" instruction, coupled with the MAC instruction and the Address Generation Unit (AGU) allows you to build a FIR filter of any size with just a few instructions. Here is a sample:

;

; Set N to the number of taps you would like to have.

;

N: equ 32 ;for a 32 tap FIR filter

;

; This code only needs to be executed once to initialize the AGU registers.

;

move #CoefficientTable,R0 ;FIR filter coefficient table in x-memory

move #SampleTable,R4 ;samples-table to be filtered in y-memory

move #N-1,M4 ;set modulus register for N taps

move M4,M0 ;both modulus registers are the same

.

;

; This code gets executed for each new sample.

;

movep y:input,y:(R4) ;put sample in table over oldest sample

clr A x:(R0)+,X0 y:(R4)-,Y0 ;get 1st sample and coefficient

rep #N-1 ;do 'mac' for all taps except the last tap

mac X0,Y0,A x:(r0)+,X0 y:(r4)-,Y0 ;get next sample and coefficient

macr X0,Y0,A (r4)+ ;mac final tap, round, inc sample-address

movep a,y:output ;ship the filtered value to the outside

;

Notice that the address registers take care of themselves, wrapping around from end to beginning when they need to. Notice also that the 'mac' instruction inside the repeat-loop not only multiplies and adds, but also fetches the next sample and coefficient for the following iteration. That means 1 cycle per tap. In the DSP that I'm using, it means I can execute a tap in 5 nanoseconds, or a 100 tap FIR filter in half a microsecond.

mr_max · ‎10-27-2012

Wow, I'm really impressed by the power of parallel operations. Thank you, Mark, for sharing your knowledge. Your tips will be very helpful for the second year of my bachelor's degree!

Two last questions:

What do you mean by "set modulus register for N taps"?

And why did you copy M4 into the M0 register?

rocco · ‎10-27-2012

Hi Maxime,

Two last questions:

What do you mean by "set modulus register for N taps"?

And why did you copy M4 into the M0 register?

Well, that's the way modulo-addressing works: the modulus is handled in hardware, as determined by the respective Mn register. And of course the number of samples is equal to the number of coefficients, so both modulo registers need to be the same.
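To make the wrap-around concrete, here is a small sketch of what the hardware does with a four-word buffer. Buf and BufLen are hypothetical names used only for this illustration, and the address $0100 is an assumed placement. Loading Mn with (length - 1) selects modulo addressing for the matching Rn, and the buffer base has to sit on a suitable power-of-two boundary (here the low two address bits are zero).

```asm
; Modulo-addressing sketch (hypothetical 4-word buffer at y:$0100)
BufLen: equ     4

; one-time setup, done well before the loop that uses r4
        move    #Buf,r4                 ; r4 = $0100, start of the buffer
        move    #BufLen-1,m4            ; m4 = 3 -> r4 now wraps modulo 4

; later, four post-incremented reads walk once around the buffer
        move    y:(r4)+,y0              ; reads $0100, r4 -> $0101
        move    y:(r4)+,y0              ; reads $0101, r4 -> $0102
        move    y:(r4)+,y0              ; reads $0102, r4 -> $0103
        move    y:(r4)+,y0              ; reads $0103, r4 wraps -> $0100
```

In the FIR code, R0 walks the coefficient table the same way as R4 walks the sample table, and both tables hold N entries. That is why M0 gets the same value as M4. With Mn left at its reset value ($FFFFFF), Rn uses ordinary linear addressing and does not wrap.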

You may want to re-read the chapter on the Address Generation Unit. I had to read the DSP family-manual a couple of times before I understood how everything worked.
