16 bit subtract with absolute difference


1,298 Views
donw
Contributor IV

hi all

Does anyone have machine code (9S08) to get the absolute difference of two unsigned 16 bit variables?

I am looking for the FASTEST code....

 

don

0 Kudos
7 Replies

667 Views
rocco
Senior Contributor II

Hi Don,

 

Assuming a general routine with all of the 16-bit variables in page zero and not in registers, it could most likely be this:

;
; Allocate the variables in the zero page.
;
        bsct
;
Variable1:      ds.w    1       ;minuend
;
Variable2:      ds.w    1       ;subtrahend
;
Variable3:      ds.w    1       ;difference
;
;
; Code section
;
        psct
;
        lda     Variable1+1     ;get low byte of minuend
        sub     Variable2+1     ;subtract low byte of subtrahend
        sta     Variable3+1     ;store low byte of difference
;
        lda     Variable1       ;get high byte of minuend
        sbc     Variable2       ;subtract high byte of subtrahend and carry
        sta     Variable3       ;store high byte of difference
;

 

Each instruction is 3 cycles, for a total of 18. With a 40 MHz clock, it would execute in under a microsecond. It could be optimized for special cases, such as when the minuend or subtrahend is a constant, or a variable lives in a register.
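
For anyone who wants to check the borrow handling off-target, here is a minimal sketch in C that models the SUB/SBC pair one byte at a time. It is only a model of the arithmetic, not code for the 9S08, and the function name is made up for illustration.

```c
#include <stdint.h>

/* Byte-wise model of SUB (low bytes) followed by SBC (high bytes).
   Stores the 16-bit result in *diff and returns the final borrow,
   i.e. the C flag as it would stand after the SBC. */
int sub16_bytes(uint16_t minuend, uint16_t subtrahend, uint16_t *diff)
{
    uint8_t a_hi = (uint8_t)(minuend >> 8);
    uint8_t a_lo = (uint8_t)(minuend & 0xFF);
    uint8_t b_hi = (uint8_t)(subtrahend >> 8);
    uint8_t b_lo = (uint8_t)(subtrahend & 0xFF);

    int c = a_lo < b_lo;                        /* borrow out of SUB */
    uint8_t d_lo = (uint8_t)(a_lo - b_lo);

    int c_out = a_hi < b_hi + c;                /* borrow out of SBC */
    uint8_t d_hi = (uint8_t)(a_hi - b_hi - c);

    *diff = (uint16_t)((d_hi << 8) | d_lo);
    return c_out;
}
```

A returned borrow of 1 means the minuend was smaller than the subtrahend when both are treated as unsigned.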

 

I never write out math code like this, as I have a macro library to generate the code. PM me if you would like to see it. It does mixes of 8, 16, 24 and 32 bit numbers, based on a 32-bit pseudo-accumulator.

0 Kudos

667 Views
donw
Contributor IV

Hi Rocco

Thanks, but...

What I want is the ABSOLUTE difference value. (i.e. not a signed value)

I know ways to do it, but am interested in the FASTEST code method.

don

 

0 Kudos

667 Views
rocco
Senior Contributor II

Hi Don,

 

Sorry, I misunderstood. You need the absolute value of the difference.

 

I looked at my macro library, and it does it by first doing the above SUB16, followed by an ABS16 to get the absolute value. The ABS16 macro in turn does a test for negative and then invokes the NEG16 macro to make a negative value positive. It certainly won't be the fastest way to get there. The ABS16 function takes an additional 27 cycles if the result of the subtract was negative, so together they would take 45 cycles (2.25 µs at a 20 MHz bus clock). If the result of the subtract is positive, however, the ABS16 function only takes 5 cycles, for a total of 23.
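
The timing figures are just the cycle count divided by the bus clock. A tiny helper in C makes the arithmetic explicit (the function name is hypothetical, and the 20 MHz bus clock is the figure quoted above):

```c
#include <stdint.h>

/* Execution time in microseconds for a cycle count at a bus clock in MHz. */
double cycles_to_us(uint32_t cycles, double bus_mhz)
{
    return (double)cycles / bus_mhz;
}
```

With these inputs, 45 cycles at 20 MHz comes to 2.25 µs, and the 18-cycle subtract alone stays under a microsecond.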

 

I think the above subtract can be massaged to include the absolute-value functionality within it, and should do somewhat better. I will have to look at it and do some cycle counting.

0 Kudos

667 Views
rocco
Senior Contributor II

OK, How's this:

 

If you don't mind the result being left in H:X instead of memory, there is this approach:

;
; Absolute value of the difference between two 16-bit variables in page zero.
;
        lda     Variable1+1     ;get the low byte of the minuend
        sub     Variable2+1     ;subtract the low byte of the subtrahend
        tax                     ;put the low-byte result into X
        lda     Variable1       ;get the high byte of the minuend
        sbc     Variable2       ;subtract the high byte of the subtrahend
        bcc     NoNeg           ;branch if no borrow (unsigned operands, so test C, not N)
;
; We need to negate the result.
;
        comx                    ;complement the low byte
        coma                    ;complement the high byte
        psha                    ;push the high byte onto the stack
        pulh                    ;  and then pull it into H
        aix     #1              ;increment the complemented result
        bra     exit            ;skip out now
;
;
; We don't need to negate, so just put the result in H:X
;
NoNeg:  psha                    ;push the high byte onto the stack
        pulh                    ;  and then pull it into H
;
; Done.
;
exit:

 It takes 21 cycles if the result is positive, and 28 cycles if it's negative. You might be able to optimize it further.

 

The result could be stored back in Variable3, as in the first example, by simply adding a "  sthx Variable3" at "exit:", with a cost of 4 additional cycles.
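
One point worth checking with unsigned operands is which flag decides whether to negate. The borrow out of the high-byte subtract (C flag) is right over the whole 16-bit range, while the sign bit of the result (N flag) is only right while the true difference fits in 15 bits. A rough C model of the two choices, with made-up function names:

```c
#include <stdint.h>

/* Negate when the subtraction borrowed (C flag set): correct for unsigned. */
uint16_t absdiff_by_borrow(uint16_t a, uint16_t b)
{
    uint16_t d = (uint16_t)(a - b);
    int borrow = a < b;
    return borrow ? (uint16_t)(~d + 1u) : d;
}

/* Negate when bit 15 of the result is set (N flag): correct only while
   the true difference is smaller than $8000. */
uint16_t absdiff_by_sign(uint16_t a, uint16_t b)
{
    uint16_t d = (uint16_t)(a - b);
    int negative = (d & 0x8000u) != 0;
    return negative ? (uint16_t)(~d + 1u) : d;
}
```

For example, with operands $9000 and $0000 the borrow test returns $9000, while the sign test returns $7000.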

0 Kudos

667 Views
rocco
Senior Contributor II

And here is a variation.

 

It does save the final result into page-zero memory, and doesn't use the H register or the stack, at the expense of one cycle in the "positive" case.

;
; Absolute value of the difference between two 16-bit variables in page zero.
;
        lda     Variable1+1     ;get the low byte of the minuend
        sub     Variable2+1     ;subtract the low byte of the subtrahend
        tax                     ;put the low-byte result into X
        lda     Variable1       ;get the high byte of the minuend
        sbc     Variable2       ;subtract the high byte of the subtrahend
        bcc     done            ;branch if no borrow (unsigned operands, so test C, not N)
;
; We need to negate the result.
;
        coma                    ;complement the high byte
        negx                    ;complement and increment the low byte
        bcs     done            ;skip out now if carry set (low byte was non-zero)
        inca                    ;low byte was zero, so carry into the high byte
;
;
; Store the result.
;
done:   sta     Variable3       ;store the high byte
        stx     Variable3+1     ;  and then store the low byte

 This is 22 cycles in the positive case, but still 28 in the negative case.
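
The two-byte negate is easy to get backwards, because NEG leaves the carry set in every case except when its operand was zero. So the high byte only needs the extra increment when the carry comes out clear. A small C model of the COMA / NEGX / fix-up sequence (a sketch for checking the logic, with a hypothetical function name):

```c
#include <stdint.h>

/* Two's-complement negate of a 16-bit value held as two bytes, modeled on
   COMA / NEGX followed by a fix-up of the high byte.
   NEG sets the C flag unless its operand was zero. */
uint16_t neg16_bytes(uint16_t v)
{
    uint8_t hi = (uint8_t)(v >> 8);
    uint8_t lo = (uint8_t)(v & 0xFF);

    hi = (uint8_t)~hi;            /* COMA: one's complement of high byte */
    int c = (lo != 0);            /* C flag produced by NEGX             */
    lo = (uint8_t)(0 - lo);       /* NEGX: two's complement of low byte  */

    if (!c)                       /* low byte was zero: the +1 ripples   */
        hi = (uint8_t)(hi + 1);   /* into the high byte                  */

    return (uint16_t)((hi << 8) | lo);
}
```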

 

I looked at doing a compare at the front end, so as to select one of two subtractions in order to always get a positive result, but any 16-bit compare was pretty cycle-expensive. A single-byte compare is cheap, but you have a fifty-percent chance of being wrong anytime the two high bytes were equal.
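
For reference, the compare-first idea looks like this in C. It is only a sketch of the logic with invented names, and it says nothing about the cycle cost on the 9S08. The second helper shows when the cheap high-byte compare is enough on its own.

```c
#include <stdint.h>

/* Compare first, then subtract the smaller operand from the larger one. */
uint16_t absdiff_compare_first(uint16_t a, uint16_t b)
{
    return (a >= b) ? (uint16_t)(a - b) : (uint16_t)(b - a);
}

/* Returns 1 if the high bytes alone settle which operand is larger,
   0 if they are equal and the low bytes would have to be compared too. */
int high_bytes_decide(uint16_t a, uint16_t b)
{
    return (a >> 8) != (b >> 8);
}
```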

0 Kudos

667 Views
donw
Contributor IV

hi Rocco

Great code!

To date I had just done the 16-bit compare up front, as you considered.

I had not considered the incrementing effect of the NEGX instruction. Very nifty!

don

 

0 Kudos

667 Views
rocco
Senior Contributor II

Hi Don,

 

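I just realized that you can shave two more cycles off of the "negative" case of my second attempt by replacing the branch and increment that follow the negx

        bcs     done            ;skip out now if carry set (low byte was non-zero)
        inca                    ;low byte was zero, so carry into the high byte

 with

        sbc     #$FF            ;add 1 to the high byte only when carry is clear

(NEGX leaves the carry set unless X was zero, so the increment belongs on the carry-clear path. Subtracting $FF plus the carry does exactly that.)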

 

Now the longer "negative" case will be 26 cycles, instead of 27 or 28 cycles.
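
A single-instruction fix-up like this can be checked by modeling the flag arithmetic. SBC #$FF computes A - $FF - C, which modulo 256 is A + 1 - C: the high byte is left alone when the carry is set and incremented when it is clear. The C sketch below (function names are made up) puts that next to the branch-and-increment form.

```c
#include <stdint.h>

/* High-byte fix-up done with a branch: increment only when C is clear. */
uint8_t fixup_branch(uint8_t hi, int carry)
{
    return carry ? hi : (uint8_t)(hi + 1);
}

/* High-byte fix-up modeled on SBC #$FF: A - $FF - C, truncated to 8 bits. */
uint8_t fixup_sbc_ff(uint8_t hi, int carry)
{
    return (uint8_t)(hi - 0xFF - carry);
}
```

Both functions agree for every high byte and either carry value, so the one-instruction form is a drop-in replacement for the two-instruction form.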

 

I don't know how I missed it the first time. Sorry.

0 Kudos