hi all
Does anyone have machine code (9S08) to get the absolute difference of two unsigned 16-bit variables?
I am looking for the FASTEST code....
don
Hi Don,
Assuming a general routine with all of the 16-bit variables in page zero and not in registers, it could most likely be this:
;
; Allocate the variables in the zero page.
;
        bsct
;
Variable1:  ds.w    1           ;minuend
;
Variable2:  ds.w    1           ;subtrahend
;
Variable3:  ds.w    1           ;difference
;
;
; Code section
;
        psct
;
        lda     Variable1+1     ;get low byte of minuend
        sub     Variable2+1     ;subtract low byte of subtrahend
        sta     Variable3+1     ;store low byte of difference
;
        lda     Variable1       ;get high byte of minuend
        sbc     Variable2       ;subtract high byte of subtrahend and carry
        sta     Variable3       ;store high byte of difference
;
Each instruction is 3 cycles, for a total of 18. With a 40 MHz clock (20 MHz bus), it would execute in under a microsecond. It could be optimized for special cases, such as when the minuend or subtrahend is a constant, or a variable lives in a register.
I never write out math code like this, as I have a macro library to generate the code. PM me if you would like to see it. It handles mixes of 8-, 16-, 24- and 32-bit numbers, based on a 32-bit pseudo-accumulator.
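For checking the routine off-target, here is a minimal C sketch of what the sub/sbc pair computes. The function name sub16_bytewise and the borrow_out parameter are made up for this illustration; it is a host-side model only, not code for the 9S08.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of SUB on the low bytes followed by SBC on
   the high bytes. Returns the 16-bit difference; *borrow_out receives the
   final borrow, which corresponds to the C flag after the SBC. */
uint16_t sub16_bytewise(uint16_t minuend, uint16_t subtrahend, int *borrow_out)
{
    uint8_t m_lo = (uint8_t)(minuend & 0xFFu);
    uint8_t m_hi = (uint8_t)(minuend >> 8);
    uint8_t s_lo = (uint8_t)(subtrahend & 0xFFu);
    uint8_t s_hi = (uint8_t)(subtrahend >> 8);

    /* sub: low byte; a borrow occurs when the subtrahend byte is larger */
    int borrow = m_lo < s_lo;
    uint8_t d_lo = (uint8_t)(m_lo - s_lo);

    /* sbc: high byte minus subtrahend high byte minus the borrow */
    int wide = (int)m_hi - (int)s_hi - borrow;
    uint8_t d_hi = (uint8_t)wide;
    *borrow_out = wide < 0;

    return (uint16_t)((d_hi << 8) | d_lo);
}
```

The borrow that comes out of the high-byte step is set exactly when the minuend is smaller than the subtrahend as unsigned numbers.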
Hi Don,
Sorry, I misunderstood. You need the absolute value of the difference.
I looked at my macro library, and it does it by first doing the above SUB16, followed by an ABS16 to get the absolute value. The ABS16 macro in turn does a test for negative and then invokes the NEG16 macro to make a negative value positive. It certainly won't be the fastest way to get there. The ABS16 function takes an additional 27 cycles if the result of the subtract was negative, so together they would take 45 cycles (2.25 µs at a 20 MHz bus clock). If the result of the subtract is positive, however, the ABS16 function only takes 5 cycles, for a total of 23.
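The timing figures are just the cycle count divided by the bus frequency. A small C helper makes the arithmetic explicit; cycles_to_ns and timing_selfcheck are hypothetical names, and the 20 MHz bus clock is the one quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Execution time in nanoseconds for a cycle count at a bus clock in MHz. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t bus_mhz)
{
    return (cycles * 1000u) / bus_mhz;
}

/* Checks the figures quoted in the thread at a 20 MHz bus clock. */
void timing_selfcheck(void)
{
    assert(cycles_to_ns(18u, 20u) == 900u);   /* plain 16-bit subtract    */
    assert(cycles_to_ns(23u, 20u) == 1150u);  /* SUB16 + ABS16, positive  */
    assert(cycles_to_ns(45u, 20u) == 2250u);  /* SUB16 + ABS16, negative  */
}
```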
I think the above subtract can be massaged to include the absolute-value functionality within it, and should do somewhat better. I will have to look at it and do some cycle counting.
OK, How's this:
If you don't mind the result being left in H:X instead of memory, there is this approach:
;
; Absolute value of the difference between two 16-bit variables in page zero.
;
        lda     Variable1+1     ;get the low byte of the minuend
        sub     Variable2+1     ;subtract the low byte of the subtrahend
        tax                     ;put the low-byte result into X
        lda     Variable1       ;get the high byte of the minuend
        sbc     Variable2       ;subtract the high byte of the subtrahend
        bcc     NoNeg           ;branch if no borrow (the result is already positive)
;
; We need to negate the result.
;
        comx                    ;complement the low byte
        coma                    ;complement the high byte
        psha                    ;push the high byte onto the stack
        pulh                    ; and then pull it into H
        aix     #1              ;increment the complemented result
        bra     exit            ;skip out now
;
;
; We don't need to negate, so just put the result in H:X
;
NoNeg:  psha                    ;push the high byte onto the stack
        pulh                    ; and then pull it into H
;
; Done.
;
exit:
It takes 21 cycles if the result is positive, and 28 cycles if it's negative. You might be able to optimize it further.
The result could be stored back in Variable3, as in the first example, by simply adding a " sthx Variable3" at "exit:", with a cost of 4 additional cycles.
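As a cross-check for the routine, here is a hedged C model of "subtract, then negate if the subtraction went negative". The function names are invented for this sketch. It shows two ways of deciding whether to negate: on the borrow out of the high-byte subtract, and on the sign bit of the difference. The two agree whenever the operands are less than 32768 apart, but only the borrow test holds across the full unsigned 16-bit range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: negate (complement and add 1) only if the subtraction
   borrowed, i.e. when a < b for unsigned operands. */
uint16_t absdiff16_borrow(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(a - b);   /* wraps modulo 65536 on borrow */
    int borrow = a < b;                  /* C flag after sub/sbc */
    if (borrow) {
        diff = (uint16_t)(~diff + 1u);   /* complement, then increment */
    }
    return diff;
}

/* Same idea, but deciding on bit 15 of the difference (a sign test). */
uint16_t absdiff16_signbit(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(a - b);
    if (diff & 0x8000u) {
        diff = (uint16_t)(~diff + 1u);
    }
    return diff;
}
```

For example, with operands 0xFFFF and 0x0001 the borrow version returns 0xFFFE, while the sign-bit version negates a result that was already correct and returns 0x0002.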
And here is a variation.
It does save the final result into page-zero memory, and doesn't use the H register or the stack, at the expense of one cycle in the "positive" case.
;
; Absolute value of the difference between two 16-bit variables in page zero.
;
        lda     Variable1+1     ;get the low byte of the minuend
        sub     Variable2+1     ;subtract the low byte of the subtrahend
        tax                     ;put the low-byte result into X
        lda     Variable1       ;get the high byte of the minuend
        sbc     Variable2       ;subtract the high byte of the subtrahend
        bcc     done            ;branch if no borrow (the result is already positive)
;
; We need to negate the result.
;
        coma                    ;complement the high byte
        negx                    ;complement and increment the low byte
        bcs     done            ;skip out now if the low byte was nonzero (carry set)
        inca                    ;increment the high byte if the low byte was zero
;
;
; Store the result.
;
done:   sta     Variable3       ;store the high byte
        stx     Variable3+1     ; and then store the low byte
This is 22 cycles in the positive case, but still 28 in the negative case.
I looked at doing a compare at the front end, so as to select one of two subtractions in order to always get a positive result, but any 16-bit compare was pretty cycle-expensive. A single-byte compare is cheap, but you have a fifty-percent chance of being wrong any time the two high bytes are equal.
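To illustrate why the single-byte compare is not enough, here is a rough C sketch of the compare-first idea. The names order_by_high_byte and absdiff16_compare_first are hypothetical. The high bytes settle the order only when they differ; when they are equal, the low bytes have to be compared as well, which is the extra cost mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* High-byte-only compare: +1 if a is larger, -1 if b is larger,
   0 when the high bytes are equal and the order is still undecided. */
int order_by_high_byte(uint16_t a, uint16_t b)
{
    uint8_t a_hi = (uint8_t)(a >> 8);
    uint8_t b_hi = (uint8_t)(b >> 8);
    if (a_hi > b_hi) return 1;
    if (a_hi < b_hi) return -1;
    return 0;
}

/* Compare first, then do whichever subtraction cannot go negative. */
uint16_t absdiff16_compare_first(uint16_t a, uint16_t b)
{
    int order = order_by_high_byte(a, b);
    if (order == 0) {
        /* equal high bytes: the low bytes decide */
        order = ((uint8_t)(a & 0xFFu) >= (uint8_t)(b & 0xFFu)) ? 1 : -1;
    }
    return (order > 0) ? (uint16_t)(a - b) : (uint16_t)(b - a);
}
```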
Hi Don,
I just realized that you can shave two more cycles off of the "negative" case of my second attempt by replacing the two instructions that follow negx (the branch on carry and the inca) with
        sbc     #$FF            ;subtract $FF and the carry: adds 1 to the high byte only if negx left the carry clear
Now the longer "negative" case will be 26 cycles, instead of 27 or 28 cycles.
I don't know how I missed it the first time. Sorry.
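For anyone who wants to check the byte-wise negate on a PC first, here is a hedged C model of it. The name neg16_bytewise is made up for this sketch. It relies on one fact about NEG on this CPU family: the carry comes out set unless the operand was $00. So the high byte needs its +1 only when the low byte was zero, and subtracting $FF together with the carry does exactly that in one step.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of negating a 16-bit value one byte at a time:
   complement the high byte, negate the low byte, then fold the carry
   from the negate into the high byte. */
uint16_t neg16_bytewise(uint16_t value)
{
    uint8_t hi = (uint8_t)(value >> 8);
    uint8_t lo = (uint8_t)(value & 0xFFu);

    hi = (uint8_t)~hi;                  /* coma */
    unsigned carry = (lo != 0);         /* negx: C is set unless X was $00 */
    lo = (uint8_t)(0u - lo);            /* negx: two's complement of X */

    /* A - $FF - C, which is A + 1 - C modulo 256 */
    hi = (uint8_t)(hi - 0xFFu - carry);

    return (uint16_t)((hi << 8) | lo);
}
```

A quick sanity check is that neg16_bytewise(0xFFFF) gives 0x0001 and neg16_bytewise(0xFF00) gives 0x0100.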