How to implement a fast 4-byte by 2-byte divide?

1,729 Views
CASEYKEVIN
Contributor I

In my application I need a 4-byte by 2-byte division, as follows:

dword     a;               /* a is an unsigned long, 4 bytes */

word      b;               /* b is an unsigned int, 2 bytes */

word      c;

 

c = a / b;

 

Coded in C, c = a / b takes more than 200 CPU cycles, which is too long for my application, so I am considering implementing it in assembly language. My CPU is an HCS08, which has only one division instruction:

    DIV                             /* A <- (H:A) / X;  H <- remainder;  6 CPU cycles;  16-bit by 8-bit divide */

 

How can I implement a fast division of a 4-byte variable by a 2-byte variable in assembly? Thanks.

4 Replies

889 Views
Obetz
Contributor III

Hello CASEYKEVIN,

 

You won't get faster results on a 9S08 CPU; more likely the result will be much slower for certain operand values.

 

Look at the sources of the compiler's division routine.

 

Use a better CPU (Coldfire) or try to solve the task with a multiplication.
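One way to read "solve the task with a multiplication" is to precompute a reciprocal when the same divisor is reused for many dividends. The C sketch below is only an illustration of that idea, not code from this thread: the helper names `recip32` and `div_by_recip` are hypothetical, it assumes `b` is non-zero, and it uses 64-bit intermediates that an 8-bit CPU would have to build from several MUL steps, so any speed gain on the HCS08 is not guaranteed.

```c
#include <stdint.h>

/* Hypothetical helper: compute floor(2^32 / b) once per divisor (b != 0). */
static uint64_t recip32(uint16_t b)
{
    return 0x100000000ULL / b;
}

/* Divide a by b using the precomputed reciprocal m = recip32(b). */
static uint32_t div_by_recip(uint32_t a, uint16_t b, uint64_t m)
{
    /* High 32 bits of a*m: either the exact quotient or one too small. */
    uint32_t q = (uint32_t)(((uint64_t)a * m) >> 32);
    uint32_t r = a - q * b;      /* remainder left by the estimate */
    if (r >= b) {
        q++;                     /* a single correction step is enough */
    }
    return q;
}
```

The estimate can be at most one too small because m is within 1 of 2^32 / b and a is below 2^32, which is why one compare-and-increment suffices.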

 

Oliver


889 Views
bigmac
Specialist III

Hello,

 

If you are currently achieving the 32/16 integer division in about 200 cycles, this would seem quite fast.  Normally I would expect about 10 times this amount, even if written directly in assembly code.  It will be interesting to see what the execution time of the two code snippets turns out to be.

 

The hardware divide for the HCS08 handles only an 8-bit divisor, and this is the limiting factor in its use - a 32/8 bit division would give fast code.  Increasing the size of the divisor to 16 bits necessarily results in a slower software division process.
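To illustrate why an 8-bit divisor maps well onto the hardware, here is a small C model of a 32/8 division built from four chained 16-by-8 steps, with the remainder of each step feeding the high byte of the next. It is a sketch only: `div16by8` is a hypothetical stand-in for the DIV instruction, and the divisor is assumed to be non-zero.

```c
#include <stdint.h>

/* Model of one DIV step: (h:a) / x -> 8-bit quotient, remainder in *rem.
   Requires h < x so that the quotient fits in 8 bits. */
static uint8_t div16by8(uint8_t h, uint8_t a, uint8_t x, uint8_t *rem)
{
    uint16_t n = (uint16_t)(((uint16_t)h << 8) | a);
    *rem = (uint8_t)(n % x);
    return (uint8_t)(n / x);
}

/* 32-bit / 8-bit: four DIV steps, most significant byte first. */
static uint32_t div32by8(uint32_t a, uint8_t b, uint8_t *rem)
{
    uint8_t r = 0;               /* running remainder, always < b */
    uint32_t q = 0;
    for (int shift = 24; shift >= 0; shift -= 8) {
        uint8_t digit = div16by8(r, (uint8_t)(a >> shift), b, &r);
        q |= (uint32_t)digit << shift;
    }
    *rem = r;
    return q;
}
```

Because the running remainder is always smaller than the divisor, each step satisfies the DIV precondition. The same chaining does not work directly with a 16-bit divisor, since the partial remainder no longer fits in H.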

 

To achieve substantially faster calculation would require an alternative MCU containing hardware to support a 16-bit divisor - probably a 16-bit or 32-bit device.

 

Regards,

Mac


889 Views
rocco
Senior Contributor II

And here is an old routine of mine, originally written for the 68HC05.

 

It is a 24-bit by 16-bit divide, but it is easily expanded to a 32-bit by 16-bit divide. It still needs to be optimized for the S08.

 

However, I'm not sure either of these two routines (Don's or mine) will do better than 200 cycles.

;
;;; Divide 24 by 16
;
; Divides a 24 bit, unsigned integer by a 16 bit unsigned integer.
; Enter with the dividend in the Pseudo-Accumulator and the divisor
; in the X:A register.  Exits with the 24 bit Quotient in the P.Acc.
; and the 16 bit remainder in X:A.
;
Div24x16:
        STXA    .MULT.          ;put the divisor someplace safe
        ST24    TEMP            ;move dividend to TEMP
        CLR24   ,,              ;zero the low 24 bits
        ldx     #24             ;number of times through the loop
;
; Main Loop.
; Rotate dividend into A, one bit at a time,
; and check if a subtract is needed.
;
; shift 32 bit pseudo-accumulator around left one position.
;
d24_1:  asl     TEMP+2          ;start with byte 0
        rol     TEMP+1          ;into byte 1
        rol     TEMP            ;and into byte 2
        rol     ACCUM0          ;then around into byte 0 of p-acc
        rol     ACCUM1          ;and finally into byte 1 of p-acc
;
        bcs     d24_2           ;do a subtract if hi-bit went into carry
        CMP16   .MULT.          ;is it worth a subtract?
        bcs     d24_3           ;skip if no subtract needed here
;                               ;leave a zero in lo-bit of quotient
;
; Do the subtract and put a 1 into lo-bit of quotient.
;
d24_2:  SUB16   .MULT.          ;subtract
        bset    0,TEMP+2        ;set new lowest bit in quotient
;
; decrement the loop count and continue if not zero
;
d24_3:  decx                    ;decrement loop counter
        bne     d24_1           ;and loop until all bits are done
;
; all bits are done.  Quotient is in TEMP:TEMP+1:TEMP+2
; and remainder is in ACCUM1:0.  we move them.
;
        lda     TEMP            ;get high byte of quotient
        sta     ACCUM2          ;put it where it belongs
        ldx     ACCUM1          ;put high byte of remainder in X
        lda     TEMP+1          ;get mid byte of quotient
        sta     ACCUM1          ;put it where it belongs
        lda     ACCUM0          ;get low byte of remainder
        sta     .MULT.          ;put aside
        lda     TEMP+2          ;get low byte of quotient
        sta     ACCUM0          ;put where belongs
        lda     .MULT.          ;get last of remainder back
        rts                     ;return with answers
;
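For readers who want to check the algorithm before porting it, the following C model sketches the same shift-and-subtract loop expanded to a 32-bit dividend. It is an illustration under assumptions, not a translation of the macros above: the function name `div32by16` is hypothetical, the divisor must be non-zero, and, as in the assembly, each quotient bit is placed in the low bit vacated by shifting the dividend left.

```c
#include <stdint.h>

/* Shift-and-subtract division: 32-bit dividend, 16-bit divisor (b != 0).
   Returns the quotient and stores the remainder in *rem. */
static uint32_t div32by16(uint32_t a, uint16_t b, uint16_t *rem)
{
    uint32_t r = 0;                    /* partial remainder, up to 17 bits */
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (a >> 31);      /* move top dividend bit into r */
        a <<= 1;                       /* vacated low bit is the next quotient bit */
        if (r >= b) {                  /* does the divisor fit? */
            r -= b;
            a |= 1;                    /* record a 1 in the quotient */
        }
    }
    *rem = (uint16_t)r;
    return a;                          /* the dividend has been replaced by the quotient */
}
```

The loop always runs once per dividend bit, and on an 8-bit CPU each pass needs a multi-byte shift plus a 16-bit compare, which is why this approach is hard to bring below the cycle count already achieved in C.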

 

 


889 Views
donw
Contributor IV

Here is MC68xx microprocessor code for a 4-byte / 4-byte divide. I have used this for years.

You can edit it for a 2-byte divisor.

You could probably make it faster by using stack indexing...

 

 

*NAME:BIGDIV
*DESC:       32BIT DIVIDE BY 32BIT
* @DIVBUF bytes
*          0...4 = DIVIDEND
*          5...9 = DIVISOR
*         10..14 = RESULT
* CHECKS FOR /0 , CY SET IF ERROR
CLRDIV:                                ;call first to clear ram area
       LDHX     #DIVBUF
CLRDIV10
       CLR     ,X
       AIX      #1
       CPHX     #(DIVBUF+15T)      ; STOP ONE PAST BYTE 14 SO IT IS CLEARED TOO
       BNE      CLRDIV10
       RTS
DIVERR
       SEC                 ; SET CY IF CANNOT DIVIDE
       CLR     14T,X
       CLR     13T,X
       RTS
;
BIGDIV:
       LDHX     #DIVBUF      ;   5 BYTE DIV   POINT TO MSD
BIGD10:
       CLR     10T,X           ; RESULT MSD
       CLR     11T,X          ;
       CLR     12T,X
       CLR     13T,X
       CLR     14T,X
       LDA     8,X            ; CHECK IF ZERO
       ORA     7,X
       ORA     9,X
       ORA     6,X
       ORA     5,X
       BEQ     DIVERR
       LDA     4,X            ; CHK IF DIV < DIVIDEND
       SUB     9,X
       LDA     3,X            ; CHK IF DIV < DIVIDEND
       SBC     8,X            ;
       LDA     2,X            ; CHK IF DIV < DIVIDEND
       SBC     7,X
       LDA     1,X           ;  CHK IF DIV < DIVIDEND
       SBC     6,X
       LDA     0,X
       SBC     5,X
       BCS     DIVERR
       CLR     SRCTMP
       INC     SRCTMP
DIV3210  TST   5,X
       BMI     DIV3212
       ASL      9,X           ; SHIFT UP DIVSR TO LINE UP WITH DIVDN
       ROL     8,X
       ROL     7,X
       ROL     6,X
       ROL     5,X
       INC     SRCTMP
        BRA     DIV3210
DIVLNE  LSR    5,X
       ROR     6,X
       ROR     7,X
       ROR     8,X
       ROR     9,X
DIV3212  LDA     4,X           ;  DIVDEND-DIVSR
       SUB     9,X
       STA     4,X
       LDA     3,X
       SBC     8,X
       STA     3,X
       LDA     2,X
       SBC     7,X
       STA     2,X
       LDA     1,X
       SBC     6,X
       STA     1,X
       LDA     0,X
       SBC     5,X
       STA     0,X
       BCS     DIV3220        ; DIDNT FIT
       SEC                    ;ADD BIT TO RESULT
DIV3240  ROL   14T,X
       ROL     13T,X
       ROL     12T,X
       ROL     11T,X
       ROL     10T,X
       DEC     SRCTMP       ; ALL DONE?
       BNE     DIVLNE         ; DIVIDE NEXT BIT
       CLC                    ; DONE OK
       RTS
DIV3220
       LDA     4,X         ;RESTORE DIVEND-DIVSR
       ADD     9,X
       STA     4,X
       LDA     3,X
       ADC     8,X
       STA     3,X
       LDA     2,X
       ADC     7,X
       STA     2,X
       LDA     1,X
       ADC     6,X
       STA     1,X
       LDA     0,X
       ADC     5,X
       STA     0,X
       CLC
       BRA     DIV3240
*******************************
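The routine above differs from a fixed-count loop in that it first shifts the divisor up until its top bit is set, counts those shifts, and then runs one subtract-or-restore pass per count. The C sketch below models that flow with 32-bit values instead of the 5-byte buffers. The function name `bigdiv` and its return convention are assumptions made for illustration; like the assembly, it reports an error for a zero divisor and for a dividend smaller than the divisor.

```c
#include <stdint.h>

/* Normalize-then-restore division. Returns 0 on success, -1 on error
   (divisor is zero, or dividend < divisor, mirroring DIVERR). */
static int bigdiv(uint32_t dividend, uint32_t divisor, uint32_t *quot)
{
    unsigned steps = 1;                /* plays the role of SRCTMP */
    uint32_t q = 0;

    if (divisor == 0 || dividend < divisor) {
        return -1;
    }
    while (!(divisor & 0x80000000u)) { /* line the divisor up at the top bit */
        divisor <<= 1;
        steps++;
    }
    for (;;) {
        q <<= 1;
        if (dividend >= divisor) {     /* subtract fits: quotient bit is 1 */
            dividend -= divisor;
            q |= 1;
        }                              /* otherwise the bit stays 0 (restore case) */
        if (--steps == 0) {
            break;
        }
        divisor >>= 1;                 /* next lower bit position */
    }
    *quot = q;                         /* dividend now holds the remainder */
    return 0;
}
```

In this model the pass count is one more than the number of leading zero bits in the divisor, so a divisor that fits in 16 bits still needs at least 17 passes within a 32-bit field.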

 
