In my application, I need to divide a 4-byte value by a 2-byte value, as follows:

dword     a;               /* a is an unsigned long, 4 bytes */

word      b;               /* b is an unsigned int, 2 bytes */

word      c;               /* c is an unsigned int, 2 bytes */


c = a / b;
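To pin down the arithmetic being asked for, here is a minimal bit-by-bit (shift-and-subtract) reference in C. It is only a sketch of the expected result, not a tuned routine: the name `div32by16_ref` is made up for illustration, it assumes `b` is nonzero, and it assumes the quotient is truncated to 16 bits in the same way the assignment to `c` truncates it.

```c
#include <stdint.h>

/* Hypothetical reference helper (name is illustrative only).
   Restoring shift-and-subtract division of a 32-bit dividend by a
   16-bit divisor. Assumes b != 0. The quotient is truncated to its
   low 16 bits, matching c = a / b when c is a 2-byte word. */
static uint16_t div32by16_ref(uint32_t a, uint16_t b)
{
    uint32_t rem = 0;   /* running remainder, stays below b after each step */
    uint32_t quo = 0;   /* quotient bits, collected MSB first */
    int i;

    for (i = 0; i < 32; i++) {
        rem = (rem << 1) | (a >> 31);   /* bring down the next dividend bit */
        a <<= 1;
        quo <<= 1;
        if (rem >= b) {                 /* divisor fits: subtract, set bit */
            rem -= b;
            quo |= 1u;
        }
    }
    return (uint16_t)quo;
}
```

An assembly version would need to produce the same values as this loop for every input; when the quotient does not fit in 16 bits, only its low 16 bits are kept here.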


If I write this in C, c = a / b takes more than 200 CPU cycles, which is too slow for my application, so I am considering implementing it in assembly language. My CPU is an HCS08, which has only one division instruction:

    DIV                             /* A <- (H:A) / (X); H <- remainder; 6 CPU cycles; 16-bit by 8-bit divide instruction */
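For clarity, the behaviour of DIV described in that comment can be modelled in C roughly as below. This is a sketch under assumptions: the function name `hcs08_div_model` and the pointer-based register passing are made up for illustration, and the model simply leaves the registers untouched in the error case, without claiming that the hardware does the same.

```c
#include <stdint.h>

/* Hypothetical C model of the HCS08 DIV instruction (names are mine).
   On entry, *h:*a holds the 16-bit dividend and x the 8-bit divisor.
   On success, *a receives the quotient and *h the remainder, and the
   function returns 0. It returns 1 when the divisor is zero or the
   quotient would not fit in 8 bits; in that case this model leaves
   *h and *a unchanged (an assumption of the model, not a hardware claim). */
static int hcs08_div_model(uint8_t *h, uint8_t *a, uint8_t x)
{
    uint16_t dividend = (uint16_t)(((uint16_t)*h << 8) | *a);

    /* The quotient exceeds 0xFF exactly when the high byte is >= divisor. */
    if (x == 0 || *h >= x)
        return 1;

    *a = (uint8_t)(dividend / x);   /* A <- (H:A) / X */
    *h = (uint8_t)(dividend % x);   /* H <- remainder */
    return 0;
}
```

The model makes the limitation visible: the divisor is only 8 bits wide and the quotient must fit in one byte, so a single DIV cannot divide by a 2-byte value directly.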


How can I implement a fast division of a 4-byte variable by a 2-byte variable in assembly? Thanks.