I need to write a subroutine that multiplies an 8-bit packed BCD value (2 digits) by a 20-bit packed BCD value (5 digits). I'm not sure which approach gives the best trade-off between code size and speed.
On the one hand, you could use a digit multiplication table of 81 to 100 bytes (9×9 or 10×10 entries), but that takes up a lot of space in ROM.
Is there a fast and short algorithm for the HCS08?
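
For reference, the table-based approach mentioned above could look roughly like the C sketch below. It is only an illustration of the idea, not HCS08 code: the names `DIGIT_PRODUCT` and `bcd_mul_2x5` are made up for this example, and it assumes both inputs contain only valid BCD digits (0 to 9 in every nibble). Each table entry holds the product of two digits as packed BCD, so the whole multiplication needs only additions, nibble shifts, and a decimal carry pass.

```c
#include <stdint.h>

/* Hypothetical 100-byte table: DIGIT_PRODUCT[a][b] is a*b as packed BCD,
   e.g. 7*8 = 56 is stored as 0x56. */
static const uint8_t DIGIT_PRODUCT[10][10] = {
    {0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00},
    {0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09},
    {0x00,0x02,0x04,0x06,0x08,0x10,0x12,0x14,0x16,0x18},
    {0x00,0x03,0x06,0x09,0x12,0x15,0x18,0x21,0x24,0x27},
    {0x00,0x04,0x08,0x12,0x16,0x20,0x24,0x28,0x32,0x36},
    {0x00,0x05,0x10,0x15,0x20,0x25,0x30,0x35,0x40,0x45},
    {0x00,0x06,0x12,0x18,0x24,0x30,0x36,0x42,0x48,0x54},
    {0x00,0x07,0x14,0x21,0x28,0x35,0x42,0x49,0x56,0x63},
    {0x00,0x08,0x16,0x24,0x32,0x40,0x48,0x56,0x64,0x72},
    {0x00,0x09,0x18,0x27,0x36,0x45,0x54,0x63,0x72,0x81}
};

/* Multiply a 2-digit packed BCD value by a 5-digit packed BCD value.
   Returns the product as packed BCD (up to 7 digits, 28 bits).
   Assumes every nibble of a and b is a valid digit 0..9. */
uint32_t bcd_mul_2x5(uint8_t a, uint32_t b)
{
    uint8_t acc[7] = {0};   /* per-digit sums, not yet reduced to 0..9 */
    uint32_t result = 0;
    uint8_t carry = 0;
    int i, j, k;

    for (i = 0; i < 2; i++) {
        uint8_t da = (uint8_t)((a >> (4 * i)) & 0x0F);
        for (j = 0; j < 5; j++) {
            uint8_t db = (uint8_t)((b >> (4 * j)) & 0x0F);
            uint8_t p = DIGIT_PRODUCT[da][db];
            acc[i + j]     += p & 0x0F;  /* units digit of the partial product */
            acc[i + j + 1] += p >> 4;    /* tens digit goes one place higher */
        }
    }

    /* Decimal carry propagation, done by repeated subtraction so that
       no binary division is needed. */
    for (k = 0; k < 7; k++) {
        uint8_t v = (uint8_t)(acc[k] + carry);
        carry = 0;
        while (v >= 10) {
            v -= 10;
            carry++;
        }
        result |= (uint32_t)v << (4 * k);
    }
    return result;
}
```

For example, `bcd_mul_2x5(0x12, 0x00034)` represents 12 × 34 and yields `0x408`. The sketch spends 100 bytes of ROM on the table, which is exactly the cost questioned above.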

