Just as a check, I did some timings for CCITT16.
Using an 8 MHz bus clock (16 MHz CPU clock) on a QG8, I got about 16.5 ms per Kbyte
(not running in the debugger).
I don't know how much you are checking or your clock speed, but the time should scale linearly with the number of bytes checked and inversely with the clock.
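As a rough way to scale that number to your own setup, here is a minimal sketch. The function name and the macros are made up for illustration, and it assumes the CRC time is purely proportional to the byte count and inversely proportional to the bus clock, which ignores any fixed call overhead.

```c
/* Figures measured above: about 16.5 ms per Kbyte at an 8 MHz bus clock (QG8). */
#define MEASURED_US_PER_KB  16500UL  /* 16.5 ms expressed in microseconds */
#define MEASURED_BUS_MHZ    8UL

/* Hypothetical helper: estimate the CRC time in microseconds for `kbytes`
 * of data at `bus_mhz`. Assumes linear scaling with size and inverse
 * scaling with the bus clock. Returns 0 if bus_mhz is 0. */
unsigned long crc_time_estimate_us(unsigned long kbytes, unsigned long bus_mhz)
{
    if (bus_mhz == 0UL) {
        return 0UL;  /* avoid dividing by zero */
    }
    return (MEASURED_US_PER_KB * kbytes * MEASURED_BUS_MHZ) / bus_mhz;
}
```

For example, crc_time_estimate_us(8, 8) gives 132000, so checking 8 Kbytes at an 8 MHz bus clock should take roughly 132 ms.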
I also discovered that if you don't set _CHECKSUM_CRC_8 to 0, it will include that one as well by default, so you might check into that. It will end up doing both.
You also do not need to copy checksum.c or checksum.h to your project directory; you can just add them from the src directory. The advantage is that if they ever get bug fixes, you will get the fixes. The disadvantage is that you may not want the bug fixes.
If you choose to leave them there in the src folder, you will need to add this in the compiler command line arguments dialog:
-D_CHECKSUM_CRC_CCITT=1 -D_CHECKSUM_CRC_8=0
Also, if you set the options on the command line like this, wherever the files live, you will not have to change any #defines in your code.
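One common pattern that would give this "on unless you turn it off" behavior is a guarded default. This is only an illustration of the general technique and is not taken from checksum.h; the default value of 1 and the helper function crc8_is_enabled are assumptions.

```c
/* Illustration only, NOT the real checksum.h: a guarded default.
 * If the compiler is not given -D_CHECKSUM_CRC_8=..., the option
 * falls back to 1 (enabled). Passing -D_CHECKSUM_CRC_8=0 on the
 * command line defines the macro first, so the default is skipped. */
#ifndef _CHECKSUM_CRC_8
#define _CHECKSUM_CRC_8 1   /* assumed default: CRC8 included */
#endif

/* Hypothetical helper so the effective setting can be inspected. */
int crc8_is_enabled(void)
{
#if _CHECKSUM_CRC_8
    return 1;   /* CRC8 code would be compiled in */
#else
    return 0;   /* CRC8 code would be left out */
#endif
}
```

Built with no -D option, crc8_is_enabled() returns 1; built with -D_CHECKSUM_CRC_8=0, it returns 0, with no edit to the source file.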
As you mentioned, it will only include generated code, not filled areas, even though it seems you told it to.
If the filled areas are important for verifying flash integrity, you could just loop on a compare against the fill value for those areas, but if you later put code or data in them, the compare code would have to be adjusted.
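A minimal sketch of that compare loop is below. The function name and parameters are made up for illustration, and the start address, length, and fill byte of each area are assumptions you would take from your own linker setup.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte in [start, start + len)
 * equals `fill`, otherwise 0. An empty region (len == 0) counts as filled. */
int region_is_filled(const volatile unsigned char *start, size_t len,
                     unsigned char fill)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (start[i] != fill) {
            return 0;   /* found a byte that differs from the fill value */
        }
    }
    return 1;
}
```

You would call it once per filled area, for example with a fill byte of 0xFF if that is what the linker was told to use. If an area is later used for code or data, drop or shrink the corresponding call.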