Hi Jack,
I have one product that was recently converted from the GT16 to the GT16A. It was a totally painless experience, as there is really very little difference. Apart from accidentally trying to use the algorithm for the 16 on a 16A, that is! Some time ago I also accidentally programmed the code into a 16 using the algorithm for the 16A, and it worked perfectly!
This led me to investigate the real difference between the algorithms. As far as I can tell from a preliminary investigation, the only difference is that the 16 algorithm has a bug that does not actually cause any harm on the 16, but that had to be fixed when the 16A came along.
When the special blank check (the one that ignores the security byte being set to unsecured) is done with the 16 algorithm, it actually runs the check over 32k of flash. On the 16A, the illegal address reset probably catches this out. The only difference between the algorithms is that the newer 16A version has the correct start address for 16k of flash.
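To show what I mean, here is a rough sketch in C of a blank check that skips one byte. It is not the actual algorithm file. The function name, the memory image model and the addresses in the usage note are all my own assumptions for illustration, not values taken from the data sheet or from the algorithm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ERASED_BYTE 0xFFu  /* erased flash reads back as 0xFF */

/*
 * Hypothetical model: image[addr] is the byte read back at address addr.
 * Returns true if every byte from start to end (inclusive) is erased,
 * not counting the single byte at skip_addr (standing in for the
 * security byte that the special blank check ignores).
 */
bool is_blank(const uint8_t *image, uint32_t start, uint32_t end,
              uint32_t skip_addr)
{
    assert(start <= end);

    for (uint32_t addr = start; addr <= end; ++addr) {
        if (addr == skip_addr) {
            continue;               /* ignore the security byte */
        }
        if (image[addr] != ERASED_BYTE) {
            return false;           /* found a programmed byte */
        }
    }
    return true;
}
```

With an assumed top of flash at 0xFFFF, a call such as is_blank(image, 0xC000, 0xFFFF, skip) covers 16k, while is_blank(image, 0x8000, 0xFFFF, skip) covers 32k. The only thing that changes between the two is the start address, which is the kind of difference I am describing between the two algorithms.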
I am also developing some new products from scratch using the GT16A, so I have been using the USB Multilink for debugging more than usual lately, and I have not had any problems.
It works very well for me, regardless of which device I connect it to.