Joey
Good to hear that it was a simple mistake.
1000 keys/s is possible if you send just one (changed) keystroke per ms, rather than sending a keystroke, a key release and then the next keystroke (which I think is what the Arduino code does).
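As a rough illustration of that arithmetic, here is a minimal sketch. It assumes the host polls the keyboard's interrupt endpoint once per millisecond and accepts one HID report per poll; the helper name keys_per_second() is only for illustration and is not part of uTasker or Arduino.

```c
#include <stdio.h>

/* Hypothetical helper: how many keys per second fit through an endpoint
 * that is polled every poll_interval_us microseconds, when each key
 * consumes reports_per_key HID reports (1 = press only, 2 = press + release).
 */
static unsigned keys_per_second(unsigned poll_interval_us, unsigned reports_per_key)
{
    unsigned reports_per_second = 1000000u / poll_interval_us;
    return reports_per_second / reports_per_key;
}
```

With a 1 ms polling interval, keys_per_second(1000, 1) gives 1000 and keys_per_second(1000, 2) gives 500, so a release after every keystroke halves the achievable rate.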
The uTasker keyboard interface uses a SW queue so code can send pre-defined 'strings' (e.g. from a file defining such sequences) using
CHAR keySequence[] = "Send this key sequence to the PC\nNow send another one, etc. etc. etc.";
fnWrite(keyboard, (unsigned char *)keySequence, (sizeof(keySequence) - 1));
whereby the queue also handles keystroke code conversion and sends key releases only when two identical characters are sent in a row. It is interrupt-driven and non-blocking for the application, so, depending on your environment, it may or may not give additional general performance advantages.
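To make the "release only between identical characters" idea concrete, here is a minimal sketch of how a string could be turned into a sequence of 8-byte boot-protocol keyboard reports. This is not the uTasker implementation: build_reports() and ascii_to_usage() are hypothetical names, only letters, space and newline are converted, and the final release at the end of the string is my own assumption (added so the last key does not auto-repeat).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPORT_SIZE 8     /* modifiers, reserved, six key codes */
#define MOD_LSHIFT  0x02  /* left-shift bit in the modifier byte */

/* Hypothetical conversion: ASCII to HID keyboard usage code.
 * Returns 1 on success, 0 for characters this sketch does not handle. */
static int ascii_to_usage(char c, uint8_t *mod, uint8_t *usage)
{
    *mod = 0;
    if (c >= 'a' && c <= 'z') { *usage = (uint8_t)(0x04 + (c - 'a')); return 1; }
    if (c >= 'A' && c <= 'Z') { *mod = MOD_LSHIFT; *usage = (uint8_t)(0x04 + (c - 'A')); return 1; }
    if (c == ' ')  { *usage = 0x2C; return 1; }
    if (c == '\n') { *usage = 0x28; return 1; }
    return 0;
}

/* Fill reports[] with one report per character, inserting an all-zero
 * (release) report only when the same key code follows itself.
 * Returns the number of reports written (at most max_reports). */
static size_t build_reports(const char *text, size_t len,
                            uint8_t reports[][REPORT_SIZE], size_t max_reports)
{
    size_t n = 0;
    uint8_t prev_usage = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t mod, usage;
        if (!ascii_to_usage(text[i], &mod, &usage)) {
            continue;                       /* skip unsupported characters */
        }
        if (usage == prev_usage) {          /* same key again: release first */
            if (n >= max_reports) return n;
            memset(reports[n++], 0, REPORT_SIZE);
        }
        if (n >= max_reports) return n;
        memset(reports[n], 0, REPORT_SIZE);
        reports[n][0] = mod;                /* modifier byte */
        reports[n][2] = usage;              /* first key-code slot */
        n++;
        prev_usage = usage;
    }
    if (prev_usage != 0 && n < max_reports) {
        memset(reports[n++], 0, REPORT_SIZE);   /* assumed final release */
    }
    return n;
}
```

For "hello" this produces 7 reports (h, e, l, release, l, o, release) rather than the 10 needed when every keystroke is followed by its own release.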
The uTasker code may become more efficient once you need to do further development, since it allows the USB and all processor operations to be simulated (removing HW debugging requirements), includes various other USB classes that can be used immediately as composites (such as USB-MSD, which Arduino can't do yet), plus USB host operation (if your HW were to support it).
Regards
Mark
Kinetis: http://www.utasker.com/kinetis.html
KL27: http://www.utasker.com/kinetis/FRDM-KL27Z.html / http://www.utasker.com/kinetis/Capuccino-KL27/Capuccino-KL27.html
For the complete "out-of-the-box" Kinetis experience and faster time to market