GPIO Speed part 2 - Cortex M0

Hi guys,

Very interesting discussion about GPIO toggling speed. I am wondering whether I can use some of your advice for the M0 (LPC111x family), too. I am currently struggling to speed up GPIO switching on an LPC1114, but the maximum rate I can achieve is really slow, especially when turning an input into a defined output (yes, RTFM was done, and optimizations are enabled [-O3] :D).
Therefore I would like to ask the experts: are there any particular tips for the GPIOs on the LPC111x parts?
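
To make the input-to-output case concrete, here is a minimal sketch of the kind of sequence I mean. It is not my actual code: the helper name gpio_input_to_output is made up for illustration, and the register layout (masked DATA window at offsets 0x0000 to 0x3FFC, DIR at offset 0x8000, GPIO0 at 0x50000000) is my reading of the LPC111x user manual, so please check it against your copy. It also assumes that a write to DATA while the pin is still an input is latched and appears on the pin once DIR is set.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed LPC111x GPIO layout (verify against the user manual):
 *   base + 0x0000..0x3FFC : masked DATA window, address bits [13:2] = pin mask
 *   base + 0x8000         : DIR register (1 = output)                      */
#define GPIO_DIR_WORD (0x8000u / 4u)

/* Hypothetical helper: preload the level, then switch the pin to output. */
static inline void gpio_input_to_output(volatile uint32_t *port,
                                        unsigned pin, int level)
{
    assert(pin < 12u);                 /* each LPC111x port has at most 12 pins */
    uint32_t bit = 1u << pin;

    /* Masked write: word index 'bit' is byte offset (bit << 2), so only this
     * pin is affected and no read-modify-write is needed on DATA.
     * Assumption: the value is latched while the pin is still an input.      */
    port[bit] = level ? bit : 0u;

    /* Read-modify-write on DIR turns the pin into an output. */
    port[GPIO_DIR_WORD] |= bit;
}
```

On the target, port would be something like (volatile uint32_t *)0x50000000u for GPIO0 (again, an assumed address). Passing the base as a pointer also lets the sequence be exercised on a PC against a plain array.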

How about the new LPC12xx family? I have just dived into the user manual, and it seems that GPIO switching is much improved. Does anyone have any experience with it?