I'm trying to use a DEMOS12XDT512 board (80-pin QFP S12X CPU with an XGATE coprocessor) to drive a Sony TFT LCD. The LCD takes a clock input and 9 bits of data per pixel (rrrgggbbb) on every clock tick; there are other inputs, but those aren't pertinent at the moment. The suggested clock frequency is 4 MHz, but according to the timing diagrams anything above 1.6 MHz should be enough for 30+ frames per second.
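For reference, here is a minimal sketch of the pixel format as I understand it. The function name `pack_rgb333` is made up for illustration, and the bit order (red in the top three bits, blue in the bottom three) is an assumption read off the rrrgggbbb notation, not something confirmed by the LCD datasheet.

```c
#include <stdint.h>

/* Pack 3-bit red, green, and blue components into one 9-bit pixel.
   Assumed layout (from the "rrrgggbbb" notation): red in bits 8..6,
   green in bits 5..3, blue in bits 2..0. Inputs are masked to 3 bits. */
static uint16_t pack_rgb333(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((uint16_t)(r & 0x7u) << 6) |
                      ((uint16_t)(g & 0x7u) << 3) |
                       (uint16_t)(b & 0x7u));
}
```

Since a pixel is 9 bits wide, it does not fit in a single 8-bit port, so each pixel has to be split across two port writes (or one 16-bit write, if the two ports are adjacent in the register map).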
My plan was to use a Periodic Interrupt Timer, with the service routine running in parallel on the XGATE, to tick the clock and load data from a frame buffer into the GPIO pins (I'm using ports A and B). The board has a 4 MHz oscillator, and I can crank the PLL multiplier up to over 20x without losing stability, so I figured it should be no problem to write the pins at ~4 MHz. But after testing my plan, it looks like writes to the GPIO pins take far too long. Toggling the clock pin and writing the nine pixel pins in the ISR yields a maximum frequency of less than 1 MHz, and that doesn't account for the other outputs and calculations the routine will need to handle. I've run a number of trials, and it's clear that the pin write time is the bottleneck.
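To put the timing in perspective, here is a rough sketch of the cycle budget per pixel. The helper name `cycles_per_pixel` is hypothetical, and the bus frequencies used with it are assumptions: I'm not certain how the PLL multiplier maps onto the actual bus clock, so the real budget may be smaller than the best case.

```c
#include <stdint.h>

/* Number of whole bus cycles available in one pixel clock period.
   Integer division rounds down; returns 0 if pixel_hz is 0. */
static uint32_t cycles_per_pixel(uint32_t bus_hz, uint32_t pixel_hz)
{
    if (pixel_hz == 0u) {
        return 0u;
    }
    return bus_hz / pixel_hz;
}
```

If the bus really ran at 80 MHz (4 MHz x 20), `cycles_per_pixel(80000000, 4000000)` gives 20 cycles per pixel at the suggested 4 MHz, and 50 cycles at 1.6 MHz. If the bus clock is half the PLL clock, those budgets halve. Either way, interrupt entry and exit, the frame buffer read, and the port writes all have to fit in that window.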
Is there a way to speed up these pins? Or alternatively, are there any other pins I can use for output (available in the 80-pin package) that will run faster?