I'm trying to figure out whether I can use an i.MX1176 to solve a tight-timing problem in software when reading a bus, rather than resorting to an FPGA. Here's the bus timing:

It's a 6502, so there are also the voltage-level-translation delays to worry about, but the main problem is the bottom trace (/MPD). I have to read the address lines, look up A[15:8] in a 256-entry table, and if the entry in that table is 1 rather than 0, assert /MPD; otherwise I leave it alone and let it keep floating high.
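To make the lookup concrete, here is a minimal sketch of the decision logic only, with no hardware access. The names mpd_table, mpd_claim_pages and mpd_should_assert are hypothetical and made up for illustration; how the address is actually sampled from the pins, and how /MPD is driven, is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* One flag per 256-byte page: nonzero means "assert /MPD for this page".
 * (Hypothetical name; the real table contents depend on the memory map.) */
static uint8_t mpd_table[256];

/* Hypothetical helper: mark pages first..last (inclusive) as claimed. */
static void mpd_claim_pages(uint8_t first, uint8_t last)
{
    for (unsigned page = first; page <= last; ++page) {
        mpd_table[page] = 1;
    }
}

/* Decide whether /MPD should be driven low for a sampled 16-bit address.
 * Only A[15:8] takes part in the decision. */
static bool mpd_should_assert(uint16_t address)
{
    return mpd_table[address >> 8] != 0;
}
```

After something like mpd_claim_pages(0xC0, 0xCF), any address from 0xC000 to 0xCFFF returns true and everything else returns false. The lookup itself is one shift and one byte load, so the budget is likely to be dominated by the GPIO read and write on either side of it rather than by the table.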
The timing budget for this is ~45ns, given that A becoming valid at 177ns into the clock cycle probably translates to about 180ns after the clock's falling edge.
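As a rough sanity check, the budget can be turned into core clock cycles. This is only a sketch: cycles_in_window is a hypothetical helper, and the clock figures used below (400 MHz for the M4, 1 GHz for the M7) are assumptions based on the nominal maximum core clocks, not what my board is necessarily configured for.

```c
#include <stdint.h>

/* Whole core clock cycles that fit in a time window (rounds down).
 * window_ns: length of the window in nanoseconds.
 * core_mhz:  assumed core clock in MHz (an assumption, not a measured value). */
static uint32_t cycles_in_window(uint32_t window_ns, uint32_t core_mhz)
{
    return (uint32_t)(((uint64_t)window_ns * core_mhz) / 1000u);
}
```

Under those assumptions, cycles_in_window(45, 400) gives 18 cycles on the M4 and cycles_in_window(45, 1000) gives 45 cycles on the M7. That has to cover the GPIO read, the table lookup and the GPIO write, including any bus wait states on the peripheral accesses.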
My plan was to dedicate the M4 to polling the bus and handling the bus signals; the M4 then writes to shared memory, signals a semaphore, and lets the M7 actually process stuff. That is predicated on the delay to assert a GPIO being sufficiently short, but looking at this gave me some hope that it would be possible - 10ns to toggle a GPIO is awesome.
Sadly, I'm not seeing that on my board. I've set the build to 'Release', set GPIO9 (io 6) as the GPIO being toggled, and set the slew rate to fast...

and I'm calling:
PRINTF("Toggling.\r\n");
GPIO_PortToggle(GPIO9, 1<<6);
GPIO_PortToggle(GPIO9, 1<<6);
From looking at the library code, it seems to be doing the correct thing, assuming the inline is honored for a release build (in fact I checked this by replacing the function calls with the direct register operations)...

But I'm only getting a 68ns period, not the 10ns period shown in the linked article...

I'm totally new to the i.MX series and I'm presumably doing something stupid, so if anyone has words of wisdom, I'm all ears...
One thing I'll say is that I can't get the 'play' button at the top of the window to actually run the code on the device - it seems to compile and then hang. I'm actually running this using the 'Debug' operation (yes, on a release build) from the Quickstart panel, so possibly that adds some overhead?