Jorge
If you are controlling displays via GPIO, the speed is probably not of much consequence. Often the accesses have to be slowed down (e.g. for character LCDs), otherwise the ports are too fast for the display, especially when using the faster devices.
Generally I would use FGPIO with the KL series (just set the GPIO block address to suit and the rest is the same).
There doesn't look to be any advantage of bit banding GPIO writes since the SET, CLEAR and TOGGLE registers do this anyway (with more flexibility). Bit banding reads could save a couple of instruction cycles when the state of one single bit is to be decided but generally the saving is unlikely to be critical.
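As a rough sketch of the two points above (not taken from any particular vendor header): the port is accessed through a base pointer, so switching between GPIO and FGPIO is only a change of base address, and the SET, CLEAR and TOGGLE registers change individual pins without a read-modify-write. The names gpio_port_t, port_set() etc. are made up for illustration, and the FGPIO base shown is an assumption for a KL25-style part that should be checked against the reference manual of the device in use.

```c
#include <stdint.h>

/* Register layout of one Kinetis GPIO port (offsets 0x00 to 0x14). */
typedef struct {
    volatile uint32_t PDOR;  /* port data output */
    volatile uint32_t PSOR;  /* set output: writing 1 sets the pin */
    volatile uint32_t PCOR;  /* clear output: writing 1 clears the pin */
    volatile uint32_t PTOR;  /* toggle output: writing 1 toggles the pin */
    volatile uint32_t PDIR;  /* port data input */
    volatile uint32_t PDDR;  /* port data direction */
} gpio_port_t;

/* Base addresses: assumptions for illustration, check your device. */
#define GPIOA_BASE_ADDR   0x400FF000u  /* normal GPIO port A */
#define FGPIOA_BASE_ADDR  0xF80FF000u  /* fast GPIO port A (KL25-style) */

/* On the target the port would be selected like this:
 *   gpio_port_t *port = (gpio_port_t *)FGPIOA_BASE_ADDR;
 */

/* Each write affects only the pins whose mask bit is 1. */
static inline void port_set(gpio_port_t *port, uint32_t mask)    { port->PSOR = mask; }
static inline void port_clear(gpio_port_t *port, uint32_t mask)  { port->PCOR = mask; }
static inline void port_toggle(gpio_port_t *port, uint32_t mask) { port->PTOR = mask; }

/* Test a single input pin by shift and mask (the case where a
 * bit-band read could save a couple of cycles). */
static inline int port_read_bit(const gpio_port_t *port, unsigned bit)
{
    return (int)((port->PDIR >> bit) & 1u);
}
```

Since several pins can be set, cleared or toggled by one write with a multi-bit mask, this is more flexible than a bit-band write, which only ever touches one bit.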
Bit banding variables can have restrictions since only half of the RAM (RAM_U) is in the bit-band region (RAM_L is not). This means that variables need to be located correctly to start with.
Again there is a slight performance improvement when testing a bit in a variable, plus the possibility of performing a read-modify-write operation on a bit in memory, but the saving is unlikely to be critical for anything more than very special cases.
In situations where it is advantageous to be able to manipulate bits in a variable (or efficiently test a bit in a variable or GPIO), it is probably best to calculate the alias address of the bit just once at run time and keep it as a pointer for further use, rather than getting the linker script involved: take the variable's address, calculate the corresponding alias address and then add the offset for the bit. GPIO has a fixed alias so can use a fixed address. There should be no tool chain dependencies or restrictions.
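The alias calculation could look something like the following minimal sketch, assuming the standard Cortex-M3/M4 bit-band mapping (1 MByte regions at 0x20000000 and 0x40000000, aliased at 0x22000000 and 0x42000000, with one 32-bit alias word per bit). The function name bitband_alias() and the macro names are invented for illustration. It returns 0 when the address is outside both regions, which is what would happen for a variable that ended up in RAM_L.

```c
#include <stdint.h>

/* Standard Cortex-M3/M4 bit-band mapping (assumed to apply to the part). */
#define SRAM_BB_BASE     0x20000000u  /* start of SRAM bit-band region */
#define SRAM_BB_ALIAS    0x22000000u  /* start of its alias region */
#define PERIPH_BB_BASE   0x40000000u  /* start of peripheral bit-band region */
#define PERIPH_BB_ALIAS  0x42000000u  /* start of its alias region */
#define BB_REGION_SIZE   0x00100000u  /* each bit-band region is 1 MByte */

/* Return the alias word address for 'bit' (0..31) of the object at 'addr',
 * or 0 if the address is not in a bit-band region or the bit is invalid. */
static uint32_t bitband_alias(uint32_t addr, unsigned bit)
{
    uint32_t base, alias;

    if (bit > 31u) {
        return 0u;
    }
    if (addr >= SRAM_BB_BASE && (addr - SRAM_BB_BASE) < BB_REGION_SIZE) {
        base = SRAM_BB_BASE;
        alias = SRAM_BB_ALIAS;
    } else if (addr >= PERIPH_BB_BASE && (addr - PERIPH_BB_BASE) < BB_REGION_SIZE) {
        base = PERIPH_BB_BASE;
        alias = PERIPH_BB_ALIAS;
    } else {
        return 0u;  /* e.g. a variable located below 0x20000000 */
    }
    /* Each byte of the region maps to 32 alias bytes, each bit to one word. */
    return alias + ((addr - base) * 32u) + ((uint32_t)bit * 4u);
}

/* On the target the pointer would be created once and then reused:
 *   volatile uint32_t *flag =
 *       (volatile uint32_t *)bitband_alias((uint32_t)&my_var, 3);
 *   if (flag != 0) { *flag = 1; }   // set bit 3 of my_var
 */
```

For example, bit 5 of the port A input register at 0x400FF010 would come out at alias address 0x43FE0214, which can simply be used as a fixed address.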
Regards
Mark