Thank you for clearing things up.
Each of these port-related instructions takes 3 CPU cycles...
// ldr R3, =0400FF0C0h
str R4, [R3, #PDDR] ; 3 Dir In
str R2, [R3, #PDOR] ; 3 Data Out
ldr R2, [R3, #PDIR] ; 3 Data In
str R5, [R3, #PDDR] ; 3 Dir Out
I didn't know about IOPORT. There is only a short paragraph in the family reference manual (39.5.3), without any examples. So I only need to replace (for example) 400F_F0C0h with F800_00C0h? And will the str instruction then take fewer than 3 cycles?
Edit: OK, I found it in the RM (39.4 FGPIO memory map and register definition).
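
To make the address arithmetic behind the listing explicit, here is a minimal sketch in C. It assumes the usual Kinetis GPIO register offsets (PDOR 0x00, PDIR 0x10, PDDR 0x14), which should be checked against the RM for the actual part. `reg_addr` is a hypothetical helper, not something from the vendor headers, and the port base is left as a parameter because the FGPIO base has to be read from the memory map in 39.4.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets inside one GPIO port block.
   Assumed from the usual Kinetis GPIOx layout; verify in the RM. */
#define PDOR 0x00u  /* Port Data Output Register    */
#define PDIR 0x10u  /* Port Data Input Register     */
#define PDDR 0x14u  /* Port Data Direction Register */

/* Port base loaded into R3 in the listing above (400F_F0C0h). */
#define GPIO_PORT_BASE 0x400FF0C0u

/* Hypothetical helper: absolute address of one register,
   given a port base and a register offset (both word-aligned). */
static uint32_t reg_addr(uint32_t port_base, uint32_t offset)
{
    assert((port_base & 3u) == 0u);
    assert((offset & 3u) == 0u);
    return port_base + offset;
}
```

With the base from the listing, `reg_addr(GPIO_PORT_BASE, PDDR)` gives 400F_F0D4h. If the FGPIO block uses the same register layout (an assumption to confirm in 39.4), only the base value in R3 changes and the `#PDDR`, `#PDOR` and `#PDIR` offsets in the str/ldr instructions stay as they are.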