I haven't tried this on iMX8; what follows comes from my experience with parallel flash on the older QuadSPI module:
Were you able to adjust at least serialClkFreq in the FlexSPI FCFB struct in QSPI flash? The idea is that the FCFB is read at the default low clock, then the ROM switches to the clearly higher clock specified in serialClkFreq and continues reading the IVT and the rest of U-Boot at that higher clock. At the very least, you should start by verifying that your changes to the FCFB are actually being applied.
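One way to sanity-check what actually landed in flash is to dump the start of the QSPI and look at the FCFB there. The following is only a rough Python sketch: the "FCFB" ASCII tag at the start of the block and the 0x46 byte offset of serialClkFreq are assumptions based on the config block layout as I know it from other FlexSPI parts, and the position of the FCFB inside the flash (fcfb_offset) is also an assumption. Check all of them against the reference manual for your chip.

```python
# Sketch: read serialClkFreq out of a raw flash dump.
# ASSUMPTIONS (verify against your reference manual):
#   - the config block starts with the ASCII tag "FCFB"
#   - serialClkFreq is a single byte at offset 0x46 inside the block
#   - fcfb_offset is where the block sits in the dump (hypothetical default 0)

FCFB_TAG = b"FCFB"
SERIAL_CLK_FREQ_OFFSET = 0x46


def read_serial_clk_freq(image: bytes, fcfb_offset: int = 0) -> int:
    """Return the raw serialClkFreq byte, or raise if the tag is missing."""
    tag = image[fcfb_offset:fcfb_offset + 4]
    if tag != FCFB_TAG:
        raise ValueError(f"no FCFB tag at offset {fcfb_offset:#x}: {tag!r}")
    field = fcfb_offset + SERIAL_CLK_FREQ_OFFSET
    if field >= len(image):
        raise ValueError("dump is too short to contain serialClkFreq")
    return image[field]
```

The returned value is the raw enum byte, not a frequency in MHz; how it maps to an actual clock is chip specific. If the value in the dump matches what you wrote but the clock on the scope doesn't change, then the ROM is not applying your FCFB.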
While you are working on the FCFB, try configuring it for parallel mode. Don't forget the special parallel read command in the LUT, which must be provided via the FCFB as well. Once the parallel FCFB is done, you need to program the IVT and the rest of the U-Boot image in parallel mode, starting from the right parallel-mode flash offset. The FCFB itself has to be programmed in single-chip mode to flash A only (or you can also program a copy of it in single-chip mode to flash B, which won't be read by the ROM anyway). Parallel mode, as you may know, interleaves data across A and B, but the FCFB should be readable from A in non-parallel mode, so there must be no data interleaving for the FCFB.
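To illustrate why the FCFB must not be interleaved, here is a small Python sketch of what parallel mode does to an image. It is only a model: the byte-wise granularity (the unit parameter) is an assumption, the real controller may interleave at nibble or bit level, and the rule that each chip sees half of the parallel-mode offset is likewise an assumption to confirm in the reference manual. All function names are made up for this example.

```python
# Sketch: model of parallel-mode interleaving across flash A and flash B.
# ASSUMPTION: data alternates A, B, A, B in `unit`-byte chunks; real
# hardware may use a different granularity.
from typing import Tuple


def split_parallel(data: bytes, unit: int = 1) -> Tuple[bytes, bytes]:
    """Split an image into the streams that end up in flash A and flash B."""
    if len(data) % (2 * unit) != 0:
        raise ValueError("image length must be a multiple of 2 * unit")
    a, b = bytearray(), bytearray()
    for i in range(0, len(data), 2 * unit):
        a += data[i:i + unit]
        b += data[i + unit:i + 2 * unit]
    return bytes(a), bytes(b)


def merge_parallel(a: bytes, b: bytes, unit: int = 1) -> bytes:
    """Rebuild what a parallel-mode read would return from both chips."""
    if len(a) != len(b) or len(a) % unit != 0:
        raise ValueError("chip streams must have equal, unit-aligned lengths")
    out = bytearray()
    for i in range(0, len(a), unit):
        out += a[i:i + unit] + b[i:i + unit]
    return bytes(out)


def chip_offset(parallel_offset: int) -> int:
    """ASSUMPTION: each chip is addressed at half the parallel-mode offset."""
    return parallel_offset // 2
```

Under this model, a single-mode read of flash A returns only the A half of anything that was programmed in parallel mode, which is why an interleaved FCFB would look like garbage to the ROM. It also shows why the parallel-mode start offset of the IVT matters: it has to map to a per-chip address that is past the FCFB stored on flash A.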
For debugging, JTAG access is highly recommended, so you can verify that the right IVT etc. data is readable in the QSPI address space in parallel mode. Though I don't know how it will look in practice on iMX8. I hope FlexSPI isn't reset on a faulty boot, so that it is still possible to read the QSPI AHB space with the settings applied from the FCFB. You can first check what this looks like in single FlexSPI mode.
Perhaps the IVT is read along with the FCFB in single mode; this is not very clear from the iMX8 RM. At least in the past, only the FCFB was read in single mode, and the rest of the image was read in parallel mode.
Edward