Hi Bernhard,
Thanks for your answer, and sorry for my late reply.
I got this to work as I wanted. My first implementation used a kind of trampoline outside of the memory I wanted to replace; I kept the M4 in a tight loop there while replacing the code, and then let it jump out of the loop when ready. That worked.
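For anyone finding this later, the trampoline idea looks roughly like the sketch below. This is not my actual code: all names here (g_image_ready, trampoline_wait_and_jump, demo_entry) are made up for illustration, and it assumes that something else (another core, a debugger, an ISR that lives in safe memory) sets the flag once the new code is in place.

```c
#include <stdint.h>

/* Hypothetical names throughout; only the idea comes from the post. */
typedef void (*entry_fn_t)(void);

/* Set to nonzero by whoever rewrites the code region, once it is done. */
static volatile uint32_t g_image_ready = 0;

/* Stand-in for the entry point of the freshly written code. */
static int g_entered = 0;
static void demo_entry(void) { g_entered++; }

/*
 * On the real target this function must be linked into memory that is
 * NOT being replaced, so the core never fetches from the region under
 * construction while it spins here.
 */
static void trampoline_wait_and_jump(entry_fn_t new_entry)
{
    while (g_image_ready == 0) {
        /* tight loop, executing only from the safe region */
    }
    new_entry(); /* jump out of the loop into the new code */
}
```

The flag is volatile so the compiler re-reads it on every pass through the loop instead of hoisting the load.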
Later I tried to simplify things and found that issuing an M4 core reset through RESET_CTRL just works:
LPC_RGU->RESET_CTRL[RGU_M3_RST >> 5]
does *not* remap the boot ROM to address 0 and does the job for me.
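A minimal sketch of what such a write could look like, with assumptions stated up front: I use 13 as the value of RGU_M3_RST (check your own header and the RGU reset output table in the user manual), I assume the register is written with a single 1 bit to trigger the reset, and rgu_regs_t is a stand-in struct rather than the vendor's LPC_RGU type, so the snippet compiles on a PC as well.

```c
#include <stdint.h>

/* Assumption: reset output number of the M4 core; verify in your header. */
#define RGU_M3_RST 13u

/* Stand-in for the RGU register block: only the RESET_CTRL words. */
typedef struct {
    volatile uint32_t RESET_CTRL[2];
} rgu_regs_t;

/*
 * Select the 32-bit RESET_CTRL word with (reset_num >> 5) and the bit
 * inside that word with (reset_num & 31), then write that single bit.
 */
static void rgu_trigger_reset(rgu_regs_t *rgu, uint32_t reset_num)
{
    rgu->RESET_CTRL[reset_num >> 5] = 1u << (reset_num & 31u);
}
```

On the target you would apply the same index and bit arithmetic to the real LPC_RGU registers from the vendor header instead of the stand-in struct.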
(Funny, though, that the header files still call it 'M3_RST' instead of 'M4_RST'.)