Having expanded my testing to include placing the data segment in OCRAM (it was previously in DTCM), I was surprised to find that turning the processor's data cache off slowed execution. I, and presumably Mark, had assumed that because OCRAM is part of FlexRAM it would communicate directly with the CPU, bypassing the caches, but this is not the case.
Figure 1 in AN12042, "Using the i.MXRT L1 Cache", shows a purple line from OCRAM to SIM_M7 (bus fabric interconnect), which I assume is the path data takes between OCRAM and the CPU. I had previously assumed the two blue arrows to ITCM/DTCM also applied to OCRAM, but this is clearly not the case.
With the data cache turned off, OCRAM is around 65% faster than SDRAM in one benchmark test and a massive 257% faster in another. Similarly, DTCM is 19% and 279% faster than OCRAM in the same two tests respectively. With the data cache enabled, the performance of OCRAM and DTCM is almost identical.
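To put those cache-off percentages on a common scale, here is a minimal sketch that converts them to speed ratios and chains them to estimate DTCM against SDRAM. It assumes "N% faster" means a throughput ratio of 1 + N/100 and that ratios measured on the same test can simply be multiplied. The function names are made up for illustration, and the chained figures are derived, not measured.

```c
#include <stdio.h>

/* Assumption: "N% faster" is read as a throughput ratio of 1 + N/100,
 * so 65% faster means 1.65 times the speed. */
double speed_ratio(double percent_faster)
{
    return 1.0 + percent_faster / 100.0;
}

/* Assumption: two ratios measured on the same test can be multiplied,
 * e.g. (OCRAM vs SDRAM) * (DTCM vs OCRAM) estimates DTCM vs SDRAM. */
double chained_ratio(double first_percent, double second_percent)
{
    return speed_ratio(first_percent) * speed_ratio(second_percent);
}

/* Print the cache-off figures quoted above as ratios. */
void report_ratios(void)
{
    printf("Test 1: OCRAM/SDRAM %.2fx, DTCM/OCRAM %.2fx, DTCM/SDRAM ~%.2fx\n",
           speed_ratio(65.0), speed_ratio(19.0), chained_ratio(65.0, 19.0));
    printf("Test 2: OCRAM/SDRAM %.2fx, DTCM/OCRAM %.2fx, DTCM/SDRAM ~%.2fx\n",
           speed_ratio(257.0), speed_ratio(279.0), chained_ratio(257.0, 279.0));
}
```

Calling `report_ratios()` from `main` prints both tests. If the percentages were instead meant as reductions in run time, the conversion would be different, so treat the chained numbers as a rough guide only.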
OCRAM being slower than DTCM is even more reason to prefer DTCM over OCRAM. However, AN12077, "Using the i.MX RT FlexRAM", mentions that OCRAM cannot be sized to 0 kB due to boot ROM code requirements (64 kB is the minimum), so it might make sense to use it post-boot for application data of some sort.