I managed to fix this myself. There must have been some kind of bus contention going on. I used PFCRP0 to prioritize z1's port over z0's (ARB=0, PRI=0) and then enabled the line buffers (BFEN=1) in both PFCRP0 and PFCRP1. The speed improvement was astonishing: z1 ran 6x faster, even with z0 running. I couldn't find any documentation suggesting this might be necessary, not even in the dual-core example.
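For anyone who wants to try the same thing, here is a rough sketch of the register changes described above. The field names (ARB, PRI, BFEN) and register names are the ones I mentioned, but the bit positions in the masks below are placeholders I made up for illustration, and `configure_flash_ports` and `pfcr_set_field` are hypothetical helper names, so check the reference manual for your part before using any of this. The registers are passed in as pointers so the logic can be tried on a host PC first.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: these bit positions are placeholders, NOT taken from a
 * reference manual. Replace them with the real field masks for your device. */
#define PFCR_BFEN_MASK (1u << 0) /* line buffer enable (hypothetical position) */
#define PFCR_PRI_MASK  (1u << 1) /* port priority select (hypothetical position) */
#define PFCR_ARB_MASK  (1u << 2) /* arbitration mode (hypothetical position) */

/* Return reg with the bits in mask set (on != 0) or cleared (on == 0). */
static uint32_t pfcr_set_field(uint32_t reg, uint32_t mask, int on)
{
    return on ? (reg | mask) : (reg & ~mask);
}

/* Apply the settings from the post: ARB=0 and PRI=0 in PFCRP0,
 * and BFEN=1 in both PFCRP0 and PFCRP1. All other bits are preserved. */
void configure_flash_ports(volatile uint32_t *pfcrp0, volatile uint32_t *pfcrp1)
{
    assert(pfcrp0 != 0 && pfcrp1 != 0);

    uint32_t p0 = *pfcrp0;                         /* read-modify-write */
    p0 = pfcr_set_field(p0, PFCR_ARB_MASK, 0);     /* ARB = 0 */
    p0 = pfcr_set_field(p0, PFCR_PRI_MASK, 0);     /* PRI = 0 */
    p0 = pfcr_set_field(p0, PFCR_BFEN_MASK, 1);    /* BFEN = 1 */
    *pfcrp0 = p0;

    *pfcrp1 = pfcr_set_field(*pfcrp1, PFCR_BFEN_MASK, 1); /* BFEN = 1 */
}
```

On the target you would pass the addresses of the real PFCRP0 and PFCRP1 registers (from your device header) instead of ordinary variables. The read-modify-write keeps whatever the startup code already put in the other fields.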