Follow-up to this problem:
I contacted Freescale tech support about this. They seemed unsure how this circuit actually operates and told me they would look into it, treating it as a documentation error. That was a long time ago, and I never heard back.
Tech support also told me that the rx DMA buffer size must be written to the BD's DataLength field before the BD is handed to the DMA engine, and that the DMA engine takes the buffer size from the BD rather than from the EMRBR register, which is what the documentation seems to imply.
To verify this, I tried the following experiment:
According to tech support, any rx packet larger than 128 bytes should now be split across two data buffers. I sent a 200-byte packet to the FEC, and guess what happened? The FEC wrote all 200 data bytes into a single 128-byte data buffer, and the first BD in the ring had its LastInFrame flag set! I must conclude that EMRBR not only sets the maximum permitted size for rx packets, but also defines the size of all rx DMA buffers.
Unless I am missing something here, this is a bug, and it breaks the scatter/gather mechanism.
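To put rough numbers on the two competing explanations, here is a small sketch in C. It is only a model of the arithmetic, not driver code: the function names are made up for illustration, and the EMRBR value of 1536 used in the note below is an assumption, since the value programmed during the experiment is not stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of rx buffers (and therefore BDs) one frame
 * occupies if the DMA engine treats every buffer as buf_size bytes long. */
static size_t rx_buffers_used(size_t frame_len, size_t buf_size)
{
    assert(buf_size > 0);
    return (frame_len + buf_size - 1) / buf_size;   /* ceiling division */
}

/* Hypothetical helper: bytes written past the end of the first buffer when
 * the engine assumes engine_buf_size bytes per buffer but the buffer that
 * was really allocated holds only actual_buf_size bytes. */
static size_t rx_overrun_bytes(size_t frame_len,
                               size_t engine_buf_size,
                               size_t actual_buf_size)
{
    /* The engine fills the first buffer up to the size it believes in. */
    size_t written = frame_len < engine_buf_size ? frame_len : engine_buf_size;

    return written > actual_buf_size ? written - actual_buf_size : 0;
}
```

If the engine honored a 128-byte DataLength, `rx_buffers_used(200, 128)` says the 200-byte packet should span two buffers, with LastInFrame set only on the second BD. If the engine instead uses EMRBR (assumed 1536 here), `rx_buffers_used(200, 1536)` gives one buffer, and `rx_overrun_bytes(200, 1536, 128)` shows 72 bytes landing beyond the 128-byte allocation, which matches what the experiment showed.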
In summary, I have learned this: