The MU is usually not used to transfer data directly, only to send notifications.
In our current Linux BSP, the MU is used for RPMSG.
The data exchange happens in shared DDR memory that is reserved for RPMSG,
and the MU is used to notify the other processor when to access that memory and fetch the data.
Therefore it's hard to evaluate the performance of the MU alone, since it's just a messenger.
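
To illustrate the split between "data in shared memory" and "notification through the MU", here is a minimal host-side sketch in C. It is not the BSP's RPMSG code: `shared_mem`, `struct fake_mu`, `sender_post` and `receiver_poll` are hypothetical names made up for this example, and a plain array and a flag stand in for the reserved DDR region and the MU hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SHM_SIZE 64u

/* Hypothetical stand-in for the DDR region reserved for RPMSG. */
static uint8_t shared_mem[SHM_SIZE];

/* Hypothetical stand-in for the MU: a pending flag plus one 32-bit word. */
struct fake_mu {
    bool pending;   /* "a notification is waiting" */
    uint32_t word;  /* small message; here it carries only the payload length */
};

/* Sender side: put the payload in shared memory, then ring the doorbell.
 * On real hardware a memory barrier or cache maintenance would be needed
 * between the two steps; that is omitted in this simulation. */
static void sender_post(struct fake_mu *mu, const void *data, uint32_t len)
{
    assert(len <= SHM_SIZE);
    memcpy(shared_mem, data, len);  /* the bulk data goes through memory */
    mu->word = len;                 /* the MU carries only a notice */
    mu->pending = true;
}

/* Receiver side: returns the number of bytes copied into out,
 * or 0 if no notification is pending. */
static uint32_t receiver_poll(struct fake_mu *mu, void *out, uint32_t out_size)
{
    if (!mu->pending)
        return 0;

    uint32_t len = mu->word;
    mu->pending = false;            /* acknowledge the notification */
    if (len > out_size)
        len = out_size;
    memcpy(out, shared_mem, len);   /* fetch the data from shared memory */
    return len;
}
```

In this sketch the payload never passes through the MU itself; it only tells the receiver that data is ready and how long it is. That is why a throughput measurement mostly reflects the shared memory path and the software on both sides, not the MU.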