Hi,
First, an important point: because of pipelining, flash wait states, caches, other bus masters and their priorities on the crossbar, etc., it's not easy to generate exact delays using asm or C code. This was easy on older and simpler devices where the timing of everything was strictly defined, but the features that increase performance also make timing considerably more variable.
So, my usual recommendation is: if the timing is not critical and you can accept relatively high inaccuracy, just use a simple delay loop like:
for (i = 0; i < delay; i++)
    __asm("nop");
... or something similar. I can't see any benefit in writing this in asm.
If the timing is critical, use a timer like PIT or STM to generate accurate delays.
Example MPC5744P STM timer S32DS Power 2017.R1
Example MPC5744P PIT triggering interrupts GHS614
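To illustrate the timer-based approach, here is a minimal sketch of a polling delay on a free-running up-counter. It is not taken from the examples above: the names delay_ticks, us_to_ticks, counter_read_fn and sim_read are hypothetical, and the actual register access (for example, reading the STM counter) is deliberately hidden behind a function pointer, so check the reference manual for the real register names and the counter clock. sim_read is only a host-side stand-in so the logic can be tried on a PC.

```c
#include <stdint.h>

/* Hypothetical hook: returns the current value of a free-running 32-bit
 * up-counter. On the target this would read the timer's count register;
 * that access is an assumption and is not shown here. */
typedef uint32_t (*counter_read_fn)(void);

/* Busy-wait until at least `ticks` counter ticks have elapsed.
 * The unsigned subtraction stays correct when the counter wraps around. */
void delay_ticks(counter_read_fn read_counter, uint32_t ticks)
{
    const uint32_t start = read_counter();

    while ((uint32_t)(read_counter() - start) < ticks) {
        /* spin */
    }
}

/* Convert microseconds to ticks for a counter clocked at counter_hz.
 * The 64-bit intermediate avoids overflow in the multiplication. */
uint32_t us_to_ticks(uint32_t us, uint32_t counter_hz)
{
    return (uint32_t)(((uint64_t)us * counter_hz) / 1000000u);
}

/* Host-side stand-in for the hardware counter (assumption, for testing
 * only): it advances by one tick on every read. */
uint32_t sim_count;

uint32_t sim_read(void)
{
    return sim_count++;
}
```

On the target, a call would look like delay_ticks(my_counter_read, us_to_ticks(10, counter_hz)), where my_counter_read and counter_hz are placeholders for your own counter access function and counter clock. The delay is still "at least" the requested time: interrupts or bus contention during the polling loop can make it longer, so for hard deadlines a timer interrupt, as in the PIT example, is usually the better fit.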
And to answer your original question, here is an example of inline asm syntax (writing the value 0xA1000F00 to SPR 624 using register r3):
__asm("e_lis %r3, 0xA100");
__asm("e_or2i %r3, 0x0F00");
__asm("mtspr 624, %r3");
Regards,
Lukas