Content originally posted in LPCWare by Pacman on Sat Nov 09 11:45:07 MST 2013
If starblue can handle 2 streams at 20MHz using 50% of the M0 core, I am convinced that the M4 core will be able to handle 3 streams at 50MHz.
Running from RAM is not difficult. I've done it in several ways:
Method 1: I wrote a linker script that places the code in a section which is copied to RAM at reset.
Method 2: I copied an assembly-language routine using C.
Method 3: I copied an assembly-language routine using assembly-language.
Method 4: If using the M0, you can use the load_image(...) library function to copy code from the flash memory to SRAM and make the M0 core execute it.
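As a rough illustration of Method 2 (copying an assembly-language routine using C), here is a minimal sketch. The names routine_in_flash, routine_in_ram and copy_routine_to_ram are made up for this example, and the two opcodes are just a tiny placeholder routine (adds r0, r0, r1 ; bx lr). It only shows the copy and the Thumb address (bit 0 set); on real hardware you would probably also want memory barriers before jumping to the copy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical two-instruction Thumb routine, as it would sit in flash:
   0x1840 = adds r0, r0, r1    0x4770 = bx lr */
const uint16_t routine_in_flash[] = { 0x1840, 0x4770 };

/* Destination buffer in SRAM, word-aligned and rounded up to whole words. */
uint32_t routine_in_ram[(sizeof routine_in_flash + 3) / 4];

typedef uint32_t (*add_fn)(uint32_t, uint32_t);

/* Copy the opcodes to RAM and return a callable Thumb address.
   Cortex-M cores only execute Thumb code, so bit 0 of the
   function pointer must be set. */
add_fn copy_routine_to_ram(void)
{
    memcpy(routine_in_ram, routine_in_flash, sizeof routine_in_flash);
    return (add_fn)((uintptr_t)routine_in_ram | 1u);
}
```

On the target you would then call the returned pointer like any other function, e.g. copy_routine_to_ram()(2, 3). Don't try that on a PC, since the buffer holds Thumb opcodes.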
If using the M4, most instructions take 1 clock cycle. You can load/store multiple registers using LDM/STM; those take 1 clock cycle plus 1 clock cycle per register.
Back-to-back LDRx/STRx instructions can be pipelined, so they'll take one clock cycle per instruction.
On the M0 core, things are different. LDM/STM still take one clock cycle plus one clock cycle per register, BUT you're restricted to the incrementing form, so the base register always counts upward.
LDRx/STRx always take 2 clock cycles on the M0.
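To put numbers on the M0 figures above, here is a small back-of-the-envelope sketch in C. The function names are invented for this example, and it ignores loop overhead, wait states and everything else; it only adds up the per-instruction cycle counts quoted above.

```c
/* LDM or STM of n registers: 1 cycle plus 1 cycle per register. */
unsigned ldm_stm_cycles(unsigned n_regs)
{
    return 1u + n_regs;
}

/* M0: every single LDRx/STRx costs 2 cycles. */
unsigned m0_single_cycles(unsigned n_accesses)
{
    return 2u * n_accesses;
}

/* Copy n words on the M0 with one LDM followed by one STM. */
unsigned m0_copy_multiple(unsigned n_words)
{
    return ldm_stm_cycles(n_words) + ldm_stm_cycles(n_words);
}

/* Copy n words on the M0 with n LDR + n STR instructions. */
unsigned m0_copy_single(unsigned n_words)
{
    return m0_single_cycles(n_words) + m0_single_cycles(n_words);
}
```

For 4 words this gives 10 cycles with LDM/STM against 16 cycles with LDR/STR, so the multiple-register forms are worth using on the M0 whenever you have the registers free.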
On the M0, you can basically only use r0-r7 (which is a bit too tight sometimes).
...And on the M0, you can't use fancy stuff like...
add r0,r4,r8,lsr#8
...You'd have to do that in several operations using 3 clock cycles instead of just 1.
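Here is one way the M0 version could look, written as C with the corresponding Thumb instructions in the comments. This is only a sketch: the function names are made up, and note that lsrs/adds update the flags, which the single M4 instruction does not.

```c
#include <stdint.h>

/* What the M4 does in one instruction: add r0, r4, r8, lsr #8 */
uint32_t add_shifted_m4(uint32_t r4, uint32_t r8)
{
    return r4 + (r8 >> 8);
}

/* The same result on the M0, one step per instruction (3 cycles). */
uint32_t add_shifted_m0(uint32_t r4, uint32_t r8)
{
    uint32_t r0 = r8;   /* mov  r0, r8      (MOV may read a high register) */
    r0 >>= 8;           /* lsrs r0, r0, #8  */
    r0 += r4;           /* adds r0, r0, r4  */
    return r0;
}
```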
Also, keep in mind that the SGPIO data is least significant bit first, not most significant bit first.
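If your data is MSB-first, you'll need to reverse the bit order of each word somewhere. A minimal sketch in plain C (reverse_bits32 is a name I made up) could look like this. On the M4 the RBIT instruction does the same in a single instruction; the M0 has no RBIT, so there you'd need something like the code below or a lookup table.

```c
#include <stdint.h>

/* Reverse the bit order of a 32-bit word: bit 0 <-> bit 31, bit 1 <-> bit 30, ... */
uint32_t reverse_bits32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  /* swap adjacent bits   */
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  /* swap bit pairs       */
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  /* swap nibbles         */
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);  /* swap bytes           */
    return (x >> 16) | (x << 16);                             /* swap halfwords       */
}
```

For example, reverse_bits32(0x00000001) gives 0x80000000.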