I recently learnt how to load code into RAM and execute it, and I was just wondering: besides bootloaders and flash programming, what are the advantages of running code from RAM? Where would you recommend doing so?
Depending on your target and other hardware, flash can be slower or much slower than RAM. When executing from RAM, your code may execute faster.
Assuming a Von Neumann setup, if there were a speed disparity between flash memory and (non-zero page) RAM, I would expect either a different number of execution cycles for the same instruction and addressing mode, or a different maximum allowable clock rate. As far as I am aware, neither of these conditions applies to non-banked memory of either kind within Freescale MCUs. Please correct me if I am wrong.
Another reason for executing code from RAM would be where temporary code is required, that can be loaded via BDM, SCI, etc. I assume that the P&E method of calibrating the internal reference oscillator uses this method. Any projects that require a factory calibration procedure might benefit from this approach, especially where flash resources are tight.
Right, there is no obvious reason for flash to be slower than RAM, and manufacturers try to match the speeds of flash, RAM and CPU. However, because RAM arrays are smaller than flash arrays, some performance tricks are possible:
- On the S12 MCU, a 16-bit memory access to a word-aligned address in flash takes 1 bus cycle, while a misaligned word access takes 2 bus cycles. S12 RAM allows both aligned and misaligned write accesses in 1 bus cycle, so RAM is in fact a bit faster. This does not make it worthwhile to move code to RAM, though, because the S12 instruction queue ensures that code is fetched as aligned words only. And if you have some data in flash, you can align it to make reads faster.
- On the S12X MCU operating at a 40 MHz bus clock, the XGATE core (interrupt coprocessor) is able to execute up to 2 instructions per bus cycle when executing from RAM, while it can do only up to 1 instruction per bus cycle when executing from flash. XGATE code therefore runs about 2 times faster from RAM.
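To put numbers on the two points above, here is a small host-side model in C. It is only a sketch: the cycle counts and the instructions-per-cycle figures are taken straight from this post rather than from a datasheet, the RAM figure is applied to every word access even though the post states it for writes, and all names (`mem_kind_t`, `word_access_cycles`, `table_read_cycles`, `xgate_peak_ips`) are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical memory kinds for this model. */
typedef enum { MEM_FLASH, MEM_RAM } mem_kind_t;

/* Bus cycles for one 16-bit access, using the figures quoted in the post:
 * flash: 1 cycle if word aligned, 2 cycles if misaligned;
 * RAM:   1 cycle either way (assumption: applied to all word accesses). */
unsigned word_access_cycles(mem_kind_t kind, uint16_t addr)
{
    if (kind == MEM_RAM)
        return 1u;
    return (addr & 1u) ? 2u : 1u;
}

/* Total bus cycles to access `count` consecutive words starting at `base`. */
unsigned long table_read_cycles(mem_kind_t kind, uint16_t base, unsigned count)
{
    unsigned long total = 0;
    for (unsigned i = 0; i < count; i++)
        total += word_access_cycles(kind, (uint16_t)(base + 2u * i));
    return total;
}

/* Peak XGATE instruction rate: up to 2 instructions per bus cycle from RAM,
 * up to 1 from flash (upper bounds quoted in the post, not measurements). */
unsigned long xgate_peak_ips(unsigned long bus_hz, mem_kind_t kind)
{
    return bus_hz * (kind == MEM_RAM ? 2ul : 1ul);
}
```

With this model, an 8-word table in flash costs 8 bus cycles when it starts on an even address and 16 when it starts on an odd one, which is the reason for aligning data in flash. At a 40 MHz bus clock the XGATE peak works out to 80 million instructions per second from RAM against 40 million from flash.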
Regarding "different number of execution cycles for the same instruction and addressing mode": no, the number of cycles is the same, but some cycles are stretched with wait states.
I used self-modifying code to allow the 68HC05 to download program code into a Xilinx FPGA. The array was of course larger than the 8-bit index register could address. Storing 'LDA AdrHi AdrLow RTS' and calling it fetched the array data, which was then sent via SPI. After that it was a simple matter to double-precision increment AdrLow/AdrHi and check the result against the end of the array. In this case the method was necessary to keep the code size small.
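For readers who have not seen this trick, the loop can be modelled on a host in C. This is a rough sketch and not the original HC05 assembler: the two address bytes stand in for the operand of the patched `LDA`, the output buffer stands in for the SPI port, and every name here (`ram_stub_t`, `stub_fetch`, `stub_increment`, `stub_at_end`, `stream_array`) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Models the operand bytes of the stub "LDA AdrHi AdrLow ; RTS". */
typedef struct {
    uint8_t adr_hi;
    uint8_t adr_lo;
} ram_stub_t;

/* "Calling the stub": load one byte from the 16-bit address in the operand. */
static uint8_t stub_fetch(const ram_stub_t *s, const uint8_t *mem)
{
    uint16_t addr = (uint16_t)(((uint16_t)s->adr_hi << 8) | s->adr_lo);
    return mem[addr];
}

/* Double-precision increment: bump the low byte, carry into the high byte. */
static void stub_increment(ram_stub_t *s)
{
    s->adr_lo++;
    if (s->adr_lo == 0)
        s->adr_hi++;
}

/* Compare the patched address against the end address, byte by byte. */
static int stub_at_end(const ram_stub_t *s, uint16_t end)
{
    return s->adr_hi == (uint8_t)(end >> 8) &&
           s->adr_lo == (uint8_t)(end & 0xFFu);
}

/* Copy mem[start..end) to `out` (standing in for the SPI transmit),
 * one byte per stub call. Returns the number of bytes sent. */
size_t stream_array(const uint8_t *mem, uint16_t start, uint16_t end,
                    uint8_t *out)
{
    ram_stub_t s = { (uint8_t)(start >> 8), (uint8_t)(start & 0xFFu) };
    size_t n = 0;
    while (!stub_at_end(&s, end)) {
        out[n++] = stub_fetch(&s, mem);
        stub_increment(&s);
    }
    return n;
}
```

The point of the model is that only 8-bit operations touch the address, so the array can be longer than 256 bytes even though no 16-bit index register is involved.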
The technique that you mention was required for HC705 MCUs because of the 8-bit X-register within these devices, and because there was no user-accessible stack. With a 16-bit index register available within the later device families, this approach would rarely be necessary. I also used a similar method for displaying variable-length message string data located in PROM.
For 8-bit devices, with only a single index register, a generalised data transfer routine (in assembler) might slightly benefit in speed, without the need to temporarily store source and destination address indices to the stack.
Well, some interesting points have come up here.
About the speed, I had kind of guessed that RAM would be faster than flash, but I didn't know it for sure. I also believe that it depends on the manufacturer and the MCU.
It is good to know about the "trick" you mention. It is a nice technique and very useful when dealing with an 8-bit pointer register.
Thanks for sharing!