Entire RAM for execution: ramloader

lpcware
NXP Employee
Content originally posted in LPCWare by MarcVonWindscooting on Sat Aug 17 12:47:52 MST 2013
I just finished my ramloader program for LPC800.
It can be used to load executable images into LPC800 RAM -- THE ENTIRE RAM -- and then execute them from there.
No. The bootloader can't do this already. Definitely not.

Maybe you guys are all so cool and have such high-end debuggers and tools that have been able to do that for ages. But I do believe you're not so cool ;-)

Marc

Original Attachment has been moved to: ramloader800.s.txt.zip

12 Replies

Content originally posted in LPCWare by Pacman on Fri Sep 13 23:12:22 MST 2013
Very nice.

I think there might be another small optimization:

ldr r2,=1<<14

That uses 4 bytes of literal pool plus one instruction.

If you load #1 into, for instance, r7 instead of r0 in the UART init, you could use...
lsls r2,r7,#14

Then you'd save those 4 bytes.

It's very handy to always have a zero and a one in a register (if it can be afforded).

I like that you're using tst r0,r0. =)

You could also place this line...
bl uart0Write
...right before 'main' and then branch to it from your ramloader.* subroutines, then you'd save an instruction each time.
(I've seen 6 places this could be done, saving 5 instructions in total)
You could save one more instruction by placing the movs r0,#'A' right before the bl uart0Write, as #'A' is used twice.


The 4 copies of this sequence in the address-reader...
bl uart0Read
orrs r6,r6,r0
rors r6,r6,r1


...could be reduced by...
getb: orrs r6,r6,r0
rors r6,r6,r1
b uart0Read

...and then calling 'getb' a few times:

bl uart0Read @ LSB
bl getb
bl getb @ MSB
bl getb

(It would be possible to save an extra instruction (movs r6,#0) if the address was transmitted in big endian instead of little endian, but you probably don't want to break compatibility.) That could be done by using lsls instead of rors before inserting the byte.
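To see why the little-endian reader needs r6 cleared first while the lsls variant does not, here is a small host-side model of both loops. It is only a sketch: the function names are made up for illustration, and it assumes that r1 holds the rotate amount 8, which the snippets above do not show.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits (models Thumb `rors`)."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def assemble_little_endian(data, start=0):
    """Model four rounds of: orrs r6,r6,r0 ; rors r6,r6,r1 (r1 assumed to be 8).

    `start` is the initial r6. A rotate never discards bits, so any
    leftover bits in r6 end up OR-ed into the result.
    """
    r6 = start & MASK32
    for byte in data:              # bytes arrive LSB first
        r6 |= byte                 # orrs r6,r6,r0
        r6 = ror32(r6, 8)          # rors r6,r6,r1
    return r6


def assemble_big_endian(data, start=0):
    """Model the lsls variant: shift left first, then insert the byte.

    Four shifts by 8 push every old bit out of the register, so the
    initial r6 does not matter.
    """
    r6 = start & MASK32
    for byte in data:              # bytes arrive MSB first
        r6 = (r6 << 8) & MASK32    # lsls r6,r6,#8
        r6 |= byte                 # orrs r6,r6,r0
    return r6


if __name__ == "__main__":
    # prints 0x12345678
    print(hex(assemble_little_endian(bytes([0x78, 0x56, 0x34, 0x12]))))
    # prints 0x12345678 even though r6 started with garbage
    print(hex(assemble_big_endian(bytes([0x12, 0x34, 0x56, 0x78]), start=0xDEADBEEF)))
```

With a non-zero start value the little-endian model returns the address OR-ed with the leftover bits, which is why the rors version has to zero r6 beforehand.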

This line in ramloader.execute...
ldr r0,=0xFFFFFFFF
...can definitely also save 2 bytes by...
movs r0,#0
subs r0,r0,#1
...but since r0 is already 0, you only need subs r0,r0,#1
That could be saved too, by changing the bne to a bhi
-So that would be 6 bytes saved in total. ;)

Of course I haven't seen all the possible optimizations; there's probably still plenty possible, and there might be some that you don't want to do (if using this last one mentioned, you'd have to be careful if changing the code).

Byte-reduction optimization is fun; I've done a lot of it when space was tight or I just wanted to reduce some memory-usage to the bare minimum.

Content originally posted in LPCWare by MarcVonWindscooting on Fri Sep 13 16:34:27 MST 2013
Thanks for the comment, Pacman. I chose to remove the 'b .'!

I recently found out that at least 2 more things are not optimal:

1. The LPC800 does have a VTOR register. I thought it didn't.
2. Using the fractional baud rate generator, the initialization of the PLL can be avoided.
   (USART clk div=1, FRG div = 255(=>256), FRG mul 22).

I need to change this; point 2 should save a few more bytes.
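The arithmetic behind point 2 can be sketched on the host as follows. This is only an illustration under assumptions that are not stated in the post: a 12 MHz main clock (IRC without PLL), a target of 115200 baud, a USART baud divider of 6, and the usual LPC81x relations as I understand them from the user manual (FRG output = input / (1 + MULT/256), baud = U_PCLK / (16 * divider)). The function names are hypothetical.

```python
def frg_output_hz(main_clk_hz, uart_clk_div, frg_mult, frg_div_reg=255):
    """USART peripheral clock after the clock divider and the FRG.

    frg_div_reg is the register value; the hardware divides by value + 1,
    so 255 means 256.
    """
    frg_div = frg_div_reg + 1
    return main_clk_hz / uart_clk_div / (1 + frg_mult / frg_div)


def baud_rate(u_pclk_hz, brg_divider, oversample=16):
    """Baud rate for a given USART clock and baud divider (assumed 16x oversampling)."""
    return u_pclk_hz / (oversample * brg_divider)


def best_frg_mult(main_clk_hz, target_baud, brg_divider, oversample=16):
    """FRG multiplier that brings the USART clock closest to the wanted rate."""
    wanted_pclk = target_baud * oversample * brg_divider
    return round((main_clk_hz / wanted_pclk - 1) * 256)


if __name__ == "__main__":
    MAIN_CLK = 12_000_000   # assumption: 12 MHz IRC, PLL not used
    TARGET = 115200         # assumption: target baud rate
    DIVIDER = 6             # assumption: USART baud divider

    mult = best_frg_mult(MAIN_CLK, TARGET, DIVIDER)
    pclk = frg_output_hz(MAIN_CLK, 1, mult)
    actual = baud_rate(pclk, DIVIDER)
    error_pct = 100 * (actual - TARGET) / TARGET
    print(f"FRG mul = {mult}, baud = {actual:.1f}, error = {error_pct:+.3f} %")
```

Under these assumptions the search lands on the same multiplier as in the post (22), and the resulting rate is within a tenth of a percent of the target, so the PLL is not needed for the UART.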

(EDIT:)
I did the changes. Here is the new version (code size 664B compared to 712B before).
Vector table remapping is not tested yet.

Content originally posted in LPCWare by Pacman on Thu Sep 12 03:24:11 MST 2013
I ain't using no debugger; I guess I'm not cool enough to use GDB though I have it built...
But thanks for sharing your code.

bx r1
b . @ never reached !?

I believe you would want a blx r1 instead of a bx. Otherwise just get rid of the 'b .'...
...You could perhaps just load LR instead of r1, and then bx to LR. Then the RAM-code would just loop itself if it ever tried to return to the caller...
...or you could zero LR, so the microcontroller would reset.
...or use the ADR pseudo-instruction to calculate the offset of the next line (but that would use extra flash memory).

Content originally posted in LPCWare by rickta59 on Wed Aug 21 14:59:32 MST 2013
Thanks!  I missed that link before.

-rick

Content originally posted in LPCWare by lpcadmin on Wed Aug 21 08:30:45 MST 2013
Your posts seem to be getting extra attention for spam filtering from our provider (reason unknown).
However, your posts should no longer be placed into moderation from now on...

Content originally posted in LPCWare by MarcVonWindscooting on Tue Aug 20 00:08:07 MST 2013
Once again, my message got lost... (Every edit is a full loss?!)

@rickta59: Yes, there is. I updated the html page describing ramloader (www.windscooting.com/softy/ramloader.html).

Content originally posted in LPCWare by MarcVonWindscooting on Mon Aug 19 02:43:30 MST 2013

Quote: rickta59
Is there a host side program to support this? (ramloader src maybe)?


Yes. http://www.windscooting.com/softy/mxli.html
Download isp-2.0 from there (not another version). The folder 'programs' contains 'ramloader' among others.
'ramloader' accepts plain binary files and Intel HEX files.

Content originally posted in LPCWare by rickta59 on Sun Aug 18 16:41:50 MST 2013
Is there a host side program to support this? (ramloader src maybe)?

Content originally posted in LPCWare by MarcVonWindscooting on Sun Aug 18 08:25:32 MST 2013
Somehow my first reply got lost.

Loading all your code into RAM is much faster: the cycle 'compile - download - execute - see output' takes a second or two.
The RAM is very limited, so no big programs. But for experimenting with new peripherals it's great.

Content originally posted in LPCWare by MarcVonWindscooting on Sun Aug 18 06:58:47 MST 2013
For more background info, see: http://www.windscooting.com/softy/ramloader.html

Content originally posted in LPCWare by MarcVonWindscooting on Sun Aug 18 06:57:14 MST 2013
Because you don't want to wear out the flash during program development, or because it's way faster: let's say the blink of an eye for the cycle 'compile - link - download - run - see output'.
For more information, see: http://www.windscooting.com/softy/ramloader.html

Content originally posted in LPCWare by cpldcpu on Sun Aug 18 05:08:01 MST 2013
100% asm, nice! ;-)

But of course, now I have to ask: Why would I need to load all my code into the ram?