CMSIS text size

4,155 Views
lpcware
NXP Employee
Content originally posted in LPCWare by mcu_programmer on Fri Apr 23 07:56:12 MST 2010
Have any of you noticed how much text (flash) memory CMSIS uses? On an LPC1111 it uses up almost all the flash memory.
21 Replies

Content originally posted in LPCWare by renan on Thu Jul 08 11:31:57 MST 2010

Quote:

If you are referring to potential debug issues when --gc-sections used ......



Yes, that's what I was talking about.

Thanks,

Renan

Content originally posted in LPCWare by CodeRedSupport on Thu Jul 08 05:09:25 MST 2010

Quote: renan
Any news on the linker option --gc-sections?


If you are referring to potential debug issues when --gc-sections used, then yes, the recently released LPCXpresso 3.4 should resolve these.

http://knowledgebase.nxp.com/showthread.php?t=483

Newly created projects should now have this linker option enabled. You will need to add it to your existing projects (and some of the examples) - depending on what version of LPCXpresso they were created with.

To enable --gc-sections, go to

Project Properties->C/C++ Build->Settings->MCU Linker->Miscellaneous

and add the option into the "Other options (-Xlinker [option])" box.

Regards,
CodeRedSupport

Content originally posted in LPCWare by renan on Thu Jul 08 04:08:42 MST 2010
Any news on the linker option --gc-sections?

Renan

Content originally posted in LPCWare by CodeRedSupport on Tue Apr 27 05:24:33 MST 2010
One additional hint for carrying out debug builds. The CMSIS function SystemCoreClockUpdate() in system_LPC13xx.c (or equivalent for your MCU) is rarely used but consumes about 800 bytes when built with -O0 for Cortex-M3.

If your code does not use this function, and your Debug build does not compile your sources with -ffunction-sections and link with --gc-sections (see my previous post to this thread, 04-26-2010 11:26 AM, for details), then commenting out this function from your CMSIS project (for example, by guarding it with a preprocessor symbol that your build does not define) will make a noticeable difference if you need to shrink your Debug build to fit it into your flash....

#ifdef INCLUDE_SYSTEMCORECLOCKUPDATE
void SystemCoreClockUpdate (void)            /* Get Core Clock Frequency      */
{
  uint32_t wdt_osc = 0;
  :
  :
  SystemCoreClock /= LPC_SYSCON->SYSAHBCLKDIV;  
}
#endif 
As I said previously, we continue to investigate ways of driving code sizes down for applications built with LPCXpresso IDE.

Regards,
CodeRedSupport

Content originally posted in LPCWare by CodeRedSupport on Mon Apr 26 08:18:19 MST 2010

Quote: igorsk
Thanks for the reply. One note: everything I reported applies to the Release builds, with --gc-sections and -ffunction-sections.


OK, but the original thread starter did appear to be encountering code size issues with Debug builds.

Quote: igorsk
Re: NVIC_EnableIRQ
I had copies of the function in the Release build. I later turned on -Winline and got the message that the inlining instruction limit was reached. I could not find any setting for the default, but adding -finline-limit=1000 did get it to inline. Might be worth investigating too.


The compiler will not always inline a function, even when it is marked as inline in the sources and you are compiling with optimisation turned on. In such circumstances, you may well end up with two static copies of an inline function from a header file in two different objects. But without more information, it is hard to guess the exact circumstances that you are seeing here. Is there a chance that you could provide a project that shows this up?


Quote: igorsk

Also, are Redlib libraries built with optimization enabled? Some of the code I've seen looked somewhat verbose. Although I guess it could be because of limited Thumb-1 ISA.


Yes, Redlib is built with optimisation turned on :)

Regards,
CodeRedSupport

Content originally posted in LPCWare by igorsk on Mon Apr 26 03:54:29 MST 2010
Thanks for the reply. One note: everything I reported applies to the Release builds, with --gc-sections and -ffunction-sections.

Re: NVIC_EnableIRQ
I had copies of the function in the Release build. I later turned on -Winline and got the message that the inlining instruction limit was reached. I could not find any setting for the default, but adding -finline-limit=1000 did get it to inline. Might be worth investigating too.

Also, are Redlib libraries built with optimization enabled? Some of the code I've seen looked somewhat verbose. Although I guess it could be because of limited Thumb-1 ISA.

Content originally posted in LPCWare by CodeRedSupport on Mon Apr 26 03:26:46 MST 2010
Sorry for the delay in contributing further to this thread. Here are a few comments on some of the issues that have been raised....

Unused code elimination

A large part of this difference often comes down to whether the --gc-sections linker option is being used. When this option is turned on, the linker removes unused sections from the objects that are being linked together. This can typically reduce the overall code size of your application quite considerably. Note that the compiler options -ffunction-sections and -fdata-sections are required at the compile step so that each function and data item is placed into its own section in the object file created from a C file.
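As a rough illustration of the mechanism described above (a sketch, not taken from the LPCXpresso examples): when compiled with -ffunction-sections, each function below would be placed in its own .text.<name> input section, so a link with --gc-sections can discard the one that nothing references. The function names, the file name demo.c and the arm-none-eabi-gcc command lines in the comment are assumptions made for illustration only.

```c
#include <stdint.h>

/* Hypothetical build commands (toolchain prefix and file names are assumptions):
 *   arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -Os \
 *       -ffunction-sections -fdata-sections -c demo.c
 *   arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb \
 *       -Wl,--gc-sections -Wl,-Map=demo.map demo.o -o demo.axf
 */

/* Assumed to be called from the application, so its section is kept. */
uint32_t scale_reading(uint32_t raw)
{
    return raw * 3u + 1u;
}

/* Assumed to be called from nowhere: with -ffunction-sections it sits in
 * its own section, which --gc-sections is then free to discard. */
uint32_t unused_helper(uint32_t raw)
{
    return raw ^ 0xA5A5A5A5u;
}
```

Without -ffunction-sections both functions would share a single .text section in the object, and keeping one of them would keep the other as well.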

In the original release of LPCXpresso IDE, the --gc-sections option was specified for both the Debug and Release configurations of newly created projects (and hence in the provided examples). However, a number of users saw problems when debugging Debug builds, typically slightly strange behaviour when single stepping and when viewing variables.

As the whole point of the Debug configuration is to debug, we took the decision to remove the --gc-sections option from the link step of Debug configurations in more recent versions of the LPCXpresso IDE, while we looked into the issues with debugging --gc-sections linked code in the background. This unfortunately means that the code size of Debug builds does increase.

One thing that you could therefore do is to add the --gc-sections linker option back in, but accept that you may encounter debugging oddities as a result.

A slight modification to this appears (from my initial tests at least) to allow debugging without such oddities, while still removing unused C library sections. To do this, add the --gc-sections option to the link step of your application, but ensure that you do **NOT** specify -ffunction-sections or -fdata-sections in the compile steps for your application (or in any "local" library projects that your application uses, such as CMSIS).

Using this mechanism can make a noticeable difference to the size of a Debug build of your application. For example, building the LPCXpresso1114_systick_twinkle example from the LPCXpresso1114.zip in the LPCXpresso IDE 3.3.4 release gives the following binary sizes....

8568 bytes   - as delivered, no --gc-sections
3540 bytes   - with --gc-sections
5592 bytes   - with --gc-sections, but no -ffunction-sections/-fdata-sections

You can see that using --gc-sections, but with no -ffunction-sections/-fdata-sections compiler options gives a useful saving.
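To put the figures above into relative terms, here is a small sketch of the arithmetic. The helper name saving_permille is hypothetical; it returns the saving against the baseline in tenths of a percent, rounded down, using integer maths only.

```c
#include <stdint.h>

/* Saving relative to the baseline size, in tenths of a percent (rounded down).
 * Assumes baseline_bytes is non-zero and not smaller than reduced_bytes. */
uint32_t saving_permille(uint32_t baseline_bytes, uint32_t reduced_bytes)
{
    return ((baseline_bytes - reduced_bytes) * 1000u) / baseline_bytes;
}
```

With the sizes listed above, saving_permille(8568, 3540) evaluates to 586 and saving_permille(8568, 5592) to 347, so the full --gc-sections build is a little under 59% smaller than the as-delivered build, and the variant without -ffunction-sections/-fdata-sections is a little under 35% smaller.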

Note that if you look in the .map file that is created automatically by your build, you can see the discarded input sections.

Redlib libraries and --gc-sections

Most of the Redlib C libraries are written in C and have been built with -ffunction-sections. This is what allows the code size reduction described previously when the link step uses --gc-sections. However, the test case that igorsk found with helpers.o is actually written in assembler (and hence -ffunction-sections has no effect), and is mainly made up of "wrapper functions", which are not placed into separate sections in the current version. We will look into doing this in a future version of the tools.

Multiple copies of NVIC_EnableIRQ

NVIC_EnableIRQ() is implemented by CMSIS as a static inline function within the header file core_cm0.h. This is done so that the compiler will inline the code of the function, rather than making a subroutine call. This will typically have code size and performance benefits. However the inlining will not take place when you do a Debug build (optimisation level = -O0) - and thus you will end up with a static copy of NVIC_EnableIRQ() in every object where a call to it is made.

I can see that this is potentially not ideal, and we will investigate potential improvements in a future release of the CMSIS libraries. [Note that CodeRed are not the maintainers of the CMSIS library sources, but we do work closely with ARM/Keil (and other contributors) on this.]
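One GCC-specific way to avoid the per-object copies, shown here only as a sketch and not as what CMSIS actually does, is the always_inline function attribute, which asks GCC to inline a function even when optimisation is turned off. The names sketch_enable_irq, enable_and_read and fake_iser are hypothetical, and fake_iser is a plain variable standing in for the NVIC set-enable register so that the example can run on a host.

```c
#include <stdint.h>

/* Stand-in for an interrupt set-enable register so the sketch runs on a host. */
static volatile uint32_t fake_iser;

/* With plain "static inline", GCC at -O0 emits an out-of-line copy in every
 * object that calls the function. The always_inline attribute asks GCC to
 * inline the body at each call site even when optimisation is off. */
static inline __attribute__((always_inline)) void sketch_enable_irq(uint32_t irq_number)
{
    fake_iser = 1u << (irq_number & 0x1Fu);
}

/* A caller: the register write is expanded here rather than called. */
uint32_t enable_and_read(uint32_t irq_number)
{
    sketch_enable_irq(irq_number);
    return fake_iser;
}
```

The trade-off is that the body is then duplicated at every call site, which only pays off for very small functions such as this one.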

Code size and __main

We are currently investigating a number of possible ways of decreasing the contribution to code size from Redlib, including the calls made by __main. As previously stated, you can achieve some reduction by bypassing the call from the startup code to __main and calling main directly. However, we would not recommend this, and if you do it, be aware of the potential issues this may cause you!

Regards,
CodeRedSupport

Content originally posted in LPCWare by rkiryanov on Sun Apr 25 23:41:40 MST 2010

Quote: igorsk



You're right, the memory allocation functions are referenced in the vtable. CodeRedSupport, can you provide a CRT with stubs instead of the memory allocation functions?

Content originally posted in LPCWare by igorsk on Sun Apr 25 10:17:09 MST 2010
Well, I think I've figured out the cause.
If I objdump one of the object files produced from the blinky's sources, I get this:
gpio.o:     file format elf32-littlearm

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000000  00000000  00000000  00000034  2**1
[...]
  7 .text.NVIC_EnableIRQ 00000014  00000000  00000000  00005308  2**2
  8 .text.GPIOInit 0000002c  00000000  00000000  0000531c  2**2
  9 .text.GPIOSetInterrupt 00000180  00000000  00000000  00005348  2**2
 10 .text.GPIOIntEnable 00000058  00000000  00000000  000054c8  2**2
 11 .text.GPIOIntDisable 00000054  00000000  00000000  00005520  2**2
 12 .text.GPIOIntStatus 00000054  00000000  00000000  00005574  2**2
 13 .text.GPIOIntClear 00000058  00000000  00000000  000055c8  2**2

As you can see, every function got placed into a separate section, as requested by the -ffunction-sections switch.
If I dump one of the CRT object files (helpers.o from libcr_eabihelpers.a), I see this:
helpers.o:     file format elf32-littlearm

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000088  00000000  00000000  00000034  2**2
  1 .data         00000000  00000000  00000000  000000bc  2**0
  2 .bss          00000000  00000000  00000000  000000bc  2**0
  3 .ARM.attributes 00000020  00000000  00000000  000000bc  2**0

All functions are in one .text section.

So, the problem is: (some?) CRT files have not been compiled with -ffunction-sections, and if you use one function from such a file (blinky uses __aeabi_uidiv and __aeabi_uidivmod), the whole file gets pulled in.

I still haven't figured out how to get rid of the duplicate NVIC_EnableIRQ.

BTW, __ARM_switch8 from helpers.o contains some ARM code... I'm not sure if it's ever used by the actual code, but if it is, it's not going to work on the Cortex-M chips.

Content originally posted in LPCWare by rkiryanov on Sun Apr 25 10:08:38 MST 2010

Quote: igorsk
I guess it's playing safe and it can't really remove them since they're referenced from the vtable.



As far as I know, "operator delete" is selected statically by the type of its operand, i.e. for
T * p; delete p;

it is either "::operator delete" or "T::operator delete", depending on whether T overrides it. It is not referenced by the vtable. "::operator delete" and the destructor are different things.

Content originally posted in LPCWare by rkiryanov on Sun Apr 25 09:40:23 MST 2010

Quote: igorsk
it can't really remove them since they're referenced from the vtable.



The problem is: GCC does not remove the unreferenced "operator delete" and all its dependencies (such as malloc, free, _sbrk, etc.), just like in your example.

Content originally posted in LPCWare by igorsk on Sun Apr 25 09:01:30 MST 2010

Quote: rkiryanov
Same "feature": http://knowledgebase.nxp.com/showthread.php?t=89&page=2#13 :(


That one is a bit different. GCC seems to always stick two (sometimes even three) virtual destructors into the vtable once you have virtual functions. I guess it's playing safe and it can't really remove them since they're referenced from the vtable.

Content originally posted in LPCWare by rkiryanov on Sun Apr 25 07:16:12 MST 2010

Quote: igorsk
- unused code is not removed



Same "feature": http://knowledgebase.nxp.com/showthread.php?t=89&page=2#13 :(

Content originally posted in LPCWare by igorsk on Sat Apr 24 17:08:21 MST 2010
So I opened a blinky binary for LPC11xx built with the default Release settings (text=2720 bytes) and I see some quite strange things:
- unused code is not removed (__ARM_call_via_r0, __aeabi_idivmod etc).
- NVIC_EnableIRQ is present twice (!). One is called from GPIOInit, another from init_timer32.
I've checked that both -ffunction-sections and --gc-sections options are present.
Any idea what's going on?

Content originally posted in LPCWare by igorsk on Sat Apr 24 16:33:25 MST 2010
Here's what __main does for v6-m:
 PUSH    {R4,LR}
 BL      _ctype_init
 BL      _init_alloc
 BL      main
 POP     {R4}
 POP     {R0}
 BX      R0

So, if you don't use ctype functions (tolower() etc.) or heap, it seems you don't actually need it.

(BTW, the code looks not optimized. There's no need to save R4 or LR.)
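Read as C, the disassembly above amounts to roughly the following. This is only a host-runnable sketch: the sketch_ functions are hypothetical stand-ins for _ctype_init, _init_alloc and the application's main (the real Redlib routines are not shown in this thread), and the call log exists purely so that the call order can be observed.

```c
/* Records the order in which the stand-in routines are called. */
int call_log[3];
int call_count;

/* Hypothetical stand-ins for the routines named in the disassembly. */
static void sketch_ctype_init(void) { call_log[call_count++] = 1; }
static void sketch_init_alloc(void) { call_log[call_count++] = 2; }
static int  sketch_app_main(void)   { call_log[call_count++] = 3; return 0; }

/* Mirrors the listing: set up the ctype tables, set up the heap,
 * then hand control to the application's main. */
int sketch_redlib_main(void)
{
    sketch_ctype_init();
    sketch_init_alloc();
    return sketch_app_main();
}
```

Seen this way, an application that uses neither the ctype functions nor the heap gains nothing from the first two calls, which is the point made above.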

Content originally posted in LPCWare by mcu_programmer on Sat Apr 24 13:51:57 MST 2010
Do all library functions need initialization? For example division. Or does __main() do more than initialization and call main()?

In other words, if you directly call main from startup, skipping __main(),  and still use for example division, would it work?

I can test this, of course, to see if it works; however, it is nice to know what's going on on the theoretical side too.

Do you have the source code for __main()? Then there would be an opportunity to optimize size of __main() for a particular use.

The 1.3k bytes that including __main() adds is significant for an 8k flash part such as the LPC1111, in my view.

Content originally posted in LPCWare by CodeRedSupport on Sat Apr 24 12:26:16 MST 2010
Yes, if you *never* make any C library calls then you can call main() directly from the startup code.

If you want to make doubly sure, you can change the Linker (Target section) to use "No libraries". Beware though, that if you use anything that requires compiler helper functions (such as division) then the link will fail with unresolved external errors.
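To illustrate why division in particular needs a helper: the Cortex-M0 has no hardware divide instruction, so the compiler turns an expression such as a / b into a call to a routine like __aeabi_uidiv. The function below is a hypothetical stand-in showing the kind of work such a routine does (a classic shift-and-subtract division); it is not Redlib's implementation, and the divide-by-zero behaviour is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a compiler helper such as __aeabi_uidiv.
 * Shift-and-subtract (restoring) division, returning the quotient only. */
uint32_t sketch_udiv(uint32_t numerator, uint32_t denominator)
{
    uint32_t quotient = 0u;
    uint32_t remainder = 0u;

    if (denominator == 0u) {
        return 0u; /* assumption: a real helper defines its own behaviour here */
    }

    /* Bring down one bit of the numerator at a time, most significant first. */
    for (int bit = 31; bit >= 0; --bit) {
        remainder = (remainder << 1) | ((numerator >> bit) & 1u);
        if (remainder >= denominator) {
            remainder -= denominator;
            quotient |= 1u << bit;
        }
    }
    return quotient;
}
```

If no library supplies such a routine, every division in the application leaves an unresolved reference at link time, which is the failure described above.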

Content originally posted in LPCWare by mcu_programmer on Sat Apr 24 10:13:57 MST 2010
A few comments.

OK, main is not called if I comment out the __main() call; but main in my case is not very large (the initial code from the wizard), so that could not explain the difference.

However I tried this,

I modified the startup again, it looks like this at the main calling,

#if defined (__REDLIB__XX)
// Call the Redlib library, which in turn calls main()
__main() ;
#else
main();
#endif

(I added the XX)
So the main() should be called in the else section. The main is the standard code created by the wizard: incrementing "i" in a while loop. (not calling any redlib libraries)
With the XX added above the code compiles to 436 (dec) bytes.

1824 (dec) bytes is used if the original is used (without the XX)

I guess my question is this (I am used to other microcontroller projects where one programs from scratch and does not have to call any initialising functions):

If I create a new project and choose "C project" and then "empty project", what do I have to code myself to get the code working in the microcontroller (I do not plan to use any redlib functions)?

One thing I can think of is setting up the clocking/pll options to get the right timing. Is there anything else that is useful, or will the code run just by doing this?

Another thing that seems necessary is to set up the vector table,

__attribute__ ((section(".isr_vector")))
void (* const g_pfnVectors[])(void) =
{
...
}

If one does these two things, will that suffice for a project based on the "empty project" wizard choice?

Another way to put the question,

if I create a "simple C project" from the wizard, and remove the call to __main() and call main() instead (to save text usage), will that be a workable basis for a project that does not require any library functions?
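As general background to the question, and not a description of the LPCXpresso startup file: besides the vector table, a bare-metal C startup routine typically has to copy the initial values of initialised variables from flash into RAM and zero the uninitialised ones before calling main(). The sketch below simulates this on a host with ordinary arrays; every name in it is hypothetical, and on a real device the regions would be delimited by linker-script symbols instead.

```c
#include <stdint.h>
#include <string.h>

/* Host-side stand-ins for linker-provided memory regions (hypothetical). */
uint32_t flash_data_image[4] = { 1u, 2u, 3u, 4u }; /* initial values held in flash */
uint32_t ram_data[4];                               /* .data at run time           */
uint32_t ram_bss[4] = { 9u, 9u, 9u, 9u };           /* .bss, arbitrary at power-up */

/* Stand-in for the application's main: relies on both regions being set up. */
static int sketch_main(void)
{
    return (int)(ram_data[3] + ram_bss[0]);
}

/* What a minimal reset handler typically does before the application runs. */
int sketch_reset_handler(void)
{
    memcpy(ram_data, flash_data_image, sizeof ram_data); /* copy .data from flash */
    memset(ram_bss, 0, sizeof ram_bss);                  /* zero .bss             */
    return sketch_main();           /* on a device, main normally never returns */
}
```

Clock and PLL setup, as mentioned above, would then be done either before the call to main or at the start of it.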

Content originally posted in LPCWare by CodeRedSupport on Sat Apr 24 09:30:47 MST 2010
__main() initialises (the *used* functions  of) the C library and then calls main().

If you don't call __main() then *your* main() will not be called either. The linker will spot this and so discard your main and everything that it calls. So the 75% code reduction you are seeing is because it is discarding your code too. Thus you will end up with a very small program that does nothing...

In other words, do not delete the call to __main()!

Content originally posted in LPCWare by mcu_programmer on Fri Apr 23 14:52:38 MST 2010
dramatic reduction in code size. Still, considering the length of the source code in CMSIS and the startup file, 2k seems a bit much. When I commented out the __main() call, I got a 75% reduction in code size.

What does the __main() function in Redlib actually do?