Compiler Optimization

Question asked by LPCware Support on Mar 31, 2016

The GNU compiler offers a variety of different optimization options. This FAQ considers how these can be used and the effect that they have.

 

Compiling for better performance

 


The four basic optimization options offered by the GNU compiler are -O0, -O1, -O2 and -O3. These offer increasing levels of optimization: -O0 carries out essentially no optimization of the compiled code, whereas -O3 carries out the most.

 

As the level of optimization increases, the compiler will attempt to produce better performing code. This may also have the effect of reducing the code size at levels -O1 and -O2 (compared to -O0).

 

However, at level -O3 a number of additional optimization techniques are enabled which may produce higher performance code, but which are also likely to increase the code size.

 

Note: The LPCXpresso IDE Version 7 ships with an updated version of the GNU C Compiler 4.8.2 supporting a new option -Og : Optimize for debugging experience. -Og enables optimizations that do not interfere with debugging. It is intended to be used as the optimization level of choice for the standard edit-compile-debug cycle, offering a reasonable level of optimization while maintaining fast compilation and a good debugging experience. For more details see the FAQ "Optimize for Debug".

 

See below for how to modify a standard build configuration to use a different optimization level.

 

Compiling for better code size

 


As noted earlier, the above options are focused on improving the performance of your code. The higher optimization levels will often INCREASE code size, because the compiler applies techniques such as loop unrolling that generate faster code at the expense of size. If code size is your primary focus, then there is an additional option that may be of use: -Os. This enables all -O2 optimizations that do not typically increase code size, and also performs further optimizations designed to reduce code size.
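To make the speed versus size trade-off concrete, here is a minimal sketch of loop unrolling written out by hand. The function names are invented for illustration, and whether the compiler actually unrolls a given loop depends on the GCC version, the optimization level and the target, so treat this as a picture of the idea rather than of the exact generated code.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward loop: compact code, one add and one branch per element. */
uint32_t sum_rolled(const uint32_t *data, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += data[i];
    }
    return total;
}

/* Hand-unrolled by four (illustrative only): fewer loop branches per
 * element, but more instructions, so the function takes more space. */
uint32_t sum_unrolled_by_4(const uint32_t *data, size_t count)
{
    uint32_t total = 0;
    size_t i = 0;

    /* Main body handles four elements per iteration. */
    for (; i + 4 <= count; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    /* Remainder loop handles the last 0 to 3 elements. */
    for (; i < count; i++) {
        total += data[i];
    }
    return total;
}
```

Both functions return the same result; the second simply trades code size for fewer branches, which is the kind of choice -Os is designed to avoid.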

 

You can modify the optimization option used by a particular build configuration to -Os by selecting "MCU C Compiler - Optimization" in Project Properties and entering -Os into the "Other optimization options" field.

 

Actual code size and performance

 


The code size and performance results you will obtain for your particular source code at a particular optimization level are very much dependent on both the source code and the system that the code is executed on. It is well worth compiling at a number of optimization levels and seeing which provides the result closest to your ideal balance of code size and performance. The optimization level number gives an indication of what the heuristics within the compiler's optimization engine will try to achieve for an average system. It is not a guarantee of the actual result.

 

Optimization and Build Configurations

 


With the LPCXpresso IDE, the Debug configuration defaults to building at -O0, and the Release configuration typically defaults to -Os (this overrides the -O2 option that is typically also selected in the drop-down). You can modify the optimization option used by a particular build configuration as follows:

  1. Open the Project properties. There are a number of ways of doing this. For example, make sure the Project is highlighted in the Project Explorer view then open the menu "Project -> Properties".
  2. In the left-hand list of the Properties window, open "C/C++ Build" and select "Settings" and then the "Tool Settings" tab.
  3. Now choose "MCU C Compiler - Optimization" and select the required optimization level from the drop-down (removing -Os from the "Other optimization options" field).
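If you want to confirm from within the code which kind of configuration was actually built, GCC predefines the macro __OPTIMIZE__ for optimizing builds and additionally __OPTIMIZE_SIZE__ when -Os is in effect. The following is a small sketch built on those macros; the function name is invented for illustration, and the macros do not distinguish between -O1, -O2, -O3 and -Og.

```c
/* Reports the broad class of optimization this file was compiled with,
 * based on GCC's predefined macros. The function name is hypothetical. */
const char *build_optimization_kind(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "size";        /* built with -Os */
#elif defined(__OPTIMIZE__)
    return "optimized";   /* some optimizing level other than -Os */
#else
    return "none";        /* built with -O0 */
#endif
}
```

Printing this string at start-up in a test build is one simple way of checking that a change made in the project properties has taken effect.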

 

Optimized code fails to execute correctly

 


Very often the reason why code built with optimization fails to run correctly is that the code has not been written in an "optimization friendly" way. Two of the most common causes of problems are:

  1. Where variables which map onto memory mapped peripheral devices have not been marked as 'volatile'. With a debug build, such code can often work, as the compiler rarely optimizes memory accesses. But if such variables are not marked as 'volatile', a release build will generally optimize the accesses away. For more information, please see:
  2. Timing loops. If you have loops which simply count up to a particular value, for example while waiting for a configuration register to change, the count can often no longer be sufficient when the code is compiled for Release, because the optimized loop runs faster. In addition, if the variables used in such loops are not marked as volatile, they may well be optimized away altogether!
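Both problems can be sketched in a few lines of C. Everything named here is an assumption made for illustration: the "register" is an ordinary variable standing in for a real peripheral address (which would normally come from the device header), and the bit name and function names are invented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped status register. On real
 * hardware the pointer would hold a fixed peripheral address instead. */
static volatile uint32_t fake_status_storage;
static volatile uint32_t *const STATUS_REG = &fake_status_storage;

#define READY_BIT (1u << 0)   /* invented bit name */

/* Polls the register until READY_BIT is set or max_polls reads have been
 * made. Because the access is volatile, every iteration performs a real
 * read, even in an optimized build. */
bool wait_for_ready(uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*STATUS_REG & READY_BIT) != 0u) {
            return true;
        }
    }
    return false;
}

/* Crude busy-wait. The volatile counter stops the compiler from removing
 * the loop, but the time it takes still varies with optimization level
 * and clock speed, so a hardware timer is usually the safer choice. */
void crude_delay(uint32_t iterations)
{
    for (volatile uint32_t i = 0; i < iterations; i++) {
        /* intentionally empty */
    }
}
```

Without the volatile qualifier on the register, an optimizing build would be entitled to read it once and then spin forever (or not at all) on the cached value.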

 

Dead code elimination

 


The compiler will always carry out dead code elimination. Thus in this simple example:

 

 if (0) { /* Never true */
     printf("Dead Code\n");
 }

 

the 'if' clause is never true and so the compiler will remove the whole statement from the generated code, even when compiled at -O0.
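One practical use of this behavior is a build-time switch written as an ordinary 'if' rather than a preprocessor conditional. The sketch below assumes a hypothetical ENABLE_TRACE macro and invented function names; it is an illustration of the pattern, not something the IDE provides.

```c
#include <stdio.h>

#define ENABLE_TRACE 0   /* hypothetical build-time switch: 0 = off */

static unsigned trace_calls;   /* counts how often trace() has run */

static void trace(const char *msg)
{
    trace_calls++;
    printf("trace: %s\n", msg);
}

int scale_reading(int raw)
{
    if (ENABLE_TRACE) {   /* constant false, so this branch is dead code */
        trace("scale_reading called");
    }
    return raw * 3;
}
```

Compared with wrapping the call in #if/#endif, the disabled branch is still parsed and type-checked, so it is less likely to go stale while switched off.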

 

Debugging of optimized code

 


At -O0, one or more machine instructions can effectively be mapped onto a specific source statement. However as the level of optimization carried out by the compiler increases, the mapping between source code and the generated machine instructions becomes much more complex. For example, this may lead to instructions originally generated from a number of source lines being merged or reordered.

 

One of the consequences of this is that when debugging optimized code, program behavior can sometimes be different to what might be expected from just looking at the original source code.

 

For example a breakpoint set on the first source statement of a loop might only get hit once, as the actual breakpoint may have been set on an initialization instruction that only executes the first time through the loop.

 

Another example is that stepping a single source statement may sometimes lead to the program stopping at the previous source statement, so that it appears that the program has executed backwards! However, what has actually happened is that the next machine instruction is mapped to the previous source statement in the debug data (the part of the ELF image created by the compiler/linker that is used by the debugger to map between source and executable).

 

When debugging your application, it is therefore often sensible to use as low a level of optimization as possible (normally -O0) until you are sure that the program is correct algorithmically, and only then start to increase the optimization level. Note that you could do this for the whole project, or selectively using per-file properties.
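As a source-level alternative to per-file properties (this is a different mechanism from the IDE setting described above), GCC also accepts an 'optimize' function attribute that overrides the optimization level for a single function. The sketch below uses invented macro and function names; note that the GCC documentation describes this attribute as intended for debugging rather than production use, which is the situation assumed here.

```c
#include <stdint.h>

/* Hypothetical helper macro: applies GCC's optimize attribute where it is
 * available and expands to nothing on other compilers. */
#if defined(__GNUC__) && !defined(__clang__)
#define DEBUG_NO_OPT __attribute__((optimize("O0")))
#else
#define DEBUG_NO_OPT
#endif

/* This one function is compiled without optimization so that it can be
 * single-stepped easily, while the rest of the file keeps the level
 * selected in the build configuration. */
DEBUG_NO_OPT
uint32_t checksum_under_test(const uint8_t *buf, uint32_t len)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}
```

Once the function has been debugged, removing the macro returns it to the project-wide optimization level.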

 

And finally...

 


This article is very much an overview and only covers the top-level compiler options. There are a number of other compiler optimizations that you might want to consider using in particular circumstances. For more information on these, please see the GCC documentation within the LPCXpresso IDE built-in help system.

 

 
