As described elsewhere, the Linux kernel and all kernel modules are always built with optimization on. This is not just for performance reasons: the kernel code requires and assumes that it is optimized and will not function otherwise.

Unfortunately, debugging optimized code presents some extra challenges over normal debugging. In this article we describe some of the key effects to be aware of, and offer some tips on how to debug optimized code effectively.

Code reordering

The most immediately obvious effect when debugging optimized code is that the compiler has often reordered the code into a more efficient sequence. As a result, GDB "step" commands do not always take you to the source line you are expecting: lengthy sections may be skipped, or you may even be taken backwards!

A good way to execute predictably to a given source line is to use the GDB command advance <line_number>. This is implemented by setting a temporary breakpoint on the line specified, continuing, and removing the breakpoint when it has been hit. In STWorkbench, there is a "Run to line" button, which uses this command.
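
As a rough illustration, the following C sketch marks the kind of places where stepping through an optimized build may appear to jump around. The function and its names are hypothetical and not taken from the kernel, and the exact behaviour depends on the compiler and its options.

```c
/* Hypothetical example: returns the integer average of 'count' samples. */
int checked_average(const int *samples, int count)
{
        int total = 0;                  /* may be merged into the loop setup */
        int i;

        if (count <= 0)                 /* may appear to run before the line above */
                return 0;

        for (i = 0; i < count; i++)     /* stepping may bounce between these two lines */
                total += samples[i];

        return total / count;           /* a natural target for "advance" */
}
```

Rather than stepping line by line through the loop, using advance with the line number of the final return statement should stop there regardless of the order in which the earlier lines were executed.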

Inlining

Another heavily-used optimization is inlining. This is where the code for short functions is inserted into the code of the calling function, to avoid the overhead of a function call. This speeds up execution at the expense of code size.

Inlining causes two common effects when debugging. First, using the next command to step over a function call may not work — you will be taken into the function anyway. In this case, there is little option but to step to the end of the inlined function. This is another occasion where the advance command is useful. Do not use the finish command to "step out" of the function as this will take you out of the parent function.

Secondly, if a function begins with an inlined function, hitting a breakpoint on the outer function may actually display the code of the inlined function in the debugger. In this case, step through (not out of) the inlined function to reach the outer function.
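
The sketch below shows the shape of code that typically triggers both effects. The names are hypothetical, and whether the helper is really inlined is an assumption that depends on the compiler and its options.

```c
/* Hypothetical helper, small enough that the compiler is likely to inline it. */
static inline int clamp_percent(int value)
{
        if (value < 0)
                return 0;
        if (value > 100)
                return 100;
        return value;
}

/* Hypothetical outer function that begins with a call to the helper. */
int set_brightness(int requested)
{
        /*
         * If clamp_percent() is inlined, a breakpoint on set_brightness()
         * may show the source of clamp_percent() first, and "next" on
         * this line may step into it rather than over it.
         */
        int level = clamp_percent(requested);

        /* A suitable line to "advance" to once inside the inlined code. */
        return level * 255 / 100;
}
```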

Optimized-away variables

Where possible, the compiler often optimizes away variables. This means that when the code can be implemented in assembler without actually creating the variable, it will do so. Examples of this are where a variable is never read (but may still be of interest to the developer debugging the code), or where a variable is actually a copy of another variable which is only read from, in which case the original can be used.

Sometimes it can be impossible to find the value of a variable that has been optimized away. In other cases, it can be found with a little investigation. For example, if GDB reports:

(gdb) print i
No symbol "i" in current context.

then look for where i was declared. Is it a copy of another variable? Perhaps it is a void* that has been cast to another type. If so, it may be possible to print the original variable instead.
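
A minimal sketch of the void* case follows. The structure and function names are hypothetical and serve only to illustrate the pattern.

```c
/* Hypothetical driver state, used only for illustration. */
struct sensor_state {
        int id;
        int reading;
};

/*
 * 'state' is only a typed copy of 'cookie', so an optimizing compiler
 * need not give it a location of its own. If GDB cannot print 'state',
 * the same data can usually be reached through the original argument:
 *
 *         (gdb) print *(struct sensor_state *)cookie
 */
int sensor_read(void *cookie)
{
        struct sensor_state *state = cookie;

        return state->reading;
}
```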

It is also worth remembering that on an sh4, after a function call the returned value will be placed in register r0, so in this case:

        int i;
        i = test_func();
        i++;

although i may be optimized away, when stopped on the third line (i++) you can print $r0 to see what was returned by test_func().

Note: this will only work if test_func() was not inlined! You can confirm that by checking that the disassembled code includes a call instruction (such as jsr or bsr) for test_func().

Equally, the first four arguments of a function are placed in registers r4-r7 and can be examined there if necessary.
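
To make these register conventions concrete, here is a small sketch using a hypothetical function. The comments simply restate the sh4 conventions described above. The argument registers are only certain to hold the arguments on entry to the function, since the compiler is free to reuse them afterwards.

```c
/*
 * Hypothetical function with four integer arguments.
 *
 * On entry (sh4):  a is in r4, b in r5, c in r6, d in r7.
 * On return (sh4): the result is in r0.
 *
 * With a breakpoint at the start of the function:
 *
 *         (gdb) print $r4
 *         (gdb) print $r5
 */
int weighted_sum(int a, int b, int c, int d)
{
        return a + 2 * b + 3 * c + 4 * d;
}
```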

Tailcall optimization

Another optimization to be aware of is tailcall optimization. Sometimes, when the last thing a function does is call another function and return its result, the compiler replaces the call with a jump and sets up the pr (return address) register so that the called function returns directly to the parent of the function that called it. For example:

        int inner_function(void)
        {
                return 666;
        }
 
        int middle_function(void)
        {
                /* Do something... */
                return inner_function();
        }
 
        int outer_function(void)
        {
                int ret;

                /* Do something... */
                ret = middle_function();
                /* Do something else */
                return ret;
        }

In this case, the pr register may be set up so that inner_function() returns directly to outer_function(). When this happens, the frame for middle_function() is missing from the backtrace:

        #0  inner_function () at hello.c:5
        #1  0x0040040a in outer_function () at hello.c:19
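
For contrast, the sketch below shows one call in tail position and one that is not. The names are hypothetical, and it assumes the compiler does not inline inner_function(). Only the first caller is a candidate for tailcall optimization, because the second still has work to do after the call returns.

```c
int inner_function(void)
{
        return 666;
}

/*
 * Tail position: the call is the last thing this function does, so the
 * compiler may turn it into a jump. This frame can then be missing from
 * a backtrace taken inside inner_function().
 */
int middle_tail(void)
{
        return inner_function();
}

/*
 * Not in tail position: the addition happens after the call returns, so
 * a real call is needed and this frame remains visible in the backtrace.
 */
int middle_not_tail(void)
{
        return inner_function() + 1;
}
```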


Good luck!