Re: objtool/ORC generation for noreturn functions

From: Josh Poimboeuf
Date: Wed Jan 13 2021 - 13:43:26 EST


On Wed, Jan 13, 2021 at 11:44:22AM +0100, vanessa.hack@xxxxxx wrote:
> Hi,
> I am currently writing my final thesis at university on the topic of stack
> unwinding. My goal is to implement and evaluate stack unwinders for
> research operating system ports to x86 32 and 64 bit architectures and
> SPARC V8. 
> For the x86 ports I chose ORC as unwinding format due to its simplicity
> and reliability. So far, it works quite well (although I've ran into some
> minor issues with objtool as the research OS is written in C++). 
> But now I have some problems with functions that are explicitly marked as
> noreturn with the [[noreturn]] attribute, all following unwinding steps
> are unreliable. I have read in the objtool documentation that such
> functions have to be added to the objtool global_noreturn array.
> Unfortunately, I do not understand the purpose of that array and the
> intended ORC behaviour for noreturn functions. Are the unwinding steps
> that follow a noreturn intended to be unreliable? 

Hi Vanessa,

Nice thesis! I'm impressed (and a little surprised) that objtool/ORC is
working in a non-Linux environment. They were designed to be general
purpose, but we've added some Linux-isms to them over the years.
Congrats on getting that working.

What compiler is the OS built with?

As you've found, noreturn functions can be problematic. But they can be
unwound through correctly, if handled carefully.


1) Objtool impact

Consider the following code pattern, generated by a C compiler:

func_A:
	...
	...
	call some_noreturn_func

func_B:

If some_noreturn_func() were to return, func_A() would fall through to
func_B(), resulting in possibly disastrous undefined behavior. But
since some_noreturn_func() doesn't return, that can't happen. The
compiler knows it can't happen because of the noreturn attribute.

But if objtool doesn't know about the noreturn attribute, it assumes the
call can return, and execution can continue after it, resulting in the
fallthrough:

warning: objtool: func_A() falls through to next function func_B()

So that's the reason for the global_noreturn array. It lets objtool
know that execution doesn't continue after the call, so objtool can
follow the code flow intended by the compiler.
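
As a rough illustration of the idea (not objtool's actual code), the
check can be sketched as below. The array contents and the helper names
is_known_noreturn() and call_falls_through() are made up for this
example; a port would list its own noreturn functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list; a real port would name its own noreturn functions. */
static const char *const noreturn_names[] = {
	"panic",
	"some_noreturn_func",
};

/* Return true if a call to 'name' is known never to return. */
static bool is_known_noreturn(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(noreturn_names) / sizeof(noreturn_names[0]); i++) {
		if (!strcmp(name, noreturn_names[i]))
			return true;
	}
	return false;
}

/*
 * For a call that is the last instruction of a function: if the callee
 * might return, execution would continue into the next function.
 */
static bool call_falls_through(const char *callee)
{
	return !is_known_noreturn(callee);
}
```

With this sketch, a trailing call to some_noreturn_func() is treated as
a dead end, while a trailing call to an unlisted function would be
reported as a fallthrough.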

Note that in addition, objtool tries to detect calls to noreturn
functions in the same .o file, even if they don't have the noreturn
attribute. This matches GCC behavior, which automatically marks them as
noreturn even if they're missing the annotation.


2) ORC impact

Usually, an address on the stack is placed there by a call instruction,
which pushes the return address on the stack before jumping to the
called function. The return address points to the instruction *after*
the call instruction. If you use that address to look up the ORC entry,
it will be right most of the time, because the call instruction doesn't
change the stack layout, so the next instruction usually has the same
stack layout as the call instruction.

However, if the call is to a noreturn function, then the next
instruction might not have the same stack layout. Take the above
scenario with the call to some_noreturn_func(): the return address
placed on the stack will be that of func_B(), because that happens to be
the instruction after the call. But func_B() probably has a different
layout, so passing the address of func_B() to the ORC lookup will
corrupt the unwind.

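What you really want to use for the lookup is the address of the call
instruction itself. In the case of ORC you can just subtract one from
the address on the stack: ORC entries cover address ranges, so any
address inside the call instruction finds the right entry.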
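
To make the effect concrete, here is a minimal sketch with a made-up,
simplified unwind table. The struct, the addresses and the offsets are
all invented for illustration and are not the real ORC format. It
assumes func_A ends at 0x100f with the call as its last instruction, and
func_B starts at 0x1010.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an ORC entry. */
struct unwind_entry {
	unsigned long start;	/* first address the entry covers */
	int sp_offset;		/* offset from SP to the caller's frame */
};

/* Sorted by start address. */
static const struct unwind_entry table[] = {
	{ 0x1000, 8 },	/* func_A, on entry */
	{ 0x1004, 40 },	/* func_A, after its prologue */
	{ 0x1010, 8 },	/* func_B, on entry */
};

/* Binary search for the last entry whose start is <= addr. */
static const struct unwind_entry *lookup(unsigned long addr)
{
	size_t lo = 0, hi = sizeof(table) / sizeof(table[0]);

	if (addr < table[0].start)
		return NULL;

	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].start <= addr)
			lo = mid;
		else
			hi = mid;
	}
	return &table[lo];
}
```

Here the pushed return address is 0x1010. lookup(0x1010) finds func_B's
entry (offset 8), which is wrong for the frame being unwound, while
lookup(0x1010 - 1) finds func_A's post-prologue entry (offset 40).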

This is described in orc_unwind.c:

/*
 * For a call frame (as opposed to a signal frame), state->ip points to
 * the instruction after the call. That instruction's stack layout
 * could be different from the call instruction's layout, for example
 * if the call was to a noreturn function. So get the ORC data for the
 * call instruction itself.
 */
orc = orc_find(state->signal ? state->ip : state->ip - 1);

Notice there's one edge case where you *don't* subtract one from the
address. That's when the address is placed on the stack for a reason
*other* than a call.

That can happen in a "signal" frame, where an interrupt/signal handler
places the preempted task's registers on the stack. In that case the
ORC type is UNWIND_HINT_TYPE_REGS and the address is retrieved from
regs->ip, which is used as-is (without subtracting one), because there
was no call.


I hope that makes sense. Let me know if you have any more questions.

Also, please let me know when the paper is available to read :-)

--
Josh