[RFC PATCH v6 0/3] arm64: Implement stack trace reliability checks

From: madvenka
Date: Wed Jun 30 2021 - 18:34:49 EST


From: "Madhavan T. Venkataraman" <madvenka@xxxxxxxxxxxxxxxxxxx>

Unwinder return value
=====================

Currently, the unwinder returns a tri-state return value:

0 means "continue with the unwind"
-ENOENT means "successful termination of the stack trace"
-EINVAL means "fatal error, abort the stack trace"

This is confusing. To fix this, define an enumeration of different return
codes to make it clear.

enum {
UNWIND_CONTINUE, /* No errors encountered */
UNWIND_ABORT, /* Fatal errors encountered */
UNWIND_FINISH, /* End of stack reached successfully */
};

Reliability checks
==================

There are a number of places in kernel code where the stack trace is not
reliable. Enhance the unwinder to check for those cases.

Return address check
--------------------

Check the return PC of every stack frame to make sure that it is a valid
kernel text address (and not some generated code, for example).

Low-level function check
------------------------

Low-level assembly functions are, by nature, unreliable from an unwinder
perspective. The unwinder must check for them in a stacktrace. See the
"Assembly Functions" section below.

Other checks
------------

Other checks may be added in the future. Once all of the checks are in place,
the unwinder can provide a reliable stack trace. But before this can be used
for livepatch, some other entity needs to validate the frame pointer in kernel
functions. objtool is currently being worked on to address that need.

Return code
-----------

If a reliability check fails, it is a non-fatal error. The unwinder needs to
return an appropriate code so the caller knows that some non-fatal error has
occurred. Add another code to the enumeration:

enum {
UNWIND_CONTINUE, /* No errors encountered */
UNWIND_CONTINUE_WITH_RISK, /* Non-fatal errors encountered */
UNWIND_ABORT, /* Fatal errors encountered */
UNWIND_FINISH, /* End of stack reached successfully */
};

When the unwinder returns UNWIND_CONTINUE_WITH_RISK:

- Debug-type callers can choose to continue the unwind

- Livepatch-type callers will stop unwinding

So, arch_stack_walk_reliable() (implemented in the future) will look like
this:

/*
* Walk the stack like arch_stack_walk() but stop the walk as soon as
* some unreliability is detected in the stack.
*/
int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
void *cookie, struct task_struct *task)
{
struct stackframe frame;
enum unwind_rc rc;

if (task == current) {
rc = start_backtrace(&frame,
(unsigned long)__builtin_frame_address(0),
(unsigned long)arch_stack_walk_reliable);
} else {
/*
* The task must not be running anywhere for the duration of
* arch_stack_walk_reliable(). The caller must guarantee
* this.
*/
rc = start_backtrace(&frame,
thread_saved_fp(task),
thread_saved_pc(task));
}

while (rc == UNWIND_CONTINUE) {
if (!consume_entry(cookie, frame.pc))
return -EINVAL;
rc = unwind_frame(task, &frame);
}

return rc == UNWIND_FINISH ? 0 : -EINVAL;
}

Assembly functions
==================

There are a number of assembly functions in arm64. Except for a couple of
them, these functions do not have a frame pointer prolog or epilog. Also,
many of them manipulate low-level state such as registers. These functions
are, by definition, unreliable from a stack unwinding perspective. That is,
when these functions occur in a stack trace, the unwinder would not be able
to unwind through them reliably.

Assembly functions are defined as SYM_FUNC_*() functions or SYM_CODE_*()
functions. objtool peforms static analysis of SYM_FUNC functions. It ignores
SYM_CODE functions because they have low level code that is difficult to
analyze. When objtool becomes ready eventually, SYM_FUNC functions will
be analyzed and "fixed" as necessary. So, they are not "interesting" for
the reliable unwinder.

That leaves SYM_CODE functions. It is for the unwinder to deal with these
for reliable stack trace. The unwinder needs to do the following:

- Recognize SYM_CODE functions in a stack trace.

- If a particular SYM_CODE function can be unwinded through using
some special logic, then do it. E.g., the return trampoline for
Function Graph Tracing.

- Otherwise, return UNWIND_CONTINUE_WITH_RISK.

Current approach
================

Define an ARM64 version of SYM_CODE_END() like this:

#define SYM_CODE_END(name) \
SYM_END(name, SYM_T_NONE) ;\
99: ;\
.pushsection "sym_code_functions", "aw" ;\
.quad name ;\
.quad 99b ;\
.popsection

The above macro does the usual SYM_END(). In addition, it records the
function's address range in a special section called "sym_code_functions".
This way, all SYM_CODE functions get recorded in the section automatically.

Implement an early_initcall() called init_sym_code_functions() that allocates
an array called sym_code_functions[] and copies the function ranges from the
section to the array.

Add a reliability check in unwind_check_frame() that compares a return
PC with sym_code_functions[]. If there is a match, then return
UNWIND_CONTINUE_WITH_RISK.

Call unwinder_is_unreliable() on every return PC from unwind_frame(). If there
is a match, then return UNWIND_CONTINUE_WITH_RISK.

Last stack frame
================

If a SYM_CODE function occurs in the very last frame in the stack trace,
then the stack trace is not considered unreliable. This is because there
is no more unwinding to do. Examples:

- EL0 exception stack traces end in the top level EL0 exception
handlers.

- All kernel thread stack traces end in ret_from_fork().

Special SYM_CODE functions
==========================

The return trampolines of the Function Graph Tracer and Kretprobe can
be recognized by the unwinder. If the return PCs can be translated to the
original PCs, then, the unwinder should perform that translation before
checking for reliability. The final PC that we end up with after all the
translations is the one we need to check for reliability.

Accordingly, I have moved the reliability checks to after the logic that
handles the Function Graph Tracer.

So, the approach is - all SYM_CODE functions are unreliable. If a SYM_CODE
function is "fixed" to make it reliable, then it should become a SYM_FUNC
function. Or, if the unwinder has special logic to unwind through a SYM_CODE
function, then that can be done.

Special cases
=============

Some special cases need to be mentioned:

- EL1 interrupt and exception handlers end up in sym_code_ranges[].
So, all EL1 interrupt and exception stack traces will be considered
unreliable. This the correct behavior as interrupts and exceptions
can happen on any instruction including ones in the frame pointer
prolog and epilog. Unless objtool generates metadata so the unwinder
can unwind through these special cases, such stack traces will be
considered unreliable.

- A task can get preempted at the end of an interrupt. Stack traces
of preempted tasks will show the interrupt frame in the stack trace
and will be considered unreliable.

- Breakpoints are exceptions. So, all stack traces in the break point
handler (including probes) will be considered unreliable.

- All of the ftrace trampolines end up in sym_code_functions[]. All
stack traces taken from tracer functions will be considered
unreliable.
---
Changelog:

v6:
From Mark Rutland:

- The per-frame reliability concept and flag are acceptable. But more
work is needed to make the per-frame checks more accurate and more
complete. E.g., some code reorg is being worked on that will help.

I have now removed the frame->reliable flag and deleted the whole
concept of per-frame status. This is orthogonal to this patch series.
Instead, I have improved the unwinder to return proper return codes
so a caller can take appropriate action without needing per-frame
status.

- Remove the mention of PLTs and update the comment.

I have replaced the comment above the call to __kernel_text_address()
with the comment suggested by Mark Rutland.

Other comments:

- Other comments on the per-frame stuff are not relevant because
that approach is not there anymore.

v5:
From Keiya Nobuta:

- The term blacklist(ed) is not to be used anymore. I have changed it
to unreliable. So, the function unwinder_blacklisted() has been
changed to unwinder_is_unreliable().

From Mark Brown:

- Add a comment for the "reliable" flag in struct stackframe. The
reliability attribute is not complete until all the checks are
in place. Added a comment above struct stackframe.

- Include some of the comments in the cover letter in the actual
code so that we can compare it with the reliable stack trace
requirements document for completeness. I have added a comment:

- above unwinder_is_unreliable() that lists the requirements
that are addressed by the function.

- above the __kernel_text_address() call about all the cases
the call covers.

v4:
From Mark Brown:

- I was checking the return PC with __kernel_text_address() before
the Function Graph trace handling. Mark Brown felt that all the
reliability checks should be performed on the original return PC
once that is obtained. So, I have moved all the reliability checks
to after the Function Graph Trace handling code in the unwinder.
Basically, the unwinder should perform PC translations first (for
rhe return trampoline for Function Graph Tracing, Kretprobes, etc).
Then, the reliability checks should be applied to the resulting
PC.

- Mark said to improve the naming of the new functions so they don't
collide with existing ones. I have used a prefix "unwinder_" for
all the new functions.

From Josh Poimboeuf:

- In the error scenarios in the unwinder, the reliable flag in the
stack frame should be set. Implemented this.

- Some of the other comments are not relevant to the new code as
I have taken a different approach in the new code. That is why
I have not made those changes. E.g., Ard wanted me to add the
"const" keyword to the global section array. That array does not
exist in v4. Similarly, Mark Brown said to use ARRAY_SIZE() for
the same array in a for loop.

Other changes:

- Add a new definition for SYM_CODE_END() that adds the address
range of the function to a special section called
"sym_code_functions".

- Include the new section under initdata in vmlinux.lds.S.

- Define an early_initcall() to copy the contents of the
"sym_code_functions" section to an array by the same name.

- Define a function unwinder_blacklisted() that compares a return
PC against sym_code_sections[]. If there is a match, mark the
stack trace unreliable. Call this from unwind_frame().

v3:
- Implemented a sym_code_ranges[] array to contains sections bounds
for text sections that contain SYM_CODE_*() functions. The unwinder
checks each return PC against the sections. If it falls in any of
the sections, the stack trace is marked unreliable.

- Moved SYM_CODE functions from .text and .init.text into a new
text section called ".code.text". Added this section to
vmlinux.lds.S and sym_code_ranges[].

- Fixed the logic in the unwinder that handles Function Graph
Tracer return trampoline.

- Removed all the previous code that handles:
- ftrace entry code for traced function
- special_functions[] array that lists individual functions
- kretprobe_trampoline() special case

v2
- Removed the terminating entry { 0, 0 } in special_functions[]
and replaced it with the idiom { /* sentinel */ }.

- Change the ftrace trampoline entry ftrace_graph_call in
special_functions[] to ftrace_call + 4 and added explanatory
comments.

- Unnested #ifdefs in special_functions[] for FTRACE.

v1
- Define a bool field in struct stackframe. This will indicate if
a stack trace is reliable.

- Implement a special_functions[] array that will be populated
with special functions in which the stack trace is considered
unreliable.

- Using kallsyms_lookup(), get the address ranges for the special
functions and record them.

- Implement an is_reliable_function(pc). This function will check
if a given return PC falls in any of the special functions. If
it does, the stack trace is unreliable.

- Implement check_reliability() function that will check if a
stack frame is reliable. Call is_reliable_function() from
check_reliability().

- Before a return PC is checked against special_funtions[], it
must be validates as a proper kernel text address. Call
__kernel_text_address() from check_reliability().

- Finally, call check_reliability() from unwind_frame() for
each stack frame.

- Add EL1 exception handlers to special_functions[].

el1_sync();
el1_irq();
el1_error();
el1_sync_invalid();
el1_irq_invalid();
el1_fiq_invalid();
el1_error_invalid();

- The above functions are currently defined as LOCAL symbols.
Make them global so that they can be referenced from the
unwinder code.

- Add FTRACE trampolines to special_functions[]:

ftrace_graph_call()
ftrace_graph_caller()
return_to_handler()

- Add the kretprobe trampoline to special functions[]:

kretprobe_trampoline()

Previous versions and discussion
================================

v5: https://lore.kernel.org/linux-arm-kernel/20210526214917.20099-1-madvenka@xxxxxxxxxxxxxxxxxxx/
v4: https://lore.kernel.org/linux-arm-kernel/20210516040018.128105-1-madvenka@xxxxxxxxxxxxxxxxxxx/
v3: https://lore.kernel.org/linux-arm-kernel/20210503173615.21576-1-madvenka@xxxxxxxxxxxxxxxxxxx/
v2: https://lore.kernel.org/linux-arm-kernel/20210405204313.21346-1-madvenka@xxxxxxxxxxxxxxxxxxx/
v1: https://lore.kernel.org/linux-arm-kernel/20210330190955.13707-1-madvenka@xxxxxxxxxxxxxxxxxxx/
Madhavan T. Venkataraman (3):
arm64: Improve the unwinder return value
arm64: Introduce stack trace reliability checks in the unwinder
arm64: Create a list of SYM_CODE functions, check return PC against
list

arch/arm64/include/asm/linkage.h | 12 ++
arch/arm64/include/asm/sections.h | 1 +
arch/arm64/include/asm/stacktrace.h | 16 ++-
arch/arm64/kernel/perf_callchain.c | 5 +-
arch/arm64/kernel/process.c | 8 +-
arch/arm64/kernel/return_address.c | 10 +-
arch/arm64/kernel/stacktrace.c | 180 ++++++++++++++++++++++++----
arch/arm64/kernel/time.c | 9 +-
arch/arm64/kernel/vmlinux.lds.S | 7 ++
9 files changed, 213 insertions(+), 35 deletions(-)


base-commit: bf05bf16c76bb44ab5156223e1e58e26dfe30a88
--
2.25.1