[PATCH] x86/unwind/orc: add ELF section with ORC version number

From: Omar Sandoval
Date: Thu Jun 08 2023 - 18:38:48 EST


From: Omar Sandoval <osandov@xxxxxx>

Commits ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC
metadata") and fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in
two") changed the ORC format. Although ORC is internal to the kernel,
it's the only way for external tools to get reliable kernel stack traces
on x86-64. In particular, the drgn debugger [1] uses ORC for stack
unwinding, and these format changes broke it [2]. As the drgn
maintainer, I don't care how often or how much the kernel changes the
ORC format as long as I have a way to detect the change. Using the
kernel version is not a solution because distros frequently backport
changes.

It suffices to store a version number for the ORC format in the vmlinux
and kernel module ELF files (to use when parsing ORC sections from ELF),
and in kernel memory (to use when parsing ORC from a core dump). This
patch adds both of these by creating an .orc_header ELF section
containing a 4-byte version number and the corresponding
__start_orc_header and __stop_orc_header symbols.

The current version number is 3. Version 1 is the original version
merged in commit ee9f8fce9964 ("x86/unwind: Add the ORC unwinder").
Version 2 is the version from commit ffb1b4a41016 ("x86/unwind/orc: Add
'signal' field to ORC metadata"), which obviously didn't include this
header but could get it in a backport to the 6.3 stable branch.

1: https://github.com/osandov/drgn
2: https://github.com/osandov/drgn/issues/303

Signed-off-by: Omar Sandoval <osandov@xxxxxx>
---
Hi,

As mentioned in the commit message, the motivation for this patch is
allowing drgn to continue to make use of ORC for kernel stack unwinding.

I want to make it clear that I don't want ORC to be stable ABI. The
kernel is free to change the format as much as needed; I just need a way
to detect the change. (drgn already pokes at many kernel internals and
needs updates for most kernel versions anyway. We have a big test suite
to catch changes we care about.)

I'm not at all married to (or proud of) this particular implementation;
I'd be happy to use anything that lets me detect the format version in
both cases mentioned in the commit message (ELF file or core dump +
symbol table).
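
For illustration only (this is not part of the patch), here is a minimal
sketch of how a consumer might decode the header in either case. It
assumes the tool has already obtained the raw bytes, either from the
.orc_header section of an ELF file or from the memory between
__start_orc_header and __stop_orc_header, and that the data is
little-endian as on x86-64. The function names and the "supported
versions" policy are hypothetical, not drgn's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Per the patch, the header is currently just a 4-byte version number. */
#define ORC_HEADER_SIZE 4

/*
 * Decode the ORC version from the raw contents of .orc_header.
 * Hypothetical helper: returns 0 on success, -1 if the buffer is too short.
 * The bytes are read as little-endian, which is an assumption that holds
 * for x86-64 kernels.
 */
static int orc_header_version(const unsigned char *data, size_t size,
			      uint32_t *version)
{
	if (size < ORC_HEADER_SIZE)
		return -1;
	*version = (uint32_t)data[0] |
		   (uint32_t)data[1] << 8 |
		   (uint32_t)data[2] << 16 |
		   (uint32_t)data[3] << 24;
	return 0;
}

/*
 * Example tool policy (an assumption, not something the kernel defines):
 * accept only versions the tool has been taught to parse. Version 1
 * predates the header, so only 2 (via a backport) and 3 are expected here.
 */
static int orc_version_supported(uint32_t version)
{
	return version == 2 || version == 3;
}
```

A tool would call orc_header_version() on the section contents and fall
back to its own heuristics when the section or symbols are missing,
since older kernels do not have them.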

It'd be great if we could get a solution in before 6.4 is released. I
would've reported this sooner, but I just got back from paternity leave
last week.

Thanks!
Omar

 arch/x86/include/asm/orc_header.h      | 14 ++++++++++++++
 arch/x86/include/asm/orc_types.h       | 14 ++++++++++++++
 arch/x86/kernel/unwind_orc.c           |  3 +++
 include/asm-generic/vmlinux.lds.h      |  3 +++
 scripts/mod/modpost.c                  |  5 +++++
 tools/arch/x86/include/asm/orc_types.h | 14 ++++++++++++++
 6 files changed, 53 insertions(+)
 create mode 100644 arch/x86/include/asm/orc_header.h

diff --git a/arch/x86/include/asm/orc_header.h b/arch/x86/include/asm/orc_header.h
new file mode 100644
index 000000000000..08c3710311f7
--- /dev/null
+++ b/arch/x86/include/asm/orc_header.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#ifndef _ORC_HEADER_H
+#define _ORC_HEADER_H
+
+#include <asm/orc_types.h>
+
+/* For now, the header is just the 4-byte version number. */
+#define ORC_HEADER \
+ __used __section(".orc_header") \
+ static const u32 orc_header = ORC_VERSION
+
+#endif /* _ORC_HEADER_H */
diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h
index 46d7e06763c9..fe1669284254 100644
--- a/arch/x86/include/asm/orc_types.h
+++ b/arch/x86/include/asm/orc_types.h
@@ -9,6 +9,20 @@
#include <linux/types.h>
#include <linux/compiler.h>

+/*
+ * Bump this whenever the format of .orc_unwind, .orc_unwind_ip, or .orc_lookup
+ * changes.
+ *
+ * - Version 2:
+ * - Added struct orc_entry.signal
+ * - Version 3:
+ * - Removed struct orc_entry.end
+ * - Made struct orc_entry.type 3 bits
+ * - Added ORC_TYPE_UNDEFINED and ORC_TYPE_END_OF_STACK
+ * - Renumbered ORC_TYPE_CALL, ORC_TYPE_REGS, and ORC_TYPE_REGS_PARTIAL
+ */
+#define ORC_VERSION 3
+
/*
* The ORC_REG_* registers are base registers which are used to find other
* registers on the stack.
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 3ac50b7298d1..4d8e518365f4 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -7,6 +7,9 @@
#include <asm/unwind.h>
#include <asm/orc_types.h>
#include <asm/orc_lookup.h>
+#include <asm/orc_header.h>
+
+ORC_HEADER;

#define orc_warn(fmt, ...) \
printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index cebdf1ca415d..da9e5629ea43 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -839,6 +839,9 @@

#ifdef CONFIG_UNWINDER_ORC
#define ORC_UNWIND_TABLE \
+ .orc_header : AT(ADDR(.orc_header) - LOAD_OFFSET) { \
+ BOUNDED_SECTION_BY(.orc_header, _orc_header) \
+ } \
. = ALIGN(4); \
.orc_unwind_ip : AT(ADDR(.orc_unwind_ip) - LOAD_OFFSET) { \
BOUNDED_SECTION_BY(.orc_unwind_ip, _orc_unwind_ip) \
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index d4531d09984d..c12150f96b88 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1979,6 +1979,11 @@ static void add_header(struct buffer *b, struct module *mod)
buf_printf(b, "#include <linux/vermagic.h>\n");
buf_printf(b, "#include <linux/compiler.h>\n");
buf_printf(b, "\n");
+ buf_printf(b, "#ifdef CONFIG_UNWINDER_ORC\n");
+ buf_printf(b, "#include <asm/orc_header.h>\n");
+ buf_printf(b, "ORC_HEADER;\n");
+ buf_printf(b, "#endif\n");
+ buf_printf(b, "\n");
buf_printf(b, "BUILD_SALT;\n");
buf_printf(b, "BUILD_LTO_INFO;\n");
buf_printf(b, "\n");
diff --git a/tools/arch/x86/include/asm/orc_types.h b/tools/arch/x86/include/asm/orc_types.h
index 46d7e06763c9..fe1669284254 100644
--- a/tools/arch/x86/include/asm/orc_types.h
+++ b/tools/arch/x86/include/asm/orc_types.h
@@ -9,6 +9,20 @@
#include <linux/types.h>
#include <linux/compiler.h>

+/*
+ * Bump this whenever the format of .orc_unwind, .orc_unwind_ip, or .orc_lookup
+ * changes.
+ *
+ * - Version 2:
+ * - Added struct orc_entry.signal
+ * - Version 3:
+ * - Removed struct orc_entry.end
+ * - Made struct orc_entry.type 3 bits
+ * - Added ORC_TYPE_UNDEFINED and ORC_TYPE_END_OF_STACK
+ * - Renumbered ORC_TYPE_CALL, ORC_TYPE_REGS, and ORC_TYPE_REGS_PARTIAL
+ */
+#define ORC_VERSION 3
+
/*
* The ORC_REG_* registers are base registers which are used to find other
* registers on the stack.
--
2.40.1