[tip:x86/asm] x86/entry/64: Initialize the top of the IRQ stack before switching stacks

From: tip-bot for Andy Lutomirski
Date: Tue Jul 18 2017 - 06:47:08 EST


Commit-ID: 2995590964da93e1fd9a91550f9c9d9fab28f160
Gitweb: http://git.kernel.org/tip/2995590964da93e1fd9a91550f9c9d9fab28f160
Author: Andy Lutomirski <luto@xxxxxxxxxx>
AuthorDate: Tue, 11 Jul 2017 10:33:39 -0500
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Tue, 18 Jul 2017 10:56:23 +0200

x86/entry/64: Initialize the top of the IRQ stack before switching stacks

The OOPS unwinder wants the word at the top of the IRQ stack to
point back to the previous stack at all times when the IRQ stack
is in use. There's currently a one-instruction window in ENTER_IRQ_STACK
during which this isn't the case. Fix it by writing the old RSP to the
top of the IRQ stack before jumping.

This currently writes the pointer to the stack twice, which is a bit
ugly. We could get rid of this by replacing irq_stack_ptr with
irq_stack_ptr_minus_eight (better name welcome). OTOH, there may be
all kinds of odd microarchitectural considerations in play that
affect performance by a few cycles here.

Reported-by: Mike Galbraith <efault@xxxxxx>
Reported-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
Signed-off-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Jiri Slaby <jslaby@xxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: live-patching@xxxxxxxxxxxxxxx
Link: http://lkml.kernel.org/r/aae7e79e49914808440ad5310ace138ced2179ca.1499786555.git.jpoimboe@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
arch/x86/entry/entry_64.S | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0d4483a..b56f7f2 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -469,6 +469,7 @@ END(irq_entries_start)
DEBUG_ENTRY_ASSERT_IRQS_OFF
movq %rsp, \old_rsp
incl PER_CPU_VAR(irq_count)
+ jnz .Lirq_stack_push_old_rsp_\@

/*
* Right now, if we just incremented irq_count to zero, we've
@@ -478,9 +479,30 @@ END(irq_entries_start)
* it must be *extremely* careful to limit its stack usage. This
* could include kprobes and a hypothetical future IST-less #DB
* handler.
+ *
+ * The OOPS unwinder relies on the word at the top of the IRQ
+ * stack linking back to the previous RSP for the entire time we're
+ * on the IRQ stack. For this to work reliably, we need to write
+ * it before we actually move ourselves to the IRQ stack.
+ */
+
+ movq \old_rsp, PER_CPU_VAR(irq_stack_union + IRQ_STACK_SIZE - 8)
+ movq PER_CPU_VAR(irq_stack_ptr), %rsp
+
+#ifdef CONFIG_DEBUG_ENTRY
+ /*
+ * If the first movq above becomes wrong due to IRQ stack layout
+ * changes, the only way we'll notice is if we try to unwind right
+ * here. Assert that we set up the stack right to catch this type
+ * of bug quickly.
*/
+ cmpq -8(%rsp), \old_rsp
+ je .Lirq_stack_okay\@
+ ud2
+ .Lirq_stack_okay\@:
+#endif

- cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
+.Lirq_stack_push_old_rsp_\@:
pushq \old_rsp
.endm