Re: crash in entry.S restore_all, 2.6.12-rc2, x86, PAGEALLOC

From: Stas Sergeev
Date: Mon Apr 11 2005 - 12:18:07 EST


Hello.

Andrew Morton wrote:
This is utterly obscure - it needs a comment so that readers know what that
"- 8" is doing there.
Yes, that was only an RFC thing.
And now since there were not too much
of an FC, I prepared the "polished"
version. But apparently you already
released -mm3:)

Well, at least you can still apply the
comments if you feel like that. Here
they are.

Signed-off-by: Stas Sergeev <stsp@xxxxxxxx>

--- linux-2.6.12-rc2/arch/i386/kernel/entry.S 2005-04-06 09:34:35.000000000 +0400
+++ linux/arch/i386/kernel/entry.S 2005-04-11 10:49:28.000000000 +0400
@@ -245,6 +245,9 @@

restore_all:
movl EFLAGS(%esp), %eax # mix EFLAGS, SS and CS
+ # Warning: OLDSS(%esp) contains the wrong/random values if we
+ # are returning to the kernel.
+ # See comments in process.c:copy_thread() for details.
movb OLDSS(%esp), %ah
movb CS(%esp), %al
andl $(VM_MASK | (4 << 8) | 3), %eax
--- linux-2.6.12-rc2/arch/i386/kernel/process.c 2005-04-06 09:34:35.000000000 +0400
+++ linux/arch/i386/kernel/process.c 2005-04-11 10:30:39.000000000 +0400
@@ -394,6 +394,16 @@
childregs->esp = esp;

p->thread.esp = (unsigned long) childregs;
+ /*
+ * The below -8 is to reserve 8 bytes on top of the ring0 stack.
+ * This is necessary to guarantee that the entire "struct pt_regs"
+ * is accessable even if the CPU haven't stored the SS/ESP registers
+ * on the stack (interrupt gate does not save these registers
+ * when switching to the same priv ring).
+ * Therefore beware: accessing the xss/esp fields of the
+ * "struct pt_regs" is possible, but they may contain the
+ * completely wrong values.
+ */
p->thread.esp0 = (unsigned long) (childregs+1) - 8;

p->thread.eip = (unsigned long) ret_from_fork;