Re: 2.6.12-rc2-mm3

From: Andrew Morton
Date: Tue Apr 12 2005 - 01:28:33 EST


Stas Sergeev <stsp@xxxxxxxx> wrote:
>
> Hello.
>
> Andrew Morton wrote:
> >> Program received signal SIGTRAP, Trace/breakpoint trap.
> SIGTRAP - it looks like the "int $3"
> triggered, not "mov 0x30(%esp),%eax",
> which is just the next insn and so the
> %eip points to it, but it might be
> innocent. And besides, 0x30(%esp) is
> EFLAGS, not OLDSS. So I think maybe my
> patch is not guilty this time, it is
> just the non-zero preempt count on the
> return path caused by something else.

OK, the `int $3' is part of the CONFIG_TRAP_BAD_SYSCALL_EXITS thing which I
never use.

I'm not sure what problem is actually being reported here, now you mention it.

> >> (gdb) p $eip
> >> $1 = (void *) 0xc0102ee7
> Could you please also do
> "p $esp" or "info reg", so that we can
> see the rest of the registers?
>
> >> And as we see, we're at the "mov 0x30(%esp),%eax" which accesses above the
> >> bottom of the stack.
> But that's strange. Another instance of
> the 0x30(%esp) is there a few instructions
> above this one, see it with "disas restore_all".
> It is much more likely that the real offender
> is the previous instruction. $eip points to
> the instruction *after* the trap, which might
> be innocent.

Yup. But are you sure that the "+ 8" is correct, given these offsets are
larger than that?
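
[A minimal sketch of the offset arithmetic under discussion, not code from
the kernel tree. The struct name i386_frame and both helper functions are
hypothetical; the field order is an assumption chosen to agree with the
offsets quoted in this thread (0x30 = EFLAGS, esp+56 = OLDSS). It shows why
a read at offset 0x38 falls past the end of the stack when the two-word
SS/ESP pair was never pushed, and how reserving 8 bytes below the stack top
keeps that read in bounds.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the saved-register frame; the field order is
 * assumed from the offsets quoted in the thread. */
struct i386_frame {
    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
    uint32_t xds, xes, orig_eax;
    uint32_t eip, xcs, eflags;  /* present in every frame */
    uint32_t oldesp, oldss;     /* present only on a privilege change */
};

/* Compile-time checks of the two offsets mentioned in the thread. */
_Static_assert(offsetof(struct i386_frame, eflags) == 0x30, "EFLAGS at 0x30");
_Static_assert(offsetof(struct i386_frame, oldss) == 56, "OLDSS at esp+56");

/* Returns 1 if the 4-byte slot at frame_base + offset ends at or below
 * stack_end, 0 if it reaches past the end of the stack. */
static int slot_in_stack(uint32_t frame_base, uint32_t offset,
                         uint32_t stack_end)
{
    return frame_base + offset + sizeof(uint32_t) <= stack_end;
}

/* Base address of a frame that lacks OLDESP/OLDSS, built downward from
 * esp0 = stack_end - reserve. */
static uint32_t short_frame_base(uint32_t stack_end, uint32_t reserve)
{
    return stack_end - reserve
           - (uint32_t)offsetof(struct i386_frame, oldesp);
}
```

[With reserve = 0 the OLDSS slot of such a short frame lies entirely above
stack_end; with reserve = 8 it ends exactly at stack_end.]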

> >> After applying nmi_stack_correct-fix.patch, rc2-mm3
> I can't find this one in the -mm broken-out patches.

It was in rc2-mm2.

> Where is this patch?
> Could you please also test this one:
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0504.0/1287.html
>
> > Interesting. It could be an interaction between the kgdb patch and the new
> > vm86 checking code.
> I think so too, will have a look if I can
> reproduce it.
>
> > The above code is accessing esp+56,
> Yes, but this particular instruction was
> not reached. "int $3" killed the system
> for some reason.

Probably it decided that some syscall got a "bad exit". Disable
CONFIG_TRAP_BAD_SYSCALL_EXITS.

> > - p->thread.esp0 = (unsigned long) (childregs+1) - 8;
> > + p->thread.esp0 = (unsigned long) (childregs+1) - 15;
> 15 is somewhat nasty - it will make the
> stack unaligned, should better be 16 I
> think.

? It's still 4-byte-aligned.

> But I don't see why; the only
> scenario we've seen was the unsaved
> SS/ESP, which is only 8 bytes.
> But the
> If we definitely think my patch is guilty
> again, then probably something like this
> is necessary:
>
> --- linux/include/asm-i386/processor.h.old 2005-03-20 14:13:02.000000000 +0300
> +++ linux/include/asm-i386/processor.h 2005-04-12 07:50:11.000000000 +0400
> @@ -458,7 +458,7 @@
> * be within the limit.
> */
> #define INIT_TSS { \
> - .esp0 = sizeof(init_stack) + (long)&init_stack, \
> + .esp0 = sizeof(init_stack) - 8 + (long)&init_stack, \
> .ss0 = __KERNEL_DS, \
> .ss1 = __KERNEL_CS, \
> .ldt = GDT_ENTRY_LDT, \
>
> But I don't think the init_stack can be
> abused on the sysenter path, so this is
> just a wild guess.

I suspect this is all due to CONFIG_TRAP_BAD_SYSCALL_EXITS taking the
debug trap.
