Re: [uml-devel] SYSCALL, ptrace and syscall restart breakages (Re:[RFC] weird crap with vdso on uml/i386)

From: Andrew Lutomirski
Date: Tue Aug 23 2011 - 12:12:10 EST


On Tue, Aug 23, 2011 at 12:03 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Aug 22, 2011 at 11:15 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> * it does SETREGS, setting eax to return value, eip to original return
>> address of syscall insn... and ebp to what it had in regs.bp.  I.e. the
>> damn arg6 value.
>
> Ok, I think that exhaustively explains that
>
>  (a) our system call restart has always worked correctly, and we're good.
>
>  (b) it's simply just UML that is buggy, and doesn't understand the
> subtleties about doing GETREGS at a system call.
>
> and I think that the correct and simple solution is to just teach UML
> to understand the proper logic of pt_regs during a system call (and a
> 'syscall' instruction in particular).
>
> The thing is, UML emulates 'syscall' the way the *CPU* does it, not
> the way *we* do it. That may make sense, but it's simply not correct.
>
> So I would vote very strongly against actually changing anything in
> arch/x86. This is very much an UML issue.
>
> Suggested fixes:
>
>  - instead of blindly doing SETREGS, just write the result registers
> individually like you suggested
>
> OR (and perhaps preferably):
>
>  - teach UML that when you do 'GETREGS' after a system call trapped,
> we have switched things around to match the "official system call
> order", and UML should just undo our swizzling, and do a "regs.ebp =
> regs.ecx" to make it be what the actual original registers were (or
> whatever the actual correct swizzle is - I didn't think that through
> very much).
>
> IOW, I think the core kernel does the right thing. Our argument
> register swizzling is odd, yes, but it's an implementation detail that
> something like uml should just have to take into account. No?
>
> Hmm?

Egads. Does this mean that doing GETREGS and then doing SETREGS later
on with the *exact same values* is considered incorrect? IMO, this
way lies madness.
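
To make the round-trip concern concrete, here is a toy C model (not
kernel or UML code) of the register view described above. It assumes
that at the syscall stop GETREGS reports arg2 in ecx and arg6 in ebp,
while userspace actually had arg2 in ebp. The struct and function
names are hypothetical, and the "regs.ebp = regs.ecx" fixup is the
one quoted above, which its author notes was not fully thought through.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only; NOT kernel or UML code. All names are hypothetical. */

/* What userspace holds when the vDSO executes SYSCALL (assumed layout). */
struct user_state {
    uint32_t ebp;   /* carries arg2, since SYSCALL clobbers ecx */
    uint32_t arg6;  /* the sixth argument, as seen by the kernel */
};

/* What a tracer's GETREGS reports at the syscall stop (assumed). */
struct traced_regs {
    uint32_t ecx;   /* arg2, moved here to match the official order */
    uint32_t ebp;   /* arg6 */
};

static struct traced_regs model_getregs(const struct user_state *u)
{
    struct traced_regs r = { .ecx = u->ebp, .ebp = u->arg6 };
    return r;
}

/* Naive round trip: SETREGS writes regs.ebp straight back to user ebp. */
static uint32_t naive_roundtrip_ebp(struct traced_regs r)
{
    return r.ebp;
}

/* Undo the swizzle first, i.e. the quoted "regs.ebp = regs.ecx". */
static uint32_t unswizzled_ebp(struct traced_regs r)
{
    r.ebp = r.ecx;
    return r.ebp;
}

static void model_selfcheck(void)
{
    struct user_state u = { .ebp = 0x2222, .arg6 = 0x6666 };
    struct traced_regs r = model_getregs(&u);

    assert(naive_roundtrip_ebp(r) == u.arg6);  /* user ebp now holds arg6 */
    assert(unswizzled_ebp(r) == u.ebp);        /* original ebp recovered */
}
```

Calling model_selfcheck() from a main() exercises both paths: in this
model, writing back exactly what GETREGS returned leaves arg6 in the
user's ebp, and only the explicit fixup restores the original value.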

In any case, this seems insanely overcomplicated. I'd be less afraid
of something like my approach (which, I think, makes all of the
SYSCALL weirdness pretty much transparent to ptrace users) or of just
removing SYSCALL entirely from 32-bit code.

--Andy