Re: [RFC] de-asmify the x86-64 system call slowpath

From: Andy Lutomirski
Date: Mon Jan 27 2014 - 17:06:39 EST


On 01/26/2014 11:42 PM, Al Viro wrote:
> On Sun, Jan 26, 2014 at 08:32:09PM -0800, Linus Torvalds wrote:
>> On Sun, Jan 26, 2014 at 4:22 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Umm... Can't uprobe_notify_resume() modify regs as well?
>>
>> Probably.
>>
>> .. and on the other hand, we should actually be able to use 'sysret'
>> for signal handling on x86-64, because while sysret destroys %rcx and
>> doesn't allow for returning to odd modes, for calling a signal handler
>> I don't think we really care..
>
> I'm afraid we might:
>
> * When user can change the frames always force IRET. That is because
> * it deals with uncanonical addresses better. SYSRET has trouble
> * with them due to bugs in both AMD and Intel CPUs.
>
> IIRC, that was about SYSRET with something unpleasant left in RCX, which
> comes from regs->ip, which is set to sa_handler by __setup_rt_frame().
> And we do not normalize or validate that - not in __setup_rt_frame() and
> not at sigaction(2) time. Something about GPF triggered and buggering
> attacker-chosen memory area? I don't remember details, but IIRC the
> conclusion had been "just don't go there"...
>
> Note that we can manipulate regs->ip and regs->sp, regardless of validation
> at sigaction(2) or __setup_rt_frame() - just have the sucker ptraced, send
> it a signal and it'll stop on delivery. Then tracer can use ptrace to modify
> registers and issue PTRACE_CONT with zero signal. Voila - regs->[is]p
> set to arbitrary values, no signal handlers triggered...
>

It's not just ip and sp -- cs matters here, too, I think.

(I may be the only one to have ever tried it, but it's possible to
far-call from 64-bit to 32-bit cs, and it works. I've never tried
switching cs using ptrace, but someone may want that to work, too.)

That being said, the last time I benchmarked it, sysret was *way* faster
than iret. So maybe the thing to do is to validate the registers on the
way out and, if they're appropriate for sysret, do the sysret.

I'm not quite sure how to express "I don't care about rcx" in pt_regs.
Maybe use the actual value that the CPU will stick in there (assuming
that anyone knows what this is).
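
To make the "validate on the way out" idea concrete, here is a rough,
untested sketch in plain C. Everything named fake_* or FAKE_* is
hypothetical and is not existing kernel code: the struct only mirrors the
pt_regs fields that matter, and the selector values are my assumption of
what the 64-bit user code and data selectors are. It relies on SYSRET
loading RIP from RCX and RFLAGS from R11, so one way to say "I don't care
about rcx" is to require that the saved rcx already equals the saved ip
(and likewise r11 and flags).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the pt_regs fields that matter here. */
struct fake_regs {
	uint64_t ip, cs, flags, sp, ss, cx, r11;
};

/* Assumed selector values for 64-bit user code and user data. */
#define FAKE_USER_CS 0x33
#define FAKE_USER_DS 0x2b

/* With 48-bit virtual addresses, bits 63..47 must all be equal. */
static bool fake_is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/*
 * Return true if the saved frame could be restored with SYSRET,
 * false if the slower IRET path is needed.
 */
static bool fake_can_use_sysret(const struct fake_regs *regs)
{
	/* SYSRET can only return to the flat 64-bit user segments. */
	if (regs->cs != FAKE_USER_CS || regs->ss != FAKE_USER_DS)
		return false;

	/* A non-canonical return address is the known trouble case. */
	if (!fake_is_canonical(regs->ip))
		return false;

	/* SYSRET takes RIP from RCX, so userspace will see rcx == ip. */
	if (regs->cx != regs->ip)
		return false;

	/* SYSRET takes RFLAGS from R11, so userspace will see r11 == flags. */
	if (regs->r11 != regs->flags)
		return false;

	return true;
}
```

With a check like this, a frame that ptrace or a far call has switched to a
32-bit cs fails the first test and falls back to iret, as does a frame
whose ip was set to a non-canonical value.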

--Andy