Re: [PATCH v2 1/2] x86/entry/32: Fix entry_INT80_32 to expect interrupts to be on

From: Andy Lutomirski
Date: Sat Oct 17 2015 - 00:06:03 EST


On Oct 16, 2015 7:54 PM, "Brian Gerst" <brgerst@xxxxxxxxx> wrote:
>
> On Fri, Oct 16, 2015 at 6:42 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > When I rewrote entry_INT80_32, I thought that int80 was an interrupt
> > gate. It's a trap gate. *facepalm*
> >
> > Thanks to Brian Gerst for pointing out that it's better to change
> > the entry code than to change the gate type.
> >
> > Suggested-by: Brian Gerst <brgerst@xxxxxxxxx>
> > Reported-by: Borislav Petkov <bp@xxxxxxx>
> > Fixes: 150ac78d63af ("x86/entry/32: Switch INT80 to the new C syscall path")
> > Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> > ---
> > arch/x86/entry/common.c | 15 ++++++++++++---
> > arch/x86/entry/entry_32.S | 8 ++++----
> > arch/x86/entry/entry_64_compat.S | 2 +-
> > 3 files changed, 17 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> > index b53e04d301a3..09afb3b6acbb 100644
> > --- a/arch/x86/entry/common.c
> > +++ b/arch/x86/entry/common.c
> > @@ -351,7 +351,14 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)
> > * in workloads that use it, and it's usually called from
> > * do_fast_syscall_32, so forcibly inline it to improve performance.
> > */
> > -static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
> > +#ifdef CONFIG_X86_32
> > +/* 32-bit kernels use a trap gate for int80, and the asm code calls here. */
> > +__visible
> > +#else
> > +/* 64-bit kernels use do_syscall_32_irqs_off instead. */
> > +static
> > +#endif
> > +__always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
> > {
> > struct thread_info *ti = pt_regs_to_thread_info(regs);
> > unsigned int nr = (unsigned int)regs->orig_ax;
> > @@ -386,12 +393,14 @@ static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs)
> > syscall_return_slowpath(regs);
> > }
> >
> > -/* Handles int $0x80 */
> > -__visible void do_int80_syscall_32(struct pt_regs *regs)
> > +#ifdef CONFIG_X86_64
> > +/* Handles int $0x80 on 64-bit kernels */
> > +__visible void do_syscall_32_irqs_off(struct pt_regs *regs)
> > {
> > local_irq_enable();
> > do_syscall_32_irqs_on(regs);
> > }
> > +#endif
>
> This would be more readable if the STI were moved down into the asm
> for 64-bit. In fact, we should be re-enabling interrupts as early as
> possible once the full kernel environment is set up (on the process
> stack, NT clear, and after SWAPGS). What was your reasoning for
> moving it later?

On x86_64, we have to use an interrupt gate because of swapgs, and for
context tracking, once I clean up the SYSCALL64 entry, I want to make
it all the way to user_exit with IRQs off. There are nice
optimizations that become possible once user_exit is always called
with IRQs off, and there's another cleanup we can do when IRQs are no
longer possible in kernel mode with IRQs on.

For x86_32, we don't actually support context tracking, but we could.
So maybe we should suck up the ~3 cycles for int80 users and just use
an interrupt gate everywhere.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/