Re: [tip:x86/urgent] x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

From: Alexander van Heukelum
Date: Sat Apr 12 2014 - 19:37:50 EST


Hi,

> This is a writeup I did to a select audience before this was public:

I'd like to add an option d.2 for your consideration. Can you think of a
fundamental problem with it?

Greetings,
Alexander

> > Some workarounds I have considered:
> >
> > a. Using paging in a similar way to the 32-bit segment base workaround
> >
> > This one requires a very large swath of virtual user space (depending on
> > allocation policy, as much as 4 GiB per CPU.) The "per CPU" requirement
> > comes in as locking is not feasible -- as we return to user space there
> > is nowhere to release the lock.
> >
> > b. Return to user space via compatibility mode
> >
> > As the kernel lives above the 4 GiB virtual mark, a transition through
> > compatibility mode is not practical. This would require the kernel to
> > reserve virtual address space below the 4 GiB mark, which may interfere
> > with the application, especially an application launched as a 64-bit
> > application.
> >
> > c. Trampoline in kernel space
> >
> > A trampoline in kernel space is not feasible since all ring transition
> > instructions capable of returning to 16-bit mode require the use of the
> > stack.

"16-bit mode" -> "a mode with a 16-bit stack"

> > d. Trampoline in user space
> >
> > A return to the vdso with values set up in registers r8-r15 would enable
> > a trampoline in user space. Unfortunately there is no way
> > to do a far JMP entirely with register state so this would require
> > touching user space memory, possibly in an unsafe manner.

d.2. Trampoline in user space via long mode

Return from the kernel to a user space trampoline via long mode.
The kernel changes the stack frame just before executing the IRET
instruction: the CS and RIP slots are set to run the trampoline code,
where CS is a long mode segment. The trampoline code in user space
consists of a single instruction: a far jump to the final CS:EIP
(compatibility mode).

Because the IRET now returns to long mode, all registers are
restored in full. The stack cannot be used at this point, but the far
jump doesn't need a stack, and it will/should make the stack valid
immediately after it executes. The IRET enables interrupts, so the
far jump is in the interrupt shadow: it won't be seen unless it causes
an exception.
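
The frame rewrite described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not kernel code:
the struct mirrors the five slots IRET pops in 64-bit mode, while
far_target (a place for the final CS:EIP that an indirect far jump could
read), the function name and all parameter names are hypothetical. Where
such a slot could safely live is not settled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The five slots IRET pops in 64-bit mode, lowest address first. */
struct iret_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/* Hypothetical 16:32 far pointer (offset first, then selector) holding
 * the final CS:EIP that the trampoline's far jump would transfer to. */
struct far_target {
    uint32_t eip;
    uint16_t cs;
};

/* Save the original return target, then point the CS and RIP slots at
 * the long mode trampoline. RFLAGS, RSP and SS are left untouched. */
static void redirect_iret_to_trampoline(struct iret_frame *frame,
                                        struct far_target *target,
                                        uint64_t trampoline_rip,
                                        uint16_t long_mode_cs)
{
    assert(frame != NULL && target != NULL);

    target->eip = (uint32_t)frame->rip;  /* final EIP for the far jump */
    target->cs  = (uint16_t)frame->cs;   /* final CS for the far jump  */

    frame->rip = trampoline_rip;         /* IRET now lands on the trampoline */
    frame->cs  = long_mode_cs;           /* ...in a long mode code segment   */
}
```

A caller would fill an iret_frame with the original return state, call
redirect_iret_to_trampoline() with the trampoline address and a long mode
selector, and then let the normal IRET path run.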

> > The most likely variant is to use the address of the 16-bit user stack
> > and simply hope that this is a safe thing to do.
> >
> > This appears to be the most feasible workaround if a workaround is
> > deemed necessary.
> >
> > e. Transparently run 16-bit code segments inside a lightweight VMM

"16-bit code" -> "code with a 16-bit stack"

> > The complexity of this solution versus the realized value is staggering.
> > It also doesn't work on non-virtualization-capable hardware (including
> > running on top of a VMM which doesn't support nested virtualization.)
> >
> > -hpa
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/