Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when pti_disable is set

From: Willy Tarreau
Date: Wed Jan 10 2018 - 04:12:22 EST


On Wed, Jan 10, 2018 at 09:22:07AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 09, 2018 at 01:56:20PM +0100, Willy Tarreau wrote:
> > - use pti_disable instead of task flag
> > ---
> > arch/x86/entry/calling.h | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
> > index 2c0d3b5..5361a10 100644
> > --- a/arch/x86/entry/calling.h
> > +++ b/arch/x86/entry/calling.h
> > @@ -229,6 +229,11 @@
> >
> > .macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
> > ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
> > +
> > + /* The "pti_disable" mm attribute is mirrored into this per-cpu var */
> > + cmpb $0, PER_CPU_VAR(pti_disable)
> > + jne .Lend_\@
> > +
> > mov %cr3, \scratch_reg
>
> So could you switch back to a task flag for this? That word is already
> cache-hot on the exit path while your new variable is not.

That's a good point. There have already been some requests for a
per-thread setting.

What I can propose then is to partially revert the changes so that we have this:

- arch_prctl() adjusts the task flag instead of a per-mm variable
(Linus, are you OK with this?)

- arch_prctl() only performs the action if mm->mm_users == 1, so that
we don't change the setting after threads have been created; this
way the task flag is replicated to all future threads;

- later we may decide to permit re-enabling PTI per thread if it was
disabled.
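
To make the mm_users rule above concrete, here is a rough user-space model,
not kernel code. Every name in it (model_mm, model_task,
model_set_pti_disable, model_clone_thread) is made up for illustration, and
the -EBUSY return value is only an assumption; the proposal does not say
which error arch_prctl() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model types; these are not real kernel structures. */
struct model_mm {
    int mm_users;               /* number of tasks sharing this mm */
};

struct model_task {
    struct model_mm *mm;
    bool pti_disabled;          /* stands in for the per-task flag */
};

/*
 * Models the proposed arch_prctl() rule: the flag may only be changed
 * while the mm has a single user, i.e. before any thread was created.
 */
static int model_set_pti_disable(struct model_task *t, bool disable)
{
    assert(t && t->mm);
    if (t->mm->mm_users != 1)
        return -EBUSY;          /* error code is an assumption */
    t->pti_disabled = disable;
    return 0;
}

/* Models thread creation: the new thread shares the mm and inherits the flag. */
static struct model_task model_clone_thread(struct model_task *parent)
{
    struct model_task child = {
        .mm = parent->mm,
        .pti_disabled = parent->pti_disabled,
    };

    assert(parent->mm);
    parent->mm->mm_users++;
    return child;
}
```

In this model, once model_clone_thread() has run the setting is frozen, and
every thread created afterwards starts with the same value as its parent.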

If we agree on this, I'd like to propose two flags:

- TIF_DISABLE_PTI_NOW : disable PTI for the current task, reset by execve()
- TIF_DISABLE_PTI_NEXT : disable PTI after execve(), reset by execve()

execve() would then simply do:

TIF_DISABLE_PTI_NOW = TIF_DISABLE_PTI_NEXT;
TIF_DISABLE_PTI_NEXT = 0;

The former would be used by applications that apply their own configuration.
The latter would be used by wrappers. This way we seem to cover the various
use cases. We would also make this depend on a sysctl which is disabled by
default and which lets the admin globally and permanently disable the
feature.
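
As a sketch of how the two flags and the sysctl could interact, here is a
small user-space model. The struct and function names are hypothetical, and
checking the sysctl at the point where the flag is consumed is only one
possible reading of the proposal; it could just as well be checked when the
flag is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the fields mirror the flag names proposed above. */
struct model_flags {
    bool disable_pti_now;       /* stands in for TIF_DISABLE_PTI_NOW  */
    bool disable_pti_next;      /* stands in for TIF_DISABLE_PTI_NEXT */
};

/* Hand-off done at execve(): NEXT becomes NOW, then NEXT is cleared. */
static void model_execve(struct model_flags *f)
{
    assert(f);
    f->disable_pti_now = f->disable_pti_next;
    f->disable_pti_next = false;
}

/*
 * PTI is skipped only when the admin sysctl allows it and NOW is set.
 * Where exactly the sysctl is checked is an assumption of this sketch.
 */
static bool model_pti_skipped(const struct model_flags *f, bool sysctl_allows)
{
    assert(f);
    return sysctl_allows && f->disable_pti_now;
}
```

A wrapper would set only the NEXT flag and then exec the real program; a
second execve() from that program brings both flags back to zero.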

Any objections?

Regards,
Willy