Re: [PATCH] riscv: Define TASK_SIZE_MAX for __access_ok()

From: Mark Rutland
Date: Mon Mar 25 2024 - 13:37:57 EST


On Mon, Mar 25, 2024 at 08:30:37AM +0100, Alexandre Ghiti wrote:
> Hi David,
>
> On 24/03/2024 20:42, David Laight wrote:
> > ...
> > > The use of alternatives allows to return right away if the buffer is
> > > beyond the usable user address space, and it's not just "slightly
> > > faster" for some cases (a very large buffer with only a few bytes being
> > > beyond the limit or someone could fault-in all the user pages and fail
> > > very late...etc). access_ok() is here to guarantee that such situations
> > > don't happen, so actually it makes more sense to use an alternative to
> > > avoid that.
> > Is it really worth doing ANY optimisations for the -EFAULT path?
> > They really don't happen.
> >
> > The only fault path that matters is the one that has to page in
> > data from somewhere.
>
> Which is completely avoided with a strict definition of access_ok(). I see
> access_ok() as an already existing optimization of fault paths by avoiding
> them entirely when they are bound to happen.

I think the point that David is making is that address+size pairs that'd fail
access_ok() *should* be rare, and hence it's a better trade-off to occasionally
handle faults for those if it makes the common case of successful access_ok()
smaller or faster. For any well-behaved userspace applications, access_ok()
should practically never fail, since userspace should be passing good
address+size pairs as arguments to syscalls.

Using a compile-time constant TASK_SIZE_MAX allows the compiler to generate
much better code for access_ok(), and on arm64 we use a compile-time constant
even when our page table depth can change at runtime (and when native/compat
task sizes differ). The only abosolute boundary that needs to be maintained is
that access_ok() fails for kernel addresses.

Mark.