Re: [PATCH 1/4] arm64/sve: Remove bitrotted comment about syscall behaviour

From: Dave Martin
Date: Tue Jan 23 2024 - 12:54:45 EST


On Tue, Jan 23, 2024 at 05:31:58PM +0000, Mark Brown wrote:
> On Tue, Jan 23, 2024 at 03:44:23PM +0000, Dave Martin wrote:
> > On Mon, Jan 22, 2024 at 08:41:51PM +0000, Mark Brown wrote:
> > > When we documented that we always clear state not shared with FPSIMD we
>
> > Where / when?
>
> In the document that is being modified when it was written.

Ah, right, I see this:

d09ee410a3c3 ("arm64/sve: Document our actual ABI for clearing registers on syscall")

where the zeroing is made explicit.

>
> > > -* In practice the affected registers/bits will be preserved or will be replaced
> > > - with zeros on return from a syscall, but userspace should not make
> > > - assumptions about this. The kernel behaviour may vary on a case-by-case
> > > - basis.
>
> > This was originally an intentionally conservative statement, to allow
> > the kernel the flexibility to relax the register zeroing behaviour in
> > the future. It would have permitted not always disabling a task's SVE
> > across a syscall, for example. There were some concerns about security
> > and testability that meant that we didn't use this flexibility to begin
> > with.
>
> > If we are making an irrevocable commitment not to use this flexibility
> > ever, then this comment can go, but if we're not totally sure then I
> > think it would be harmless to keep it (?)
>
> I think everyone except for Catalin had felt that the original
> discussion had concluded that there was a commitment to always clear the
> non-shared bits and was disappointed to learn that the documentation
> said otherwise. When I tried to take advantage of this as part of
> optimising the system call overhead for SVE there were eventually
> complaints.
>
> > (Feel free to point me to the relevant past discussion that I may have
> > missed.)
>
> See the discussion on my syscall optimisation series:
>
> https://lore.kernel.org/all/20220620124158.482039-8-broonie@xxxxxxxxxx/


I think my excuse would be that this was consciously left unresolved
when SVE originally went upstream: the kernel played safe by always
zeroing the bits, while userspace was told not to rely on this always
happening in future.

If the decision has effectively now been made to close the door
permanently on those optimisations, then I guess it makes sense to
clean up the documentation to be as consistent as possible.


I still feel that it is iffy practice for userspace to rely on the
extra bits being zeroed -- I think the architecture hides this
guarantee anyway whenever you go through a function call conforming to
the regular procedure call standard (including the syscall wrappers).
But there may not be a lot of point trying to put people off if we
can't force them not to rely on it.

Cheers
---Dave