Re: [RFC PATCH 2/3] rseq: extend struct rseq with per thread group vcpu id

From: Mathieu Desnoyers
Date: Tue Feb 01 2022 - 16:20:44 EST


----- On Feb 1, 2022, at 3:32 PM, Florian Weimer fw@xxxxxxxxxxxxx wrote:
[...]
>
>>> Is the switch really useful? I suspect it's faster to just write as
>>> much as possible all the time. The switch should be well-predictable
>>> if running uniform userspace, but still …
>>
>> The switch ensures the kernel doesn't try to write to a memory area beyond
>> the rseq size which has been registered by user-space. So it seems to be
>> useful to ensure we don't corrupt user-space memory. Or am I missing your
>> point?
>
> Due to the alignment, I think you'd only ever see 32 and 64 bytes for
> now?

Yes, but I would expect the rseq registration arguments to have an rseq_len
of offsetofend(struct rseq, tg_vcpu_id) when userspace wants the tg_vcpu_id
feature to be supported (but not the features that follow it).

Then, as we append additional features as follow-up fields, glibc can
eventually request them by increasing the requested size.
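
To make the rseq_len computation concrete, here is a minimal sketch. The
struct layout below is hypothetical (it is not the actual uapi struct rseq,
and node_id is only an assumed earlier extension field); offsetofend() is
written the same way as the kernel helper of that name.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past MEMBER, like the kernel's offsetofend(). */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical layout for illustration only, not the real uapi struct. */
struct rseq_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;	/* assumed earlier extension field */
	uint32_t tg_vcpu_id;	/* field discussed in this thread */
} __attribute__((aligned(32)));

/* Length userspace would pass to request features up to tg_vcpu_id. */
size_t rseq_len_for_tg_vcpu_id(void)
{
	return offsetofend(struct rseq_sketch, tg_vcpu_id);
}
```

With this layout the requested length ends at the last requested field and
is smaller than sizeof(struct rseq_sketch), which is padded up to the
32-byte alignment.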

Then it's kind of weird to receive a registration size which is not
aligned on 32 bytes, but then use internal knowledge of the structure
alignment in the kernel code to write beyond the requested size. And all
of this happens when we are returning to user-space after a preemption,
so I don't expect the extra switch/case to cause significant overhead.
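
As a rough user-space model of that guard (not the patch code itself, which
uses a switch on the registered size), the sketch below only stores a field
when it fits entirely within rseq_len. The struct layout and the function
name rseq_sketch_update() are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical layout for illustration only, not the real uapi struct. */
struct rseq_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;	/* assumed earlier extension field */
	uint32_t tg_vcpu_id;	/* field discussed in this thread */
};

/* Store one 32-bit value at byte offset off within the registered area. */
static void put_u32(unsigned char *area, size_t off, uint32_t val)
{
	memcpy(area + off, &val, sizeof(val));
}

/*
 * Update only the fields that fit within the rseq_len bytes registered
 * by user-space. Returns the end offset of the last field written.
 */
size_t rseq_sketch_update(unsigned char *area, size_t rseq_len,
			  uint32_t cpu, uint32_t node, uint32_t vcpu)
{
	size_t end = 0;

	if (rseq_len < offsetofend(struct rseq_sketch, cpu_id))
		return end;
	put_u32(area, offsetof(struct rseq_sketch, cpu_id_start), cpu);
	put_u32(area, offsetof(struct rseq_sketch, cpu_id), cpu);
	end = offsetofend(struct rseq_sketch, cpu_id);

	if (rseq_len < offsetofend(struct rseq_sketch, node_id))
		return end;
	put_u32(area, offsetof(struct rseq_sketch, node_id), node);
	end = offsetofend(struct rseq_sketch, node_id);

	if (rseq_len < offsetofend(struct rseq_sketch, tg_vcpu_id))
		return end;
	put_u32(area, offsetof(struct rseq_sketch, tg_vcpu_id), vcpu);
	return offsetofend(struct rseq_sketch, tg_vcpu_id);
}
```

The point of the design is that a registration which stops before
tg_vcpu_id leaves the bytes after the requested size untouched, whatever
the structure alignment happens to be.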

>
> I'd appreciate if you could put the maximum supported size and possibly
> the alignment in the auxiliary vector, so that we don't have to issue rseq
> system calls in a loop on process startup.

Yes, it's a good idea. I'm not too familiar with the auxiliary vector.
Are we talking about the kernel's

fs/binfmt_elf.c:fill_auxv_note()

?
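
For context on the user-space side, here is a sketch of how a process could
read such a value at startup with glibc's getauxval(). No rseq-specific
auxiliary vector entry is assumed to exist here: the key is passed in by the
caller, and the helper name auxv_or_default() is made up for this example.

```c
#include <sys/auxv.h>

/*
 * Return the auxiliary vector value for type, or fallback when the entry
 * is absent (getauxval() returns 0 for entries it cannot find).
 */
unsigned long auxv_or_default(unsigned long type, unsigned long fallback)
{
	unsigned long val = getauxval(type);

	return val ? val : fallback;
}
```

For example, auxv_or_default(AT_PAGESZ, 4096) reads an entry that already
exists today; a size or alignment entry for rseq would be read the same way
once a key is assigned, with the fallback covering older kernels.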

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com