Re: [RFC PATCH for 4.21 01/16] rseq/selftests: Add reference counter to coexist with glibc

From: Szabolcs Nagy
Date: Thu Oct 11 2018 - 12:23:07 EST


On 11/10/18 16:13, Mathieu Desnoyers wrote:
> ----- On Oct 11, 2018, at 6:37 AM, Szabolcs Nagy Szabolcs.Nagy@xxxxxxx wrote:
>
>> On 10/10/18 20:19, Mathieu Desnoyers wrote:
>>> In order to integrate rseq into user-space applications, add a reference
>>> counter field after the struct rseq TLS ABI so that multiple rseq users can
>>> be linked into the same application (e.g. librseq and glibc). The
>>> reference count ensures that rseq syscall registration/unregistration
>>> happens only for the earliest/latest user for each thread, thus ensuring
>>> that rseq is registered across the lifetime of all rseq users for a
>>> given thread.
>> ...
>>> +__attribute__((visibility("hidden"))) __thread
>>> +volatile struct libc_rseq __lib_rseq_abi = {
>> ...
>>> +extern __attribute__((weak, alias("__lib_rseq_abi"))) __thread
>>> +volatile struct rseq __rseq_abi;
>> ...
>>> @@ -70,7 +86,7 @@ int rseq_register_current_thread(void)
>>>  	sigset_t oldset;
>>>
>>>  	signal_off_save(&oldset);
>>> -	if (refcount++)
>>> +	if (__lib_rseq_abi.refcount++)
>>>  		goto end;
>>>  	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq), 0, RSEQ_SIG);
>>
>> why do you use a local refcounter instead of the __rseq_abi one?
>
> There is no refcount in struct rseq (the ABI between kernel and user-space).
> The registration refcount was part of an earlier version of the rseq system call,
> but we decided against keeping it in the kernel.
>
> So I'm adding one _after_ struct rseq, purely to allow interaction between
> various user-space components (program/libraries).

then all those components must use the same

rseq_register_current_thread
rseq_unregister_current_thread

functions and not call the syscall on their own.

in which case the refcount could be a static __thread variable.

but it's in a magic struct that's called "abi" which is confusing,
the counter is not abi, it's in a hidden object.
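
a minimal sketch of the static __thread variant, assuming every component
really does go through the same pair of wrappers. the sketch_* and stub_*
names are made up for illustration, the stubs only count calls and stand in
for the real rseq syscall, and the signal masking from the patch is left out:

```c
#include <stdint.h>

/* hypothetical stand-ins for the real syscall wrapper: they only count calls */
static __thread int stub_register_calls;
static __thread int stub_unregister_calls;

static int stub_sys_rseq_register(void)
{
	stub_register_calls++;
	return 0;
}

static int stub_sys_rseq_unregister(void)
{
	stub_unregister_calls++;
	return 0;
}

/* per-thread counter, private to this file, not part of any "abi" object */
static __thread uint32_t sketch_refcount;

int sketch_rseq_register_current_thread(void)
{
	if (sketch_refcount++)
		return 0;	/* already registered on this thread */
	return stub_sys_rseq_register();
}

int sketch_rseq_unregister_current_thread(void)
{
	if (sketch_refcount == 0)
		return -1;	/* unbalanced unregister */
	if (--sketch_refcount)
		return 0;	/* other users remain on this thread */
	return stub_sys_rseq_unregister();
}
```

two register calls followed by two unregister calls on one thread reach each
stub exactly once. this only holds if all users share one copy of these
functions.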

>> what prevents calling rseq_register_current_thread more than 4G times?
>
> Nothing. It would indeed be cleaner to error out if we detect that refcount is at
> INT_MAX. Is that what you have in mind ?

yes
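
something along these lines, as a sketch only. the guarded_* names are
hypothetical, the counter is a plain int so that INT_MAX is the limit, and
EOVERFLOW is just one plausible choice of error code, not something the patch
specifies:

```c
#include <errno.h>
#include <limits.h>

static __thread int guarded_refcount;	/* hypothetical per-thread counter */
static __thread int guarded_syscalls;	/* counts stubbed registrations */

/* returns 0 on success, -1 with errno set when the counter is saturated */
int guarded_register(void)
{
	if (guarded_refcount == INT_MAX) {
		errno = EOVERFLOW;	/* assumed error code */
		return -1;
	}
	if (guarded_refcount++)
		return 0;	/* already registered on this thread */
	guarded_syscalls++;	/* the real code would call the rseq syscall here */
	return 0;
}
```

the check happens before the increment, so the counter never wraps and a
failed call leaves it unchanged.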

>> why can't the kernel see that the same address is registered again and succeed?
>
> It can, and it does. However, refcounting at user-level is needed to ensure
> the registration "lifetime" for rseq covers its entire use. If we have two libraries
> using rseq, we end up with the following scenario:
>
> Thread 1
>
> libA registers rseq
> libB registers rseq
> libB unregisters rseq
> libA uses rseq -> bug! it's been unregistered by libB.
> libA unregisters rseq -> unexpected, it's already been unregistered.
>
> The same applies if libA unregisters rseq before libB (and libB tries to use
> rseq after libA has unregistered).
>
> The refcount in user-space fixes this.

i see.
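
the scenario above can be modelled in a few lines. this is a simplified
sketch: a single per-thread flag stands in for the kernel's registration
state, and all model_*, naive_*, counted_* and liba_* names are made up for
illustration:

```c
#include <stdbool.h>

/* models whether the kernel has rseq registered for this thread */
static __thread bool model_registered;

/* naive: every library talks to the "kernel" directly */
static void naive_register(void)   { model_registered = true; }
static void naive_unregister(void) { model_registered = false; }

/* refcounted: only the first register and last unregister reach the "kernel" */
static __thread unsigned int model_refcount;

static void counted_register(void)
{
	if (model_refcount++ == 0)
		model_registered = true;
}

static void counted_unregister(void)
{
	if (model_refcount > 0 && --model_refcount == 0)
		model_registered = false;
}

/* libA registers, libB registers, libB unregisters: is libA still covered? */
bool liba_covered_naive(void)
{
	model_registered = false;
	naive_register();	/* libA */
	naive_register();	/* libB */
	naive_unregister();	/* libB */
	return model_registered;
}

bool liba_covered_counted(void)
{
	model_registered = false;
	model_refcount = 0;
	counted_register();	/* libA */
	counted_register();	/* libB */
	counted_unregister();	/* libB */
	return model_registered;
}
```

in the naive version libA is left without a registration once libB
unregisters. with the counter, the registration stays until libA
unregisters as well.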

> Thoughts ?
>
> Thanks,
>
> Mathieu
>