Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO

From: Andy Lutomirski
Date: Thu Sep 13 2018 - 15:36:01 EST




> On Sep 13, 2018, at 12:07 PM, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
>
> On 09/13/2018 05:22 PM, Andy Lutomirski wrote:
>>> On Sep 13, 2018, at 1:07 AM, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
>>>
>>> On 09/12/2018 07:11 PM, Andy Lutomirski wrote:
>>>>> The multiplexer interfaces need much more surgery and talking about futex,
>>>>> we'd need to sit down with quite some people and identify the things they
>>>>> actually care about before just splitting it up and keeping the existing
>>>>> overloaded trainwreck the same.
>>>>>
>>>> There's also the issue of how much the speedup matters. For futex, maybe a better interface saves 3ns, but a futex syscall is hundreds of ns. clock_gettime() is called at high frequency and can be ~25ns. Saving a few ns is a bigger deal.
>>>
>>> My concern is that the userspace system call wrappers currently do not know how many arguments the individual operations take and what types the arguments have (hence the "type-polymorphic" nature I mentioned). This could be a problem for on-stack argument passing (where you might read values beyond the end of the stack, and glibc avoids that most of the time by having enough cruft on the stack), and for architectures which pass pointers and integers in different registers (like some m68k ABIs do for the return value).
>
>> Isn't clock_gettime already special because of the vDSO entry point, though?
>
> Somewhat special, yes, but not overly so, and not in the type-polymorphic sense. We can't give direct access of the vDSO implementation to applications because the kernel does not know about the userspace errno variable. We do that for time on x86_64, where applications call into the vDSO directly, bypassing glibc completely after binding.

If the vDSO adds special helpers for CLOCK_MONOTONIC and CLOCK_REALTIME, I think we can reasonably safely promise that they never fail. (seccomp can obviously break that promise if there's no TSC, but I think that seccomp users who do that get to keep both pieces.)
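
To make the errno point concrete, here is a rough sketch of the translation a libc wrapper has to do today. It assumes the usual kernel convention of returning 0 or a negative error code. All names here (vdso_gettime_fn, wrapped_gettime, the fake_vdso_* stand-ins) are made up for illustration; this is not glibc or kernel code, and the clock id is a plain int just to keep the sketch portable.

```c
#include <errno.h>
#include <time.h>

/* Hypothetical stand-in for a vDSO entry point: returns 0 on success
 * or a negative error code, since the kernel side cannot set errno. */
typedef int (*vdso_gettime_fn)(int clock_id, struct timespec *ts);

/* What a libc-style wrapper has to add on top: map the negative return
 * value onto the userspace errno / -1 convention. */
int wrapped_gettime(vdso_gettime_fn fn, int clock_id, struct timespec *ts)
{
    int ret = fn(clock_id, ts);

    if (ret < 0) {
        errno = -ret;
        return -1;
    }
    return 0;
}

/* Fake entry point that always succeeds (illustration only). */
int fake_vdso_ok(int clock_id, struct timespec *ts)
{
    (void)clock_id;
    ts->tv_sec = 1;
    ts->tv_nsec = 0;
    return 0;
}

/* Fake entry point that always fails with -EINVAL (illustration only). */
int fake_vdso_einval(int clock_id, struct timespec *ts)
{
    (void)clock_id;
    (void)ts;
    return -EINVAL;
}
```

If a dedicated helper really never fails, the error branch in wrapped_gettime() is dead code for that helper, and an application could in principle call the entry point directly without any errno translation.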