Re: xgetbv nondeterminism

From: Andy Lutomirski
Date: Sat Jun 17 2017 - 12:32:54 EST


On Sat, Jun 17, 2017 at 5:51 AM, H.J. Lu <hjl.tools@xxxxxxxxx> wrote:
> On Fri, Jun 16, 2017 at 11:21 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>>>
>>>> In any event, I still don't understand the issue. The code does this,
>>>> effectively:
>>>>
>>>> PLT -> GOT
>>>> GOT points to a stub that transfers control to ld.so
>>>> ld.so resolves the symbol (_dl_fixup, I think)
>>>> ld.so patches the GOT
>>>> ld.so jumps to the resolved function
>>>>
>>>> As far as I can tell, the only part of the whole process that might
>>>> touch vector registers at all is elf_ifunc_invoke(). Couldn't all the
>>>> register saving and restoring be moved to elf_ifunc_invoke()?
>>>
>>> Please grep for FOREIGN_CALL the elf directory.
>>
>> I grepped FOREIGN_CALL. It has no explanation whatsoever and appears
>> to unconditionally do nothing in the current glibc version.
>>
>> In f3dcae82d54e5097e18e1d6ef4ff55c2ea4e621e^, in pseudocode, it does:
>>
>> __thread bool must_save;
>>
>> RTLD_CHECK_FOREIGN_CALL: return must_save;
>>
>> RTLD_ENABLE_FOREIGN_CALL: old_must_save = must_save; must_save = true;
>>
>> RTLD_PREPARE_FOREIGN_CALL: save_state(); must_save = false;
>>
>> RTLD_FINALIZE_FOREIGN_CALL: if (must_save) restore(); must_save = old_must_save;
>>
>> save_state() and restore_state() operate on TLS buffers.
>>
>> In summary: this is not async-signal-safe. It's also really messy --
>> there are macros that declare local variables, and the logic isn't
>> apparent without really digging in to all the code.
>>
>> I still don't see why this couldn't be:
>>
>> static void elf_do_foreign_stuff(args here)
>> {
>> void *buf = alloca(state_size);
>> xsaveopt(buf); /* or open-code it if you prefer */
>> call_the_ifunc();
>> xrstor(buf);
>> }
>
> As you have found out that it doesn't work this way since
>
> RTLD_PREPARE_FOREIGN_CALL
>
> and
>
> RTLD_FINALIZE_FOREIGN_CALL
>
> are used in 2 DIFFERENT files.
>

That's ought to be fixable, either by rearranging code or by doing
something like:

RTLD_INIT_FOREIGN_CALL(foreign_call_state);

_dl_whatever_helper(&foreign_call_state);

RTLD_FINALIZE_FOREIGN_CALL(foreign_call_state);

_dl_whatever_helper would do
RTLD_PREPARE_FOREIGN_CALL(ptr_to_foreign_call_state);

renaming these macros a bit might help, too.

>> If there's more than just the iifunc (malloc? profiling? printf?)
>> then all of that could be wrapped as well.
>
> It has nothing to do with ifunc.

What's it for, then? I don't understand why, in a sensible ld.so
architecture, there would ever be a call out from ld.so during runtime
binding to anything other than an ifunc, but I realize that glibc is
weird and ld.so might call out to libc.so for some reason. It doesn't
really matter, though.

>
>> All this stuff comes from:
>>
>> commit b48a267b8fbb885191a04cffdb4050a4d4c8a20b
>> Author: Ulrich Drepper <drepper@xxxxxxxxxx>
>> Date: Wed Jul 29 08:33:03 2009 -0700
>>
>> Preserve SSE registers in runtime relocations on x86-64.
>>
>> SSE registers are used for passing parameters and must be preserved
>> in runtime relocations. This is inside ld.so enforced through the
>> tests in tst-xmmymm.sh. But the malloc routines used after startup
>> come from libc.so and can be arbitrarily complex. It's overkill
>> to save the SSE registers all the time because of that. These calls
>> are rare. Instead we save them on demand. The new infrastructure
>> put in place in this patch makes this possible and efficient.
>>
>> While I think that the control flow is a giant mess and the use of TLS
>
> Yes.
>
>> was a mistake, I think Uli had the right idea: explicitly save the
>
> Yes.
>
>> extended state only when needed.
>>
>
> Only its implementation lead to race condition.

I'm suggesting that the races could be fixed without making the
save/restore unconditional.