Re: [RFC PATCH] asm/generic: introduce if_nospec and nospec_barrier

From: Dan Williams
Date: Thu Jan 04 2018 - 17:10:00 EST


On Thu, Jan 4, 2018 at 3:47 AM, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> Hi Dan,
>
> Thanks for these examples.
>
> On Thu, Jan 04, 2018 at 03:10:51AM +0000, Williams, Dan J wrote:
>> Note, the following is Elena's work, I'm just helping poke the upstream
>> discussion along while she's offline.
>>
>> Elena audited the static analysis reports down to the following
>> locations where userspace provides a value for a conditional branch and
>> then later creates a dependent load on that same userspace controlled
>> value.
>>
>> 'osb()', observable memory barrier, resolves to an lfence on x86. My
>> concern with these changes is that it is not clear what content within
>> the branch block is of concern. Peter's 'if_nospec' proposal combined
>> with Mark's 'nospec_load' would seem to clear that up, from a self
>> documenting code perspective, and let archs optionally protect entire
>> conditional blocks or individual loads within those blocks.
>
> I'm a little concerned by having to use two helpers that need to be paired. It
> means that we have to duplicate the bounds information, and I suspect in
> practice that they'll often be paired improperly.

Hmm, will they be often mispaired? All of the examples to date the
load occurs in the same code block as the bound checking if()
statement.

> I think that we can avoid those problems if we use nospec_ptr() on its own. It
> returns NULL if the pointer would be out-of-bounds, so we can use it in the
> if-statement.
>
> On x86, that can contain the bounds checks followed be an OSB / lfence, like
> if_nospec(). On arm we can use our architected sequence. In either case,
> nospec_ptr() returns NULL if the pointer would be out-of-bounds.
>
> Then, rather than sequences like:
>
> if_nospec(idx < max) {
> val = nospec_array_load(array, idx, max);
> ...
> }
>
> ... we could have:
>
> if ((elem_ptr = nospec_array_ptr(array, idx, max)) {
> val = *elem_ptr;
> ...
> }
>
> ... which also means we don't have to annotate every single load in the branch
> if the element is a structure with multiple fields.

We wouldn't need to annotate each load in that case, right? Just the
instance that uses idx to calculate an address.

if_nospec (idx < max_idx) {
elem_ptr = nospec_array_ptr(array, idx, max);
val = elem_ptr->val;
val2 = elem_ptr->val2;
}

> Does that sound workable to you?

Relative to the urgency of getting fixes upstream I'm ok with whatever
approach generates the least debate from sub-system maintainers.

One concern is that on x86 the:

if ((elem_ptr = nospec_array_ptr(array, idx, max)) {

...approach produces two conditional branches whereas:

if_nospec (idx < max_idx) {
elem_ptr = nospec_array_ptr(array, idx, max);

....can be optimized to one conditional with the barrier.

Is that convincing enough to proceed with if_nospec + nospec_* combination?