Re: Dynamic nop selection breaks boot on Geode LX

From: H. Peter Anvin
Date: Mon Oct 04 2010 - 18:23:27 EST


On 10/04/2010 03:15 PM, Steven Rostedt wrote:
>>
>> We tried exactly this type of dynamic selection before, and it doesn't
>> work on broken virtualizers; in particular Microsoft VirtualPC can pass
>> the exception test and yet fail later.
>
> So the code is broken because of broken virtualizers??
>

Yup. Fun, isn't it? :( Unfortunately, broken virtualizers appear as
broken CPUs to us. We used to do the #UD probe for NOPL, but it didn't
work.

>>
>> The end result is very simple: you can always use NOPL on 64 bits, you
>> can never use NOPL on 32 bits.
>>
>> 66 66 66 66 90 will always *work* (as in, it will never fail) but it's
>> pretty slow on older CPUs which took a hit handling prefixes -- but it
>> might still be faster than a jump on those. Thus, in your code the JMP
>> case will never be reached anyway.
>
> The jmp was there because of paranoia, and I never expected it to be
> reached.
>
>>
>> There isn't, of course, a classic 5-byte sequence, although the sequence:
>>
>> 3E 8D 74 26 00
>>
>> ... should work (leal %ds:0(,%esi,1),%esi). However, 66 ... 90 is
>> likely to work better on modern processors (although I haven't measured it.)
>
> The point is, this nop will be at _every_ function call (it replaces the
> mcount call). Not just scattered throughout the kernel. It is imperative
> that we have the best nop available.
>
> So what would you recommend?
>

NOPL is special, because it's the only NOP sequence that isn't actually
*supported* on all processors (and we have found that we can't even use
it on 32 bits, even though the vast majority of all real-life 32-bit
processors do support it.)

Borislav is just checking to see if we can just use NOPL unconditionally
on 64 bits; as far as 32 bits is concerned the only option for picking
what is "best" is probably to benchmark some set of sequences on the set
of processors we care about. However, I suspect that on any modern
processors either 66 66 66 66 90 or 3E 8D 74 26 00 will work equally well.

With a bit of benchmarking I think we could adopt the policy of using
NOPL on 64 bits and one of the above sequences on 32 bits.
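
As a rough sketch of that policy (this is not kernel code; pick_nop5()
and the array names are made up for illustration, the 32-bit side
arbitrarily uses the prefixed NOP pending benchmarks, and NOPL being
safe on all 64-bit CPUs is the assumption still being checked):

```c
#include <stdbool.h>

#define NOP5_LEN 5

/* 5-byte NOPL (0F 1F /0 with a SIB byte and disp8); assumed usable on all 64-bit CPUs. */
static const unsigned char nop5_nopl[NOP5_LEN] = {
	0x0f, 0x1f, 0x44, 0x00, 0x00
};

/* Plain NOP (90) with four operand-size prefixes; always decodes, may be slow on old CPUs. */
static const unsigned char nop5_prefixed[NOP5_LEN] = {
	0x66, 0x66, 0x66, 0x66, 0x90
};

/*
 * Hypothetical helper: choose the 5-byte NOP purely by word size,
 * with no runtime #UD probe, since the probe cannot be trusted
 * under broken virtualizers.
 */
static const unsigned char *pick_nop5(bool is_64bit)
{
	return is_64bit ? nop5_nopl : nop5_prefixed;
}
```

The point of keying on word size alone is that the choice can be made
at build time, so there is nothing left to get wrong at boot.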

-hpa

