Re: tip -ENOBOOT - bisected to locking/refcounts, x86/asm: Implement fast refcount overflow protection

From: Kees Cook
Date: Thu Aug 31 2017 - 00:10:45 EST


On Wed, Aug 30, 2017 at 9:01 PM, Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> On Wed, Aug 30, 2017 at 8:12 PM, Mike Galbraith <efault@xxxxxx> wrote:
>> On Wed, 2017-08-30 at 19:27 -0700, Kees Cook wrote:
>>
>>> Interesting! Can you try with 633547973ffc3 ("net: convert
>>> sk_buff.users from atomic_t to refcount_t") reverted? I'll see if
>>> running haveged will help me trigger this on my system...
>>
>> With that (plus 230cd1279d001 fix to it) reverted, vbox boots.
>
> Wonderful! Thank you so much for helping track this down.
>
> So, it seems that sk_buff.users will need some more special attention
> before we can convert it to refcount.
>
> x86-refcount will saturate with refcount_dec_and_test() if the result
> is negative. But that would mean at least starting at 0. FULL should
> have WARNed in this case, so I remain slightly confused why it was
> missed by FULL.

Actually, if this is a race condition it's possible that FULL is slow
enough to miss it...

I bet something briefly takes the refcount negative, and with
unchecked atomics, it comes back up positive again during the race.
FULL may miss the race, and x86-refcount will catch it and saturate...
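
To make the theory concrete, here is a rough userspace sketch using C11
atomics. It is not the kernel code: every name in it is made up for
illustration, and it replays one fixed interleaving (a legitimate put, a
stray extra put, then two gets) rather than a real race. The saturation
value INT_MIN / 2 and the "result went negative" check are my reading of
how x86-refcount behaves, so treat them as assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed saturation value, modelled on x86-refcount. */
#define SATURATED (INT_MIN / 2)

/* Unchecked atomic_t-style helpers: no sanity checks at all. */
static bool unchecked_dec_and_test(atomic_int *r)
{
	/* atomic_fetch_sub() returns the old value; old == 1 means new == 0. */
	return atomic_fetch_sub(r, 1) == 1;
}

static void unchecked_inc(atomic_int *r)
{
	atomic_fetch_add(r, 1);
}

/* Saturating helpers: pin the counter once a result goes negative. */
static bool saturating_dec_and_test(atomic_int *r)
{
	int old = atomic_fetch_sub(r, 1);
	int new_val = (int)((unsigned int)old - 1u);

	if (new_val < 0) {
		atomic_store(r, SATURATED);
		return false;
	}
	return new_val == 0;
}

static void saturating_inc(atomic_int *r)
{
	int old = atomic_fetch_add(r, 1);
	int new_val = (int)((unsigned int)old + 1u);

	if (new_val < 0)
		atomic_store(r, SATURATED);
}

/* Replay with unchecked atomics: 1 -> 0 -> -1 -> 0 -> 1. */
int replay_unchecked(void)
{
	atomic_int r = 1;
	bool last = unchecked_dec_and_test(&r);	/* legitimate put: 1 -> 0 */

	assert(last);
	(void)unchecked_dec_and_test(&r);	/* stray put: 0 -> -1 */
	unchecked_inc(&r);			/* get: -1 -> 0 */
	unchecked_inc(&r);			/* get: 0 -> 1 */
	return atomic_load(&r);
}

/* Same replay with saturation: the stray put pins the counter. */
int replay_saturating(void)
{
	atomic_int r = 1;
	bool last = saturating_dec_and_test(&r);	/* legitimate put: 1 -> 0 */

	assert(last);
	(void)saturating_dec_and_test(&r);	/* stray put: saturates */
	saturating_inc(&r);			/* still negative, stays saturated */
	saturating_inc(&r);
	return atomic_load(&r);
}
```

With unchecked atomics the counter ends up back at 1 and nothing looks
wrong afterwards. With the saturating variant the brief dip below zero is
sticky, so the object can never be released through the normal path.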

-Kees

--
Kees Cook
Pixel Security