Re: [PATCH V2] x86/sgx: Fix free page accounting

From: Luck, Tony
Date: Wed Nov 10 2021 - 22:26:58 EST


On Thu, Nov 11, 2021 at 04:55:14AM +0200, Jarkko Sakkinen wrote:
> On Wed, 2021-11-10 at 10:51 -0800, Reinette Chatre wrote:
> > sgx_should_reclaim() would only succeed when sgx_nr_free_pages goes
> > below the watermark. Once sgx_nr_free_pages becomes corrupted there is
> > no clear way in which it can correct itself since it is only ever
> > incremented or decremented.
>
> So one scenario would be:
>
> 1. CPU A does a READ of sgx_nr_free_pages.
> 2. CPU B does a READ of sgx_nr_free_pages.
> 3. CPU A does a STORE of sgx_nr_free_pages.
> 4. CPU B does a STORE of sgx_nr_free_pages.
>
> ?
>
> That does corrupt the value, yes, but I don't see anything like this
> in the commit message, so I'll have to check.
>
> I think the commit message is lacking a concurrency scenario, and the
> current transcripts are a bit useless.

What about this part:

With sgx_nr_free_pages accessed and modified from a few places
it is essential to ensure that these accesses are done safely but
this is not the case. sgx_nr_free_pages is read without any
protection and updated with inconsistent protection by any one
of the spin locks associated with the individual NUMA nodes.
For example:

      CPU_A                                 CPU_B
      -----                                 -----
 spin_lock(&nodeA->lock);              spin_lock(&nodeB->lock);
 ...                                   ...
 sgx_nr_free_pages--;  /* NOT SAFE */  sgx_nr_free_pages--;

 spin_unlock(&nodeA->lock);            spin_unlock(&nodeB->lock);

Maybe you missed the "NOT SAFE" hidden in the middle of
the picture?
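
For anyone who wants to poke at the pattern outside the kernel, here is a
rough userspace sketch (C11 plus pthreads, not kernel code). Everything in
it is an assumption made for illustration: struct node, worker(),
run_demo() and the iteration count are made-up names and values, and the
atomic counter is only one possible way to make the update safe, not
necessarily what the patch does. Each thread takes its own lock, just like
nodeA->lock and nodeB->lock above, and decrements two shared counters: one
as a separate READ and STORE, one as a single atomic read-modify-write.

```c
/* Userspace model of the race; a sketch, not kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NR_DECREMENTS 200000L	/* per thread; arbitrary value for the demo */

/* Two stand-ins for sgx_nr_free_pages. */
static atomic_long racy_free_pages;	/* updated as READ, then STORE */
static atomic_long atomic_free_pages;	/* updated with one atomic RMW */

/* Hypothetical stand-in for a per-NUMA-node structure with its own lock. */
struct node {
	pthread_mutex_t lock;
};

static void *worker(void *arg)
{
	struct node *node = arg;

	for (long i = 0; i < NR_DECREMENTS; i++) {
		/* Each thread holds a *different* lock, so they do not exclude each other. */
		pthread_mutex_lock(&node->lock);

		/* NOT SAFE: the other thread can store between our load and our store. */
		long v = atomic_load_explicit(&racy_free_pages, memory_order_relaxed);
		atomic_store_explicit(&racy_free_pages, v - 1, memory_order_relaxed);

		/* Safe: a single indivisible read-modify-write. */
		atomic_fetch_sub(&atomic_free_pages, 1);

		pthread_mutex_unlock(&node->lock);
	}
	return NULL;
}

/* Start both counters at 2 * NR_DECREMENTS and report what is left afterwards. */
void run_demo(long *racy_left, long *atomic_left)
{
	struct node node_a, node_b;
	pthread_t a, b;

	pthread_mutex_init(&node_a.lock, NULL);
	pthread_mutex_init(&node_b.lock, NULL);
	atomic_store(&racy_free_pages, 2 * NR_DECREMENTS);
	atomic_store(&atomic_free_pages, 2 * NR_DECREMENTS);

	assert(pthread_create(&a, NULL, worker, &node_a) == 0);
	assert(pthread_create(&b, NULL, worker, &node_b) == 0);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	*racy_left = atomic_load(&racy_free_pages);
	*atomic_left = atomic_load(&atomic_free_pages);

	pthread_mutex_destroy(&node_a.lock);
	pthread_mutex_destroy(&node_b.lock);
}
```

Calling run_demo() from a small main() (build with -pthread) should always
leave the atomic counter at exactly zero. The READ/STORE counter can end up
above zero whenever a decrement is lost, and once that happens nothing ever
brings it back in line, which is the "no clear way in which it can correct
itself" problem described above. Whether a given run actually loses updates
depends on timing.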

-Tony