Re: [PATCH v1 15/26] x86/sev: Introduce snp leaked pages list

From: Vlastimil Babka
Date: Wed Jan 10 2024 - 04:00:01 EST


On 1/9/24 23:19, Kalra, Ashish wrote:
> Hello Vlastimil,
>
> On 1/8/2024 4:45 AM, Vlastimil Babka wrote:
>> On 12/30/23 17:19, Michael Roth wrote:
>>> From: Ashish Kalra <ashish.kalra@xxxxxxx>
>>>
>>> Pages are unsafe to be released back to the page-allocator, if they
>>> have been transitioned to firmware/guest state and can't be reclaimed
>>> or transitioned back to hypervisor/shared state. In this case add
>>> them to an internal leaked pages list to ensure that they are not freed
>>> or touched/accessed to cause fatal page faults.
>>>
>>> Signed-off-by: Ashish Kalra <ashish.kalra@xxxxxxx>
>>> [mdr: relocate to arch/x86/virt/svm/sev.c]
>>> Signed-off-by: Michael Roth <michael.roth@xxxxxxx>
>> Hi, sorry I didn't respond in time to the last mail discussing previous
>> version in
>> https://lore.kernel.org/all/8c1fd8da-912a-a9ce-9547-107ba8a450fc@xxxxxxx/
>> due to upcoming holidays.
>>
>> I would rather avoid the approach of allocating container objects:
>> - it's allocating memory when effectively losing memory, a dangerous thing
>> - are all the callers and their context ok with GFP_KERNEL?
>> - GFP_KERNEL_ACCOUNT seems wrong, why would we be charging this to the
>> current process, it's probably not its fault the pages are leaked? Also the
>> charging can fail?
>> - given the benefit of having leaked pages on a list is basically just
>> debugging (i.e. crash dump or drgn inspection) this seems too heavy
>>
>> I think it would be better and sufficient to use page->lru for order-0 and
>> head pages, and simply skip tail pages (possibly with adjusted warning
>> message for that case).
>>
>> Vlastimil
>>
>> <snip
>
> Considering the above thoughts, this is updated version of
> snp_leak_pages(), looking forward to any review comments/feedback you
> have on the same:
>
> void snp_leak_pages(u64 pfn, unsigned int npages)
> {
>         struct page *page = pfn_to_page(pfn);
>
>         pr_debug("%s: leaking PFN range 0x%llx-0x%llx\n", __func__,
> pfn, pfn + npages);
>
>         spin_lock(&snp_leaked_pages_list_lock);
>         while (npages--) {
>                 /*
>                  * Reuse the page's buddy list for chaining into the leaked
>                  * pages list. This page should not be on a free list
> currently
>                  * and is also unsafe to be added to a free list.
>                  */
>                 if ((likely(!PageCompound(page))) || (PageCompound(page) &&
>                     !PageTail(page) && compound_head(page) == page))

This is unnecessarily paranoid wrt that compound_head(page) test, but OTOH
doesn't handle the weird case when we're leaking less than whole compound
page (if that can even happen). So I'd suggest:

while (npages) {

if ((likely(!PageCompound(page))) || (PageHead(page) && compound_nr(page)
<= npages))
list_add_tail(&page->buddy_list, ...)
}

... (no change from yours)

npages--;
}

(or an equivalent for()) perhaps

>                         /*
>                          * Skip inserting tail pages of compound page as
>                          * page->buddy_list of tail pages is not usable.
>                          */
>                         list_add_tail(&page->buddy_list,
> &snp_leaked_pages_list);
>                 sev_dump_rmpentry(pfn);
>                 snp_nr_leaked_pages++;
>                 pfn++;
>                 page++;
>         }
>         spin_unlock(&snp_leaked_pages_list_lock);
> }
>
> Thanks, Ashish
>