Re: [Xen-devel] [PATCH v3] xen/balloon: Mark unallocated host memory as UNUSABLE

From: Boris Ostrovsky
Date: Mon Nov 26 2018 - 11:25:58 EST


On 11/25/18 8:00 PM, Igor Druzhinin wrote:
> On 20/12/2017 14:05, Boris Ostrovsky wrote:
>> Commit f5775e0b6116 ("x86/xen: discard RAM regions above the maximum
>> reservation") left host memory not assigned to dom0 as available for
>> memory hotplug.
>>
>> Unfortunately this also meant that those regions could be used by
>> others. Specifically, commit fa564ad96366 ("x86/PCI: Enable a 64bit BAR
>> on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") may try to map those
>> addresses as MMIO.
>>
>> To prevent this mark unallocated host memory as E820_TYPE_UNUSABLE (thus
>> effectively reverting f5775e0b6116) and keep track of that region as
>> a hostmem resource that can be used for the hotplug.
>>
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
> This commit breaks Xen balloon memory hotplug for us in Dom0 with
> "hotplug_unpopulated" set to 1. The issue is that the common kernel
> memory onlining procedures require "System RAM" resource to be 1-st
> level. That means by inserting it under "Unusable memory" as the commit
> above does (intentionally or not) we make it 2-nd level and break memory
> onlining.

What do you mean by 1st and 2nd level?



>
> There are multiple ways to fix it depending on what the intention of
> the original commit was and what exactly it tried to work around. It
> seems it does several things at once:
> 1) Marks non-Dom0 host memory "Unusable memory" in resource tree.
> 2) Keeps track of all the areas safe for hotplug in Dom0
> 3) Changes the allocation algorithm itself in the balloon driver to use those areas

Pretty much. (3) is true in the sense that memory is first allocated
from hostmem_resource (which is non-dom0 RAM).


>
> Are all the things above necessary to cover the issue in fa564ad96366
> ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f,
> 60-7f)")?

Not anymore, as far as that particular commit is concerned, but that's
because of 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to
avoid conflict") which was introduced after the balloon patch. IIRC there
were some issues with fa564ad96366 unrelated to balloon.


>
> Can we remove "Unusable memory" resources as soon as we have finished
> booting? Is removing them on demand preferable to "shoot them all" in
> that case?

The concern is that in principle nothing prevents someone else from doing
the exact same thing fa564ad96366 did, which is to grab something from right
above the end of RAM as the kernel sees it. And that can be done at any point.


-boris

>
> Does it even make sense to remove the 1-st level only restriction in
> kernel/resource.c ?
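
For readers following the "1-st level" vs "2-nd level" point above, here is a
minimal user-space sketch of a resource tree in which a "System RAM" range
nested under "Unusable memory" is missed by a walk that visits only top-level
entries. This is an illustrative model, not the code in kernel/resource.c:
struct res, res_add_child(), res_next(), res_find(), hotplug_range_visible()
and all addresses are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down stand-in for a resource tree node. */
struct res {
    const char *name;
    unsigned long start, end;          /* inclusive range */
    struct res *parent, *sibling, *child;
};

/* Append child to the end of parent's child list. */
static void res_add_child(struct res *parent, struct res *child)
{
    struct res **pp;

    assert(parent && child);
    pp = &parent->child;
    while (*pp)
        pp = &(*pp)->sibling;
    *pp = child;
    child->parent = parent;
    child->sibling = NULL;
}

/* Next node in pre-order; with first_level_only, only siblings are visited. */
static struct res *res_next(struct res *p, bool first_level_only)
{
    if (first_level_only)
        return p->sibling;
    if (p->child)
        return p->child;
    while (!p->sibling && p->parent)
        p = p->parent;
    return p->sibling;
}

/* Find a resource with the given name that overlaps [start, end]. */
static struct res *res_find(struct res *root, const char *name,
                            unsigned long start, unsigned long end,
                            bool first_level_only)
{
    struct res *p;

    for (p = root->child; p; p = res_next(p, first_level_only))
        if (strcmp(p->name, name) == 0 && p->start <= end && p->end >= start)
            return p;
    return NULL;
}

/* Build an example tree (made-up addresses) and look up the hotplugged range. */
static bool hotplug_range_visible(bool first_level_only)
{
    struct res root     = { .name = "iomem",           .start = 0,            .end = ~0UL };
    struct res dom0     = { .name = "System RAM",      .start = 0x00000000UL, .end = 0x7fffffffUL };
    struct res unusable = { .name = "Unusable memory", .start = 0x80000000UL, .end = 0xffffffffUL };
    struct res hotplug  = { .name = "System RAM",      .start = 0x80000000UL, .end = 0x87ffffffUL };

    res_add_child(&root, &dom0);
    res_add_child(&root, &unusable);
    res_add_child(&unusable, &hotplug);   /* nested, i.e. "2-nd level" */

    return res_find(&root, "System RAM", hotplug.start, hotplug.end,
                    first_level_only) != NULL;
}
```

In this model, hotplug_range_visible(false) finds the nested range while
hotplug_range_visible(true) does not, which is the shape of the breakage
described in the quoted report.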