Re: [RFC 2/2] Drivers: hv: balloon: Disable balloon and hot-add accordingly

From: Boqun Feng
Date: Wed Feb 23 2022 - 21:45:04 EST


On Wed, Feb 23, 2022 at 04:55:25PM +0000, Michael Kelley (LINUX) wrote:
> From: Boqun Feng <boqun.feng@xxxxxxxxx> Sent: Wednesday, February 23, 2022 5:16 AM
> >
> > Currently there are known potential issues for balloon and hot-add on
> > ARM64:
> >
> > * Unballoon requests from Hyper-V should only unballoon ranges
> > that are guest page size aligned, otherwise guests cannot handle
> > them because it's impossible to partially free a page.
> >
> > * Memory hot-add requests from Hyper-V should provide the NUMA
> > node id of the added ranges or ARM64 should have a functional
> > memory_add_physaddr_to_nid(), otherwise the node id is missing
> > for add_memory().
> >
> > These issues require discussions on design and implementation. In the
> > meantime, post_status() is working and essential to guest monitoring.
> > Therefore, instead of disabling the entire hv_balloon driver, only
> > balloon and hot-add are disabled for now. Once the issues are fixed,
> > they can be re-enabled in these cases.
> >
> > Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> > ---
> > drivers/hv/hv_balloon.c | 14 ++++++++++++--
> > 1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> > index 062156b88a87..35dcda20be85 100644
> > --- a/drivers/hv/hv_balloon.c
> > +++ b/drivers/hv/hv_balloon.c
> > @@ -1730,9 +1730,19 @@ static int balloon_connect_vsp(struct hv_device *dev)
> > * When hibernation (i.e. virtual ACPI S4 state) is enabled, the host
> > * currently still requires the bits to be set, so we have to add code
> > * to fail the host's hot-add and balloon up/down requests, if any.
> > + *
> > + * We disable balloon if the page size is larger than 4k, since
> > + * currently it's unclear to us whether an unballoon request can make
> > + * sure all page ranges are guest page size aligned.
> > + *
> > + * We also disable hot add on ARM64, because we currently rely on
> > + * memory_add_physaddr_to_nid() to get a node id of a hot add range,
> > + * however ARM64's memory_add_physaddr_to_nid() always returns 0 and
> > + * DM_MEM_HOT_ADD_REQUEST doesn't have the NUMA node information for
> > + * add_memory().
> > */
> > - cap_msg.caps.cap_bits.balloon = 1;
> > - cap_msg.caps.cap_bits.hot_add = 1;
> > + cap_msg.caps.cap_bits.balloon = !(PAGE_SIZE > 4096UL);
>
> Any reasons not to use HV_HYP_PAGE_SIZE vs. open coding "4096"? So
>
> cap_msg.caps.cap_bits.balloon = (PAGE_SIZE == HV_HYP_PAGE_SIZE);
>

You're right. I will switch to HV_HYP_PAGE_SIZE in the next version.

> > + cap_msg.caps.cap_bits.hot_add = !IS_ENABLED(CONFIG_ARM64);
>
> I think we should output a message so that there's no mystery as to
> whether ballooning and/or hot_add are disabled, and why. Each setting
> should have its own message. Maybe something like:
>
> if (!cap_msg.caps.cap_bits.balloon)
> 	pr_info("Ballooning disabled because page size is not 4096 bytes\n");
>
> if (!cap_msg.caps.cap_bits.hot_add)
> 	pr_info("Memory hot add disabled on ARM64\n");
>

I agree with your suggestion. While I'm at it, though, I think it's
better to have helper functions that both check and print, so that
.balloon and .hot_add can simply take their return values. For example:

static int balloon_enabled(void)
{
	if (PAGE_SIZE != HV_HYP_PAGE_SIZE) {
		pr_info("Ballooning disabled because page size is not 4096 bytes\n");
		return 0;
	}

	return 1;
}

// in balloon_connect_vsp()

cap_msg.caps.cap_bits.balloon = balloon_enabled();

This way, the check and the message explaining it live in the same
function, which makes it easier to keep the two consistent.
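
To make the idea concrete for both bits, here is a rough userspace sketch
of the pattern, not the actual driver code. Everything in it is a
stand-in I made up for illustration: struct guest_config replaces
PAGE_SIZE and IS_ENABLED(CONFIG_ARM64), struct cap_bits replaces the
capability bits in cap_msg, HYP_PAGE_SIZE mirrors HV_HYP_PAGE_SIZE
(assumed to be 4096), hot_add_enabled() and compute_caps() are
hypothetical names, and printf() replaces pr_info().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: mirrors HV_HYP_PAGE_SIZE, taken to be 4096 bytes here. */
#define HYP_PAGE_SIZE 4096UL

/* Hypothetical stand-in for the kernel's build-time configuration. */
struct guest_config {
	unsigned long page_size;	/* stands in for PAGE_SIZE */
	bool is_arm64;			/* stands in for IS_ENABLED(CONFIG_ARM64) */
};

/* Hypothetical stand-in for the two capability bits discussed above. */
struct cap_bits {
	unsigned int balloon:1;
	unsigned int hot_add:1;
};

/* Check and report in one place: balloon needs guest page == host page. */
static bool balloon_enabled(const struct guest_config *cfg)
{
	assert(cfg->page_size > 0);

	if (cfg->page_size != HYP_PAGE_SIZE) {
		printf("Ballooning disabled because page size is not %lu bytes\n",
		       HYP_PAGE_SIZE);
		return false;
	}

	return true;
}

/* Same pattern for hot-add: disabled on ARM64, with the reason printed. */
static bool hot_add_enabled(const struct guest_config *cfg)
{
	if (cfg->is_arm64) {
		printf("Memory hot add disabled on ARM64\n");
		return false;
	}

	return true;
}

/* The caller only consumes the return values, as in the example above. */
static struct cap_bits compute_caps(const struct guest_config *cfg)
{
	struct cap_bits caps = {
		.balloon = balloon_enabled(cfg),
		.hot_add = hot_add_enabled(cfg),
	};

	return caps;
}
```

With this shape, an ARM64 guest with 4k pages would keep balloon and lose
only hot-add, and each disabled feature logs its own reason exactly once
at the point where the decision is made.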

Thoughts?

Regards,
Boqun

> >
> > /*
> > * Specify our alignment requirements as it relates
> > --
> > 2.35.1
>