RE: [RFC 2/2] Drivers: hv: balloon: Disable balloon and hot-add accordingly

From: Michael Kelley (LINUX)
Date: Wed Feb 23 2022 - 23:44:45 EST


From: Boqun Feng <boqun.feng@xxxxxxxxx> Sent: Wednesday, February 23, 2022 6:44 PM
>
> On Wed, Feb 23, 2022 at 04:55:25PM +0000, Michael Kelley (LINUX) wrote:
> > From: Boqun Feng <boqun.feng@xxxxxxxxx> Sent: Wednesday, February 23, 2022 5:16 AM
> > >
> > > Currently there are known potential issues for balloon and hot-add on
> > > ARM64:
> > >
> > > * Unballoon requests from Hyper-V should only unballoon ranges
> > > that are guest page size aligned, otherwise guests cannot handle
> > > them, because it's impossible to partially free a page.
> > >
> > > * Memory hot-add requests from Hyper-V should provide the NUMA
> > > node id of the added ranges or ARM64 should have a functional
> > > memory_add_physaddr_to_nid(), otherwise the node id is missing
> > > for add_memory().
> > >
> > > These issues require discussions on design and implementation. In the
> > > meantime, post_status() is working and essential to guest monitoring.
> > > Therefore, instead of disabling the entire hv_balloon driver, only
> > > balloon and hot-add are disabled for now. Once the issues are fixed,
> > > they can be re-enabled in these cases.
> > >
> > > Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> > > ---
> > > drivers/hv/hv_balloon.c | 14 ++++++++++++--
> > > 1 file changed, 12 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> > > index 062156b88a87..35dcda20be85 100644
> > > --- a/drivers/hv/hv_balloon.c
> > > +++ b/drivers/hv/hv_balloon.c
> > > @@ -1730,9 +1730,19 @@ static int balloon_connect_vsp(struct hv_device *dev)
> > > * When hibernation (i.e. virtual ACPI S4 state) is enabled, the host
> > > * currently still requires the bits to be set, so we have to add code
> > > * to fail the host's hot-add and balloon up/down requests, if any.
> > > + *
> > > + * We disable balloon if the page size is larger than 4k, since
> > > + * currently it's unclear to us whether an unballoon request can make
> > > + * sure all page ranges are guest page size aligned.
> > > + *
> > > + * We also disable hot add on ARM64, because we currently rely on
> > > + * memory_add_physaddr_to_nid() to get a node id of a hot add range,
> > > + * however ARM64's memory_add_physaddr_to_nid() always returns 0 and
> > > + * DM_MEM_HOT_ADD_REQUEST doesn't have the NUMA node information for
> > > + * add_memory().
> > > */
> > > - cap_msg.caps.cap_bits.balloon = 1;
> > > - cap_msg.caps.cap_bits.hot_add = 1;
> > > + cap_msg.caps.cap_bits.balloon = !(PAGE_SIZE > 4096UL);
> >
> > Any reasons not to use HV_HYP_PAGE_SIZE vs. open coding "4096"? So
> >
> > cap_msg.caps.cap_bits.balloon = (PAGE_SIZE == HV_HYP_PAGE_SIZE);
> >
>
> You're right. I will change it to that in the next version.
>
> > > + cap_msg.caps.cap_bits.hot_add = !IS_ENABLED(CONFIG_ARM64);
> >
> > I think we should output a message so that there's no mystery as to
> > whether ballooning and/or hot_add are disabled, and why. Each setting
> > should have its own message. Maybe something like:
> >
> > if (!cap_msg.caps.cap_bits.balloon)
> > 	pr_info("Ballooning disabled because page size is not 4096 bytes\n");
> >
> > if (!cap_msg.caps.cap_bits.hot_add)
> > 	pr_info("Memory hot add disabled on ARM64\n");
> >
>
> I agree with your suggestion, however, while I'm at it, I think it's
> better that we have functions that check and print, and .balloon and
> .hot_add can rely on the return value, for example:
>
> static int balloon_enabled(void)
> {
> 	if (PAGE_SIZE != HV_HYP_PAGE_SIZE) {
> 		pr_info("Ballooning disabled because page size is not 4096 bytes\n");
> 		return 0;
> 	}
>
> 	return 1;
> }
>
> // in balloon_connect_vsp()
>
> cap_msg.caps.cap_bits.balloon = balloon_enabled();
>
> In this way, we keep the checking and reason printing in the same
> function and it's easier to maintain the consistency.
>
> Thoughts?

Yes, that approach looks good to me.
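
For illustration only, here is a rough userspace sketch of how the
check-and-print pattern could cover both capability bits. It is not
the code from the patch: hot_add_enabled() is a hypothetical name, the
helpers take their inputs as parameters (where the kernel would use
PAGE_SIZE and IS_ENABLED(CONFIG_ARM64)) so they can be exercised
outside the kernel, and pr_info() is replaced by a stand-in macro.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the kernel's HV_HYP_PAGE_SIZE, which is 4096 bytes. */
#define HV_HYP_PAGE_SIZE 4096UL

/* Userspace stand-in for the kernel's pr_info(). */
#define pr_info(...) fprintf(stderr, __VA_ARGS__)

/*
 * Ballooning is only offered when the guest page size matches the
 * Hyper-V page size. The reason is printed where the check is made.
 * Assumption: the kernel version would test PAGE_SIZE directly
 * instead of taking a parameter.
 */
static bool balloon_enabled(unsigned long guest_page_size)
{
	if (guest_page_size != HV_HYP_PAGE_SIZE) {
		pr_info("Ballooning disabled because page size is not %lu bytes\n",
			HV_HYP_PAGE_SIZE);
		return false;
	}

	return true;
}

/*
 * Hypothetical counterpart for hot-add. Assumption: the kernel
 * version would test IS_ENABLED(CONFIG_ARM64) instead of taking
 * a parameter.
 */
static bool hot_add_enabled(bool is_arm64)
{
	if (is_arm64) {
		pr_info("Memory hot add disabled on ARM64\n");
		return false;
	}

	return true;
}
```

With helpers along these lines, the two assignments in
balloon_connect_vsp() would each become a single call, and the
decision and its log message cannot drift apart.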

Michael