Re: [PATCH 6/9] VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.

From: dmitry.torokhov@xxxxxxxxx
Date: Fri Jun 12 2015 - 11:31:23 EST


On Fri, Jun 12, 2015 at 03:06:56PM +0000, Philip Moltmann wrote:
> Hi,
>
> thanks for taking so much interest in this driver. It is quite good
> that our design choices get scrutinized by non-current VMware
> employees.
>
>
> > I understand that you negotiate the capabilities between hypervisor and
> > the balloon driver, however that was not my concern (and I am sorry that
> > I did not express it properly).
> >
> > The patch description stated:
> >
> > "Before this patch the slow memory transfer would cause the destination
> > VM to have internal swapping until all memory is transferred. Now the
> > memory is transferred fast enough so that the destination VM does not
> > swap."
> >
> > As far as I understand the improvements in memory transfer speed hinge
> > on the availability of batched operations, you however remove the limits
> > on non-sleep allocations unconditionally. Thus my question: on older
> > ESXi's that do not support batched operations won't this cause VM to
> > start swapping?
>
> Three improvements contribute to the overall faster speed:
> - batched operations reduce the hypervisor overhead per page
> - 2M instead of 4K buffers reduce the hypervisor overhead per page
> - removing the rate-limiting for non-sleep allocations allows the guest
> operating system to reclaim memory as fast as it can instead of
> artificially limiting it.
>
> Any of these improvements is great by itself and helps a lot. The
> combination of all three makes a rather dramatic difference.
>
> We cause hypervisor-level swapping if the balloon driver does not
> reclaim fast enough. As any of these improvements increases reclamation
> speed, we reduce swapping risk in any case.
>
> Unfortunately the first two improvements rely on hypervisor support,
> the last does not.

As far as I can understand, the justification for removing the limit
(improvement #3) is that we have #1 and #2; at least that's how I read
the patch description. I am asking: what if you are running on a
hypervisor that supports neither #1 nor #2? What was the first release
of ESXi that supports batching and 2M pages? What about Workstation (I
don't recall if it started using ballooning at some point)?
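
To make the question concrete, here is a rough sketch of the kind of
conditional behaviour I have in mind: keep the old cap on non-sleep
allocations only when the hypervisor offers neither batching nor 2M
pages. This is not taken from the patch; the capability bit names, the
helper name and the cap value are all made up for illustration.

```c
#include <limits.h>

/* Hypothetical capability bits, illustrative names only. */
#define CAP_BATCHED_CMDS    (1u << 0)	/* batched lock/unlock supported */
#define CAP_BATCHED_2M_CMDS (1u << 1)	/* batched 2M page ops supported */

/* Hypothetical legacy cap on non-sleep allocations per cycle. */
#define LEGACY_NOSLEEP_ALLOC_MAX 16384u

/*
 * Return the per-cycle limit for non-sleep allocations given the
 * capabilities negotiated with the hypervisor.
 * UINT_MAX means "no limit".
 */
static unsigned int nosleep_alloc_limit(unsigned int caps)
{
	/* Either fast path present: do not rate-limit. */
	if (caps & (CAP_BATCHED_CMDS | CAP_BATCHED_2M_CMDS))
		return UINT_MAX;

	/* Neither #1 nor #2 available: fall back to the old cap. */
	return LEGACY_NOSLEEP_ALLOC_MAX;
}
```

Whether such a fallback is needed at all depends on the answer to the
questions above.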

Thanks.

--
Dmitry