Re: [PATCH v2] virtio_balloon: Fix endless deflation and inflation on arm64

From: Michael S. Tsirkin
Date: Mon Oct 02 2023 - 18:30:59 EST


On Mon, Oct 02, 2023 at 01:50:45PM +0200, David Hildenbrand wrote:
> On 25.09.23 01:58, Gavin Shan wrote:
> > Hi David and Michael,
> >
> > On 8/31/23 11:10, Gavin Shan wrote:
> > > A deflation request whose target isn't aligned to the guest page
> > > size causes endless deflation and inflation. For example, we
> > > receive a flood of QMP events reporting changes to the memory
> > > balloon's size after a deflation request with an unaligned target
> > > is sent to an ARM64 guest, which uses a 64KB base page size.
> > >
> > > /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
> > > -accel kvm -machine virt,gic-version=host -cpu host \
> > > -smp maxcpus=8,cpus=8,sockets=2,clusters=2,cores=2,threads=1 \
> > > -m 1024M,slots=16,maxmem=64G \
> > > -object memory-backend-ram,id=mem0,size=512M \
> > > -object memory-backend-ram,id=mem1,size=512M \
> > > -numa node,nodeid=0,memdev=mem0,cpus=0-3 \
> > > -numa node,nodeid=1,memdev=mem1,cpus=4-7 \
> > > : \
> > > -device virtio-balloon-pci,id=balloon0,bus=pcie.10
> > >
> > > { "execute" : "balloon", "arguments": { "value" : 1073672192 } }
> > > {"return": {}}
> > > {"timestamp": {"seconds": 1693272173, "microseconds": 88667}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272174, "microseconds": 89704}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272175, "microseconds": 90819}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272176, "microseconds": 91961}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272177, "microseconds": 93040}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
> > > {"timestamp": {"seconds": 1693272178, "microseconds": 94117}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
> > > {"timestamp": {"seconds": 1693272179, "microseconds": 95337}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272180, "microseconds": 96615}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
> > > {"timestamp": {"seconds": 1693272181, "microseconds": 97626}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272182, "microseconds": 98693}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
> > > {"timestamp": {"seconds": 1693272183, "microseconds": 99698}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272184, "microseconds": 100727}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272185, "microseconds": 90430}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > {"timestamp": {"seconds": 1693272186, "microseconds": 102999}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
> > > :
> > > <The similar QMP events repeat>
> > >
> > > Fix it by aligning the target up to the guest page size, 64KB in this
> > > specific case. With this applied, no flood of QMP events is observed
> > > and the memory balloon's size stabilizes at 0x3ffe0000 soon
> > > after the deflation request is sent.
> > >
> > > { "execute" : "balloon", "arguments": { "value" : 1073672192 } }
> > > {"return": {}}
> > > {"timestamp": {"seconds": 1693273328, "microseconds": 793075}, \
> > > "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
> > > { "execute" : "query-balloon" }
> > > {"return": {"actual": 1073610752}}
> > >
> > > Signed-off-by: Gavin Shan <gshan@xxxxxxxxxx>
> > > Tested-by: Zhenyu Zhang <zhenyzha@xxxxxxxxxx>
> > > ---
> > > v2: Align @num_pages up to the guest page size in towards_target()
> > > directly as David suggested.
> > > ---
> > > drivers/virtio/virtio_balloon.c | 6 +++++-
> > > 1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> >
> > If the patch looks good, could you please merge it for Linux 6.6-rc4? It's
> > needed by our downstream, so I hope it can land upstream as early as
> > possible. Thanks a lot.
>
> @MST, I cannot spot it in your usual vhost git yet. Should I pick it up or
> what are your plans?

Yes - I merged it but I'm still testing my tree. Will be in
the next pull request.
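
For reference, the arithmetic behind the quoted fix can be sketched as below. This is only an illustration under stated assumptions, not the code from the patch: the helper names are hypothetical, the balloon is assumed to count 4KB units, and the host is assumed to derive the balloon size as (RAM size - requested value) / 4KB. With the numbers from the log, the balloon target is 17 units, which a 64KB-page guest cannot reach exactly since it moves 16 units at a time. That would explain the oscillation between 16 and 32 units. Aligning up to 32 units gives the stable value reported above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the balloon is accounted in 4KB units. */
#define BALLOON_PAGE_SIZE 4096u

/* Hypothetical helper: round n up to the next multiple of step. */
static uint64_t round_up_to(uint64_t n, uint64_t step)
{
    assert(step > 0);
    return ((n + step - 1) / step) * step;
}

/*
 * Hypothetical helper: balloon size (in 4KB units) the guest would aim
 * for once the target is aligned up to its own page size.
 *
 * ram_bytes       - guest RAM size, e.g. 1024M from "-m 1024M"
 * target_bytes    - the "value" argument of the QMP "balloon" command
 * guest_page_size - guest base page size in bytes, e.g. 64KB on this guest
 */
static uint64_t aligned_balloon_pages(uint64_t ram_bytes,
                                      uint64_t target_bytes,
                                      uint64_t guest_page_size)
{
    uint64_t pages = (ram_bytes - target_bytes) / BALLOON_PAGE_SIZE;
    uint64_t pages_per_guest_page = guest_page_size / BALLOON_PAGE_SIZE;

    return round_up_to(pages, pages_per_guest_page);
}

/* Hypothetical helper: the "actual" value expected once the balloon settles. */
static uint64_t resulting_actual_bytes(uint64_t ram_bytes,
                                       uint64_t target_bytes,
                                       uint64_t guest_page_size)
{
    uint64_t pages = aligned_balloon_pages(ram_bytes, target_bytes,
                                           guest_page_size);

    return ram_bytes - pages * BALLOON_PAGE_SIZE;
}
```

Calling resulting_actual_bytes(1073741824, 1073672192, 65536) yields 1073610752, i.e. 0x3ffe0000, matching the query-balloon result in the quoted description. With a 4KB guest page size the target is already aligned and the result is simply the requested 1073672192.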