Re: [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG

From: Tetsuo Handa
Date: Sat Dec 23 2017 - 23:45:19 EST


Matthew Wilcox wrote:
> > + unsigned long pfn = page_to_pfn(page);
> > + int ret;
> > +
> > + *pfn_min = min(pfn, *pfn_min);
> > + *pfn_max = max(pfn, *pfn_max);
> > +
> > + do {
> > + if (xb_preload(GFP_NOWAIT | __GFP_NOWARN) < 0)
> > + return -ENOMEM;
> > +
> > + ret = xb_set_bit(&vb->page_xb, pfn);
> > + xb_preload_end();
> > + } while (unlikely(ret == -EAGAIN));
>
> OK, so you don't need a spinlock because you're under a mutex? But you
> can't allocate memory because you're in the balloon driver, and so a
> GFP_KERNEL allocation might recurse into your driver?

Right. We can't depend (directly or indirectly) on __GFP_DIRECT_RECLAIM && !__GFP_NORETRY
allocations because the balloon driver needs to handle the OOM notifier callback.

> Would GFP_NOIO
> do the job? I'm a little hazy on exactly how the balloon driver works.

GFP_NOIO implies __GFP_DIRECT_RECLAIM. In the worst case, it can lock up due to
the "too small to fail" memory allocation rule. GFP_NOIO | __GFP_NORETRY would work
only if there is really a guarantee that GFP_NOIO | __GFP_NORETRY never depends on
__GFP_DIRECT_RECLAIM && !__GFP_NORETRY allocations, and the direct reclaim
dependency is too complicated for me to validate.
I consider !__GFP_DIRECT_RECLAIM to be the future-safe choice.
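
To make the flag reasoning concrete, here is a rough userspace sketch. The bit
values and the SK_ names are made up for illustration (the real definitions are
in include/linux/gfp.h); only the composition of the masks, as I read that
header, matters. The two helpers are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; SK_GFP_X stands in for the kernel's __GFP_X. */
#define SK_GFP_IO             0x01u
#define SK_GFP_FS             0x02u
#define SK_GFP_DIRECT_RECLAIM 0x04u
#define SK_GFP_KSWAPD_RECLAIM 0x08u
#define SK_GFP_NORETRY        0x10u
#define SK_GFP_NOWARN         0x20u

/* Composite masks, mirroring how the kernel builds them from the bits above. */
#define SK_GFP_RECLAIM (SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM)
#define SK_GFP_NOWAIT  (SK_GFP_KSWAPD_RECLAIM)
#define SK_GFP_NOIO    (SK_GFP_RECLAIM)
#define SK_GFP_KERNEL  (SK_GFP_RECLAIM | SK_GFP_IO | SK_GFP_FS)

/* Hypothetical helper: may this allocation enter direct reclaim at all? */
static bool sk_may_direct_reclaim(unsigned int gfp)
{
	return (gfp & SK_GFP_DIRECT_RECLAIM) != 0;
}

/*
 * Hypothetical helper: is this a __GFP_DIRECT_RECLAIM && !__GFP_NORETRY
 * allocation, i.e. one that may keep retrying instead of failing?
 */
static bool sk_may_retry_forever(unsigned int gfp)
{
	return sk_may_direct_reclaim(gfp) && !(gfp & SK_GFP_NORETRY);
}
```

With these definitions, GFP_NOWAIT | __GFP_NOWARN never enters direct reclaim,
GFP_NOIO is a may-retry-forever allocation, and GFP_NOIO | __GFP_NORETRY stops
retrying but still enters direct reclaim, which is the part I can't validate.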

>
> If you can't preload with anything better than that, I think that
> xb_set_bit() should attempt an allocation with GFP_NOWAIT | __GFP_NOWARN,
> and then you can skip the preload; it has no value for you.

Yes, that's why I suggest calling kzalloc() directly in xb_set_bit().
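
Something along these lines, as a userspace sketch rather than a real xbitmap
implementation. All sk_ names, the chunk layout and the sizes are assumptions
for illustration; calloc() merely stands in for
kzalloc(size, GFP_NOWAIT | __GFP_NOWARN).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define SK_CHUNK_BITS    1024UL
#define SK_NR_CHUNKS     64UL
#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Toy sparse bitmap: chunks are allocated only when a bit in them is set. */
struct sk_xb {
	unsigned long *chunk[SK_NR_CHUNKS];
};

/* Stand-in for a non-sleeping zeroed allocation; it may simply return NULL. */
static void *sk_zalloc_nowait(size_t size)
{
	return calloc(1, size);
}

/* Set a bit, allocating the backing chunk on demand; no preload step. */
static int sk_xb_set_bit(struct sk_xb *xb, unsigned long bit)
{
	unsigned long idx = bit / SK_CHUNK_BITS;
	unsigned long off = bit % SK_CHUNK_BITS;

	assert(xb != NULL);
	if (idx >= SK_NR_CHUNKS)
		return -EINVAL;
	if (!xb->chunk[idx]) {
		xb->chunk[idx] = sk_zalloc_nowait(SK_CHUNK_BITS / CHAR_BIT);
		if (!xb->chunk[idx])
			return -ENOMEM;	/* caller decides how to cope */
	}
	xb->chunk[idx][off / SK_BITS_PER_LONG] |= 1UL << (off % SK_BITS_PER_LONG);
	return 0;
}

/* Return 1 if the bit is set, 0 otherwise (including unallocated chunks). */
static int sk_xb_test_bit(const struct sk_xb *xb, unsigned long bit)
{
	unsigned long idx = bit / SK_CHUNK_BITS;
	unsigned long off = bit % SK_CHUNK_BITS;

	if (idx >= SK_NR_CHUNKS || !xb->chunk[idx])
		return 0;
	return (int)((xb->chunk[idx][off / SK_BITS_PER_LONG] >>
		      (off % SK_BITS_PER_LONG)) & 1UL);
}
```

The point is only that the allocation attempt and the -ENOMEM return live
inside the set-bit call, so the caller needs neither xb_preload() nor the
-EAGAIN retry loop.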

>
> > @@ -173,8 +292,15 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >
> > while ((page = balloon_page_pop(&pages))) {
> > balloon_page_enqueue(&vb->vb_dev_info, page);
> > + if (use_sg) {
> > + if (xb_set_page(vb, page, &pfn_min, &pfn_max) < 0) {
> > + __free_page(page);
> > + continue;
> > + }
> > + } else {
> > + set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> > + }
>
> Is this the right behaviour?

I don't think so. In the worst case, xb_set_page() fails for every page and we
set no bits at all.

> If we can't record the page in the xb,
> wouldn't we rather send it across as a single page?
>

I think that we need to be able to fall back to the !use_sg path on OOM.
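
A rough userspace sketch of the shape I have in mind. This is not what the
patch does today: the sk_ names, the xb_fails knob that simulates -ENOMEM, and
the array size are all hypothetical, and the sketch assumes the host can accept
the PFN array even when the sg feature was negotiated.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_MAX_PFNS 256

struct sk_balloon {
	bool use_sg;
	bool xb_fails;			/* test knob: simulate allocation failure */
	size_t num_in_xb;		/* pages recorded via the bitmap path */
	unsigned long pfns[SK_MAX_PFNS];/* legacy PFN array */
	size_t num_pfns;
};

/* Stand-in for recording a page in the xbitmap; may fail with -ENOMEM. */
static int sk_xb_record(struct sk_balloon *vb, unsigned long pfn)
{
	(void)pfn;
	if (vb->xb_fails)
		return -ENOMEM;
	vb->num_in_xb++;
	return 0;
}

/* Try the sg path first; on failure, fall back to the PFN array. */
static int sk_record_page(struct sk_balloon *vb, unsigned long pfn)
{
	assert(vb != NULL);
	if (vb->use_sg && sk_xb_record(vb, pfn) == 0)
		return 0;

	/* Fallback: the page is still reported instead of being freed again. */
	if (vb->num_pfns >= SK_MAX_PFNS)
		return -ENOSPC;
	vb->pfns[vb->num_pfns++] = pfn;
	return 0;
}
```

With this shape, a failed bitmap allocation costs only the sg optimization for
that page; the page itself still makes it to the host.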