Re: [GIT PULL] VFIO fix for v6.0-rc5

From: John Hubbard
Date: Fri Sep 09 2022 - 14:58:28 EST


On 9/9/22 05:02, David Hildenbrand wrote:
> On 09.09.22 13:53, Linus Torvalds wrote:
>> On Fri, Sep 9, 2022 at 6:52 AM Alex Williamson
>> <alex.williamson@xxxxxxxxxx> wrote:
>>>
>>> VFIO fix for v6.0-rc5
>>>
>>> - Fix zero page refcount leak (Alex Williamson)
>>
>> Ugh. This is disgusting.
>>
>> Don't get me wrong - I've pulled this, but I think there's some deeper
>> problem that made this patch required.
>>
>> Why is pin_user_pages_remote() taking a reference to a reserved page?
>> Maybe it just shouldn't (and then obviously we should fix the unpin
>> case to match too).
>>
>> Adding a few GUP people to the participants for comments.
>>
>> Anybody?
>
> I mentioned in an offline discussion to Alex that we should teach the
> pin/unpin interface to not mess with the zeropage at all (i.e., not
> adjust the refcount and eventually overflow it).

I don't think this is part of the problem. And I sure hope that it's
not. If you can make such a change, and contain it within the gup.c
layer so that callers can still think they are pinning the zero page,
then OK, that works.

But generally, callers expect to be able to pin the zero page, or at
least to believe that they've done so. :) Sorting that out and treating
the zero page separately would probably require larger changes across
quite a lot of subsystems.

As a case in point, and one very close to my heart: I'm about to add a
pin_user_page*() caller to block/bio that pins the zero page, after
discussing it with block/fs people [1].

>
> We decided that the unbalanced pin/unpin should be fixed independently,
> such that the refcount handling change on pin/unpin stays GUP internal.
>
>

OK, good.

[1] https://lore.kernel.org/all/20220124221709.kzsaqkdp3gmjie3z@xxxxxxxxxx/

thanks,

--
John Hubbard
NVIDIA