Re: [RESEND PATCH net] mm: page_alloc: fix ref bias in page_frag_alloc() for 1-byte allocs

From: David Miller
Date: Thu Feb 14 2019 - 17:21:05 EST


From: Jann Horn <jannh@xxxxxxxxxx>
Date: Thu, 14 Feb 2019 22:26:22 +0100

> On Thu, Feb 14, 2019 at 6:13 PM David Miller <davem@xxxxxxxxxxxxx> wrote:
>>
>> From: Jann Horn <jannh@xxxxxxxxxx>
>> Date: Wed, 13 Feb 2019 22:45:59 +0100
>>
>> > The basic idea behind ->pagecnt_bias is: If we pre-allocate the maximum
>> > number of references that we might need to create in the fastpath later,
>> > the bump-allocation fastpath only has to modify the non-atomic bias value
>> > that tracks the number of extra references we hold instead of the atomic
>> > refcount. The maximum number of allocations we can serve (under the
>> > assumption that no allocation is made with size 0) is nc->size, so that's
>> > the bias used.
>> >
>> > However, even when all memory in the allocation has been given away, a
>> > reference to the page is still held; and in the `offset < 0` slowpath, the
>> > page may be reused if everyone else has dropped their references.
>> > This means that the necessary number of references is actually
>> > `nc->size+1`.
>> >
>> > Luckily, from a quick grep, it looks like the only path that can call
>> > page_frag_alloc(fragsz=1) is TAP with the IFF_NAPI_FRAGS flag, which
>> > requires CAP_NET_ADMIN in the init namespace and is only intended to be
>> > used for kernel testing and fuzzing.
>> >
>> > To test for this issue, put a `WARN_ON(page_ref_count(page) == 0)` in the
>> > `offset < 0` path, below the virt_to_page() call, and then repeatedly call
>> > writev() on a TAP device with IFF_TAP|IFF_NO_PI|IFF_NAPI_FRAGS|IFF_NAPI,
>> > with a vector consisting of 15 elements containing 1 byte each.
>> >
>> > Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
>>
>> Applied and queued up for -stable.
>
> I had sent a v2 at Alexander Duyck's request an hour before you
> applied the patch (with a minor difference that, in Alexander's
> opinion, might be slightly more efficient). I guess the net tree
> doesn't work like the mm tree, where patches can get removed and
> replaced with newer versions? So if Alexander wants that change
> (s/size/PAGE_FRAG_CACHE_MAX_SIZE/ in the refcount), someone has to
> send that as a separate patch?

Yes, please send a follow-up. Sorry about that.