Re: [PATCH v2 net-next 2/3] skbuff: (re)use NAPI skb cache on allocation path

From: Alexander Lobakin
Date: Thu Jan 14 2021 - 07:45:39 EST


From: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Date: Thu, 14 Jan 2021 12:47:31 +0100

> On Thu, Jan 14, 2021 at 12:41 PM Alexander Lobakin <alobakin@xxxxx> wrote:
>>
>> From: Eric Dumazet <edumazet@xxxxxxxxxx>
>> Date: Wed, 13 Jan 2021 15:36:05 +0100
>>
>>> On Wed, Jan 13, 2021 at 2:37 PM Alexander Lobakin <alobakin@xxxxx> wrote:
>>>>
>>>> Instead of calling kmem_cache_alloc() every time a NAPI skb is
>>>> built, (re)use skbuff_heads from napi_alloc_cache.skb_cache. Previously
>>>> this cache was only used for bulk-freeing skbuff_heads consumed via
>>>> napi_consume_skb() or __kfree_skb_defer().
>>>>
>>>> Typical path is:
>>>> - skb is queued for freeing from driver or stack, its skbuff_head
>>>> goes into the cache instead of immediate freeing;
>>>> - driver or stack requests NAPI skb allocation, an skbuff_head is
>>>> taken from the cache instead of allocation.
>>>>
>>>> Corner cases:
>>>> - if it's empty on skb allocation, bulk-allocate the first half;
>>>> - if it's full on skb consuming, bulk-wipe the second half.
>>>>
>>>> Also try to balance its size after completing network softirqs
>>>> (__kfree_skb_flush()).
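
(The path described above boils down to roughly the following on the
allocation side. This is an illustrative sketch, not the patch verbatim;
NAPI_SKB_CACHE_HALF stands for half of the cache size.)

static struct sk_buff *napi_skb_cache_get(void)
{
	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);

	if (unlikely(!nc->skb_count))
		/* empty on skb allocation: bulk-allocate the first half */
		nc->skb_count = kmem_cache_alloc_bulk(skbuff_head_cache,
						      GFP_ATOMIC,
						      NAPI_SKB_CACHE_HALF,
						      nc->skb_cache);
	if (unlikely(!nc->skb_count))
		return NULL;

	return nc->skb_cache[--nc->skb_count];
}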
>>>
>>> I do not see the point of doing this rebalance (especially if we do not change
>>> its name describing its purpose more accurately).
>>>
>>> For moderate load, we will have a reduced bulk size (typically one or two).
>>> Number of skbs in the cache is in [0, 64[, so there is really no risk of
>>> letting skbs sit there for a long period of time.
>>> (32 * sizeof(sk_buff) = 8192)
>>> I would personally get rid of this function completely.
>>
>> When I had a cache of 128 entries, I had worse results without this
>> function. But it seems I forgot to retest when I switched back to the
>> original size of 64.
>> I also thought about removing this function entirely; I'll test that.
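
For context, the function in question does roughly the following in
v2 (again a sketch of the logic, not the diff verbatim, with
NAPI_SKB_CACHE_HALF standing for half of the cache size):

void __kfree_skb_flush(void)
{
	struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache);

	/* balance the cache: bulk-wipe everything above the half mark */
	if (nc->skb_count > NAPI_SKB_CACHE_HALF) {
		kmem_cache_free_bulk(skbuff_head_cache,
				     nc->skb_count - NAPI_SKB_CACHE_HALF,
				     nc->skb_cache + NAPI_SKB_CACHE_HALF);
		nc->skb_count = NAPI_SKB_CACHE_HALF;
	}
}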
>>
>>> Also it seems you missed my KASAN support request ?
>>> I guess this is a matter of using kasan_unpoison_range(), we can ask for help.
>>
>> I saw your request, but I don't see a reason for doing this.
>> We are not caching already-freed skbuff_heads. They don't get
>> kmem_cache_free()d before getting into the local cache. KASAN poisons
>> them no earlier than at kmem_cache_free() (or did I miss something?).
>> Heads being cached have just dropped all their references; at the
>> moment they enter the cache, they are pretty much the same as freshly
>> allocated ones.
>
> KASAN should not report false positives in this case.
> But I think Eric meant preventing false negatives. If we kmalloc 17
> bytes, KASAN will detect out-of-bounds accesses beyond these 17 bytes.
> But if we put that data into a 128-byte block, KASAN will miss
> out-of-bounds accesses beyond 17 bytes and up to 128 bytes.
> The same holds for "logical" use-after-frees, where an object is free
> but has not been freed back into the slab.
>
> An important custom cache should use annotations like
> kasan_poison_object_data/kasan_unpoison_range.
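
To make the false-negative case concrete, consider a hypothetical
custom cache (obj_cache_get()/obj_cache_put() are made-up helpers, not
anything from this patch) that carves 17-byte objects out of 128-byte
slab blocks:

	char *p = obj_cache_get();

	p[20] = 0;	/* out of bounds for the object, but inside the
			 * live 128-byte slab block: KASAN stays silent
			 * unless the cache poisons the unused tail */

	obj_cache_put(p);

	p[0] = 0;	/* "logical" use-after-free: the slab block is
			 * still allocated, so KASAN misses this too
			 * without kasan_poison_object_data() on put */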

As I understand it, I should
kasan_poison_object_data(skbuff_head_cache, skb) when putting an skb
into the cache, and then kasan_unpoison_range(skb, sizeof(*skb)) when
taking it back out?
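
I.e. something like this on the two paths (sketch only; the kasan_*
calls are the <linux/kasan.h> API, the surrounding lines paraphrase
the patch):

	/* put path: napi_consume_skb() / __kfree_skb_defer() */
	kasan_poison_object_data(skbuff_head_cache, skb);
	nc->skb_cache[nc->skb_count++] = skb;

	/* get path: reusing a cached skbuff_head on allocation */
	skb = nc->skb_cache[--nc->skb_count];
	kasan_unpoison_range(skb, sizeof(*skb));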

Thanks,
Al