Subject: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition

From: Li,Rongqing
Date: Sun Dec 15 2019 - 23:02:46 EST




> -----Original Message-----
> From: Yunsheng Lin [mailto:linyunsheng@xxxxxxxxxx]
> Sent: December 16, 2019 9:51
> To: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
> Cc: Li,Rongqing <lirongqing@xxxxxxxxx>; Saeed Mahameed
> <saeedm@xxxxxxxxxxxx>; ilias.apalodimas@xxxxxxxxxx;
> jonathan.lemon@xxxxxxxxx; netdev@xxxxxxxxxxxxxxx; mhocko@xxxxxxxxxx;
> peterz@xxxxxxxxxxxxx; Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>;
> bhelgaas@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Björn Töpel
> <bjorn.topel@xxxxxxxxx>
> Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE
> condition
>
> On 2019/12/13 16:48, Jesper Dangaard Brouer wrote:
> > You are basically saying that the NUMA check should be moved to
> > allocation time, as it is running the RX-CPU (NAPI). And eventually
> > after some time the pages will come from correct NUMA node.
> >
> > I think we can do that, and only affect the semi-fast-path.
> > We just need to handle that pages in the ptr_ring that are recycled
> > can be from the wrong NUMA node. In __page_pool_get_cached() when
> > consuming pages from the ptr_ring (__ptr_ring_consume_batched), then
> > we can evict pages from wrong NUMA node.
>
> Yes, that's workable.
>
> >
> > For the pool->alloc.cache we either accept, that it will eventually
> > after some time be emptied (it is only in a 100% XDP_DROP workload that
> > it will continue to reuse same pages). Or we simply clear the
> > pool->alloc.cache when calling page_pool_update_nid().
>
> Simply clearing the pool->alloc.cache when calling page_pool_update_nid()
> seems better.
>

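For reference, the two ideas quoted above might look roughly like the sketches below. These are illustrations only, not posted patches: they assume the existing __ptr_ring_consume() and __page_pool_return_page() helpers and the PP_ALLOC_CACHE_REFILL constant in net/core/page_pool.c, and page_pool_refill_evict() is a made-up name.

/* Sketch 1: refill pool->alloc.cache from the ptr_ring, evicting
 * pages that sit on the wrong NUMA node.  Caller must hold
 * r->consumer_lock, as in __page_pool_get_cached().
 */
static int page_pool_refill_evict(struct page_pool *pool)
{
	struct ptr_ring *r = &pool->ring;
	int curr_nid = numa_mem_id();
	struct page *page;
	int count = 0;

	while (count < PP_ALLOC_CACHE_REFILL) {
		page = __ptr_ring_consume(r);
		if (!page)
			break;

		if (likely(page_to_nid(page) == curr_nid))
			pool->alloc.cache[count++] = page;
		else
			__page_pool_return_page(pool, page);
	}

	return count;
}

/* Sketch 2: "simply clearing pool->alloc.cache" on a node change,
 * flushing the cached pages through the same return helper.  Must
 * run in the softirq context that owns pool->alloc.
 */
void page_pool_update_nid(struct page_pool *pool, int new_nid)
{
	WRITE_ONCE(pool->p.nid, new_nid);

	while (pool->alloc.count)
		__page_pool_return_page(pool,
					pool->alloc.cache[--pool->alloc.count]);
}
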
Compared to those, how about the code below? The driver can configure p.nid to any node; it will be adjusted during NAPI polling, so IRQ migration will not be a problem, but it does add a check to the hot path.

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index a6aefe989043..4374a6239d17 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -108,6 +108,10 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
 		if (likely(pool->alloc.count)) {
 			/* Fast-path */
 			page = pool->alloc.cache[--pool->alloc.count];
+
+			if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
+				WRITE_ONCE(pool->p.nid, numa_mem_id());
+
 			return page;
 		}
 		refill = true;
@@ -155,6 +159,10 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	if (pool->p.order)
 		gfp |= __GFP_COMP;
 
+
+	if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
+		WRITE_ONCE(pool->p.nid, numa_mem_id());
+
 	/* FUTURE development:
 	 *
 	 * Current slow-path essentially falls back to single page
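
With this approach the driver no longer has to pick a node at setup time; a hypothetical creation helper (the function name and pool size below are illustrative) could simply pass NUMA_NO_NODE and let the checks added above fix it up from NAPI context:

#include <net/page_pool.h>

/* Hypothetical driver-side pool creation: p.nid starts as
 * NUMA_NO_NODE and is rewritten to numa_mem_id() on the first
 * allocation in NAPI context, so an IRQ migration corrects itself.
 */
static struct page_pool *rxq_page_pool_create(struct device *dev)
{
	struct page_pool_params pp = {
		.order		= 0,
		.flags		= 0,
		.pool_size	= 256,		/* illustrative */
		.nid		= NUMA_NO_NODE,	/* adjusted at alloc time */
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,
	};

	return page_pool_create(&pp);
}
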
Thanks

-Li