Re: [PATCH] neighbour: guarantee the localhost connections be established successfully even the ARP table is full

From: Ratheesh Kannoth
Date: Mon Mar 11 2024 - 09:52:49 EST


On 2024-03-11 at 17:54:01, Zheng Li (lizheng043@xxxxxxxxx) wrote:
>
> Inter-process communication on localhost should be established successfully even when the ARP table is full.
> Many processes on a server machine communicate over localhost, such as the command-line interface (CLI),
> and servers expect all CLI commands to execute successfully even when the ARP table is full.
> Right now, CLI commands time out when the ARP table is full.
> Set the exempt_from_gc parameter to true for LOOPBACK net devices to
> keep the localhost neigh entry in the ARP table, so that it is not removed by gc.
>
> Steps to reproduce:
> on a server with "gc_thresh3 = 1024" set, ping the server from more than 1024 IPv4 addresses,
> then run "ssh localhost" on the console interface; the command will time out.
This does not look correct to me. Why should gc behave differently for loopback devices?
Why wouldn't a higher gc_thresh3 value (fine-tuned to your use case) solve the issue?
Can't you add the localhost ARP entry statically and avoid the gc issue?
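
To make the questions above concrete, here is a rough userspace model in C of
how a hard limit interacts with gc-exempt entries. It is only a sketch, not
kernel code: toy_table, toy_neigh_alloc and toy_localhost_reachable are
hypothetical names, and it assumes forced gc cannot reclaim anything because
every existing entry was used recently. As far as I understand, a permanent
(static) entry is exempt from gc in the same way the patch makes loopback
entries exempt, so lo_exempt stands in for both cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a neighbour table: only the hard limit and a counter. */
struct toy_table {
	int gc_thresh3;	/* hard limit on entries tracked by gc */
	int gc_entries;	/* entries currently counted against the limit */
};

/*
 * Hypothetical allocator. Assumption: forced gc reclaims nothing,
 * so a full table rejects every non-exempt allocation.
 */
static bool toy_neigh_alloc(struct toy_table *t, bool exempt_from_gc)
{
	if (exempt_from_gc)
		return true;	/* not counted, never rejected */
	if (t->gc_entries >= t->gc_thresh3)
		return false;	/* table full */
	t->gc_entries++;
	return true;
}

/*
 * Fill the table from `peers` remote addresses, then try to create
 * the localhost entry. Returns true if that last allocation succeeds.
 */
static bool toy_localhost_reachable(int gc_thresh3, int peers, bool lo_exempt)
{
	struct toy_table t = { .gc_thresh3 = gc_thresh3, .gc_entries = 0 };

	assert(gc_thresh3 > 0 && peers >= 0);
	for (int i = 0; i < peers; i++)
		(void)toy_neigh_alloc(&t, false);
	return toy_neigh_alloc(&t, lo_exempt);
}
```

In this model, toy_localhost_reachable(1024, 1025, false) returns false, which
matches the reported timeout. Either passing lo_exempt = true or raising the
limit, e.g. toy_localhost_reachable(4096, 1025, false), makes it return true,
which is why tuning gc_thresh3 or a static entry looks sufficient to me.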

>
> Signed-off-by: Zheng Li <James.Z.Li@xxxxxxxx>
> ---
> net/core/neighbour.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 552719c3bbc3..d96dee3d4af6 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -734,7 +734,10 @@ ___neigh_create(struct neigh_table *tbl, const void *pkey,
>  struct neighbour *__neigh_create(struct neigh_table *tbl, const void *pkey,
>  				 struct net_device *dev, bool want_ref)
>  {
> -	return ___neigh_create(tbl, pkey, dev, 0, false, want_ref);
> +	if (dev->flags & IFF_LOOPBACK)
> +		return ___neigh_create(tbl, pkey, dev, 0, true, want_ref);
> +	else
> +		return ___neigh_create(tbl, pkey, dev, 0, false, want_ref);
>  }
>  EXPORT_SYMBOL(__neigh_create);
>
> --
> 2.17.1
>