RE: Scaling problem with a lot of AF_PACKET sockets on different interfaces

From: David Laight
Date: Fri Jun 07 2013 - 09:31:46 EST


> > I have a Linux router with a lot of interfaces (hundreds or
> > thousands of VLANs) and an application that creates AF_PACKET
> > socket per interface and bind()s sockets to interfaces.
...
> > I noticed that box has strange performance problems with
> > most of the CPU time spent in __netif_receive_skb:
> > 86.15% [k] __netif_receive_skb
> > 1.41% [k] _raw_spin_lock
> > 1.09% [k] fib_table_lookup
> > 0.99% [k] local_bh_enable_ip
...
> > This corresponds to:
> >
> > net/core/dev.c:
> >     type = skb->protocol;
> >     list_for_each_entry_rcu(ptype,
> >                     &ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
> >             if (ptype->type == type &&
> >                 (ptype->dev == null_or_dev || ptype->dev == skb->dev ||
> >                  ptype->dev == orig_dev)) {
> >                     if (pt_prev)
> >                             ret = deliver_skb(skb, pt_prev, orig_dev);
> >                     pt_prev = ptype;
> >             }
> >     }
> >
> > Which works perfectly OK until there are a lot of AF_PACKET sockets, since
> > the socket adds a protocol to ptype list:

Presumably the 'ethertype' is the same for all the sockets?
(And probably the '& PTYPE_HASH_MASK' doesn't separate it from 0800
or 0806 (IIRC IP and ARP).)
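
For reference, a rough user-space sketch of how the bucket index falls out
of the mask. It assumes PTYPE_HASH_SIZE is 16 (so the mask keeps the low
four bits); ptype_bucket() is a made-up helper, not a kernel function, and
it takes the ethertype in host order to avoid the ntohs().

```c
#include <stdint.h>

/* Assumed values; check include/linux/netdevice.h for the tree in use. */
#define PTYPE_HASH_SIZE 16
#define PTYPE_HASH_MASK (PTYPE_HASH_SIZE - 1)

/* Hypothetical helper: bucket index for an ethertype given in host order. */
static unsigned int ptype_bucket(uint16_t ethertype)
{
	return ethertype & PTYPE_HASH_MASK;
}
```

Under that assumption 0800 lands in bucket 0 and 0806 in bucket 6, but
8100 also lands in bucket 0. Either way, every socket bound with the same
ethertype ends up on one chain, together with the normal handler for that
type.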

How often is that deliver_skb() inside the loop called?
If the code could be arranged so that the scan loop didn't contain
a function call then the loop code would be a lot faster since
the compiler can cache values in registers.
While that would speed the code up somewhat, there would still be a
significant cost to iterate 1000+ times.
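
Something like the following is what I mean by keeping the call out of
the scan loop. This is only a user-space sketch: struct handler,
struct dev, BATCH and deliver_matching() are all invented names, there is
no RCU, and it ignores both the orig_dev test and the pt_prev trick of the
real code. The scan collects matches into a small array and the calls
happen in a second loop.

```c
#include <stddef.h>
#include <stdint.h>

struct dev { int ifindex; };		/* stand-in for struct net_device */

struct handler {			/* stand-in for struct packet_type */
	uint16_t type;
	const struct dev *dev;		/* NULL means "any device" */
	int (*func)(struct handler *h);
	unsigned int hits;
	struct handler *next;
};

#define BATCH 8				/* arbitrary batch size */

/* Sample callback: just records that the handler was invoked. */
static int note_delivery(struct handler *h)
{
	h->hits++;
	return 0;
}

/* Returns the number of handlers the packet was delivered to. */
static int deliver_matching(struct handler *head, uint16_t type,
			    const struct dev *dev)
{
	struct handler *batch[BATCH];
	struct handler *h = head;
	int delivered = 0;

	while (h) {
		size_t n = 0, i;

		/* Scan loop: compares only, no function calls. */
		for (; h && n < BATCH; h = h->next)
			if (h->type == type && (!h->dev || h->dev == dev))
				batch[n++] = h;

		/* Delivery loop: all the calls happen here. */
		for (i = 0; i < n; i++) {
			batch[i]->func(batch[i]);
			delivered++;
		}
	}
	return delivered;
}
```

The scan still walks every entry on the chain, which is why this alone
does not fix the 1000+ socket case.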

Looks like the ptype_base[] should be per 'dev'?
Or just put entries where ptype->dev != null_or_dev on a per-interface
list and do two searches?
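
A sketch of the "two searches" idea, again in user space and with
invented names (the ptype_specific[] member, add_handler(), scan() and
receive() are assumptions for illustration, not existing kernel code).
Wildcard handlers stay in the global table, device-bound ones hang off the
device, and receive does one lookup in each.

```c
#include <stddef.h>
#include <stdint.h>

#define PTYPE_HASH_SIZE 16		/* assumed, as above */
#define PTYPE_HASH_MASK (PTYPE_HASH_SIZE - 1)

struct handler {
	uint16_t type;			/* ethertype, host order in this sketch */
	unsigned int hits;
	struct handler *next;
};

struct dev {
	/* Hypothetical per-device table for handlers bound to this device. */
	struct handler *ptype_specific[PTYPE_HASH_SIZE];
};

/* Global table now holds only handlers that match any device. */
static struct handler *ptype_base[PTYPE_HASH_SIZE];

static void add_handler(struct handler **table, struct handler *h)
{
	struct handler **head = &table[h->type & PTYPE_HASH_MASK];

	h->next = *head;
	*head = h;
}

/* Walk one chain, "delivering" to every handler of the given type. */
static int scan(struct handler *h, uint16_t type)
{
	int n = 0;

	for (; h; h = h->next)
		if (h->type == type) {
			h->hits++;
			n++;
		}
	return n;
}

/* Two searches: the wildcard chain, then the receiving device's chain. */
static int receive(struct dev *dev, uint16_t type)
{
	unsigned int b = type & PTYPE_HASH_MASK;

	return scan(ptype_base[b], type) + scan(dev->ptype_specific[b], type);
}
```

With one bound socket per interface, the per-packet cost then depends on
the wildcard handlers plus the one chain on the receiving device, not on
the total number of interfaces.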

David
