Re: Possible netns creation and execution performance/scalability regression since v3.8 due to rcu callbacks being offloaded to multiple cpus

From: Eric W. Biederman
Date: Wed Jun 11 2014 - 03:08:41 EST


Rafael Tinoco <rafael.tinoco@xxxxxxxxxxxxx> writes:

> Paul E. McKenney, Eric Biederman, David Miller (and/or anyone else interested):
>
> It was brought to my attention that netns creation/execution might
> have suffered scalability/performance regression after v3.8.
>
> I would like you, or anyone interested, to review these charts/data
> and check if there is something that could be discussed/said before I
> move further.
>
> The following script was used for all the tests and charts generation:

> ====
> #!/bin/bash
> IP=/sbin/ip
>
> function add_fake_router_uuid() {
> j=`uuidgen`
> $IP netns add bar-${j}
> $IP netns exec bar-${j} $IP link set lo up
> $IP netns exec bar-${j} sysctl -w net.ipv4.ip_forward=1 > /dev/null
> k=`echo $j | cut -b -11`
> $IP link add qro-${k} type veth peer name qri-${k} netns bar-${j}
> $IP link add qgo-${k} type veth peer name qgi-${k} netns bar-${j}
> }
>
> for i in `seq 1 $1`; do
> if [ `expr $i % 250` -eq 0 ]; then
> echo "$i by `date +%s`"
> fi
> add_fake_router_uuid
> done

[snip long explanation]
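
For reading the numbers: the script above prints a line of the form
"<count> by <epoch seconds>" every 250 namespaces. A minimal sketch for
turning that output into netns/sec per interval follows. The function
name netns_rates and the script name in the usage comment are made up
for illustration, and date +%s only has one-second resolution, so short
intervals come out coarse.

```shell
#!/bin/bash
# netns_rates: read "<count> by <epoch>" lines on stdin and print
# "<count> <netns/sec>" for each interval between progress lines.
# Hypothetical helper, not part of the original script.
netns_rates() {
    awk '
        $2 == "by" {
            # The first progress line only sets the baseline.
            if (!seen) { prev_n = $1; prev_t = $3; seen = 1; next }
            # Lines that land in the same second are folded into the
            # next interval, which also avoids dividing by zero.
            if ($3 > prev_t) {
                printf "%d %.1f\n", $1, ($1 - prev_n) / ($3 - prev_t)
                prev_n = $1; prev_t = $3
            }
        }'
}

# Usage (script name is an assumption):
#   ./netns-test.sh 1000 | netns_rates
```

Any line that is not a progress line is ignored, so stray output from
the script does not disturb the result.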

> I was able to see that, from the script above, the following lines
> cause a major impact on netns scalability/performance:
>
> 1) ip netns add -> huge performance regression:
>
> 1 cpu: no regression
> 4 cpu: regression for NOCB_CPU_ALL
>
> obs: regression from 250 netns/sec to 50 netns/sec on 500 netns
> already created mark

copy_net_ns, except possibly in the per_net callbacks, does not use
rcu, so I am mystified. A little more digging to figure out which
rcu usage is causing the problem would be very interesting.
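
One cheap way to narrow this down before reaching for function_graph is
to time each step of the script separately. The sketch below is only
that: the helper names time_step and add_many and the probe-* namespace
names are hypothetical, it assumes GNU date (for %N) and root privileges
for the ip calls, and it shows which step is slow, not which rcu call
site is responsible.

```shell
#!/bin/bash
# time_step LABEL COUNT CMD [ARGS...]
# Run CMD COUNT times and print "LABEL: <ms> ms total, <ms> ms/call".
# The body runs in a subshell so its variables do not leak.
time_step() (
    label=$1
    count=$2
    shift 2
    [ "$count" -gt 0 ] || return 1
    start=$(date +%s%N)          # nanoseconds since the epoch (GNU date)
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" > /dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s%N)
    total_ms=$(( (end - start) / 1000000 ))
    echo "$label: $total_ms ms total, $(( total_ms / count )) ms/call"
)

# add_many N: create N namespaces with distinct, hypothetical names.
add_many() (
    i=0
    while [ "$i" -lt "$1" ]; do
        ip netns add "probe-$$-$i"
        i=$((i + 1))
    done
)

# Usage (as root; remove the namespaces afterwards with "ip netns del"):
#   time_step "netns add x100" 1 add_many 100
#   time_step "netns exec" 100 ip netns exec "probe-$$-0" true
```

Running the two usage lines on kernels with and without NOCB_CPU_ALL
would show whether creation and exec regress independently.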

> 2) ip netns exec -> some performance regression
>
> 1 cpu: no regression
> 4 cpu: regression for NOCB_CPU_ALL
>
> obs: regression from 40 netns (+1 exec per netns creation) to 20
> netns/sec on 500 netns created mark

The performance regression is probably in setns().
switch_task_namespaces is occasionally a choke point.

At one point I was playing with ideas on how to use the task lock
instead of rcu to protect nsproxy, as the original reason we could
not use task_lock appears to have disappeared.

That could be worth playing with.


> ========
>
> FULL NOTE: http://people.canonical.com/~inaddy/lp1328088/
>
> ** Assumption: RCU callbacks being offloaded to multiple cpus
> (cpumask_setall) caused regression in
> copy_net_ns<-create_new_namespaces or unshare(CLONE_NEWNET).
>
> ** Next Steps: I'll probably begin to function_graph netns creation execution

Eric

