Re: [PATCH] af_packet: Raw socket destruction warning fix

From: Eric Dumazet
Date: Wed Feb 10 2016 - 09:56:13 EST


On Wed, 2016-02-10 at 12:43 +0000, Vaneet Narang wrote:
> Hi,
>
> >What driver are you using (is that in-tree)? Can you reproduce the same issue
> >with a latest -net kernel, for example (or, a 'reasonably' recent one like 4.3 or
> >4.4)? There has been quite a bit of changes in err queue handling (which also
> >accounts rmem) as well. How reliably can you trigger the issue? Does it trigger
> >with a completely different in-tree network driver as well with your tests? Would
> >be useful to track/debug sk_rmem_alloc increases/decreases to see from which path
> >new rmem is being charged in the time between packet_release() and packet_sock_destruct()
> >for that socket ...
> >
> It looks like a race condition between packet_rcv and packet_close. We tried to reproduce
> the issue by adding a delay in skb_set_owner_r, and with that it reproduces quite frequently.
> We added some traces and, after analyzing them, identified the following possible race condition.



Even if you add a delay in skb_set_owner_r(), this should not allow the
dismantle phase to complete, since at least one CPU is still in an
rcu_read_lock() section.

synchronize_rcu() must not return until every CPU has passed through an
RCU quiescent state.

packet_close() should certainly not be called while another CPU is still
in the middle of packet_rcv().
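
To illustrate the ordering argument, here is a minimal userspace sketch. It is
not kernel code and not how RCU is implemented: a plain reader counter stands
in for the grace-period machinery, and every name in it (toy_sock,
toy_read_lock(), toy_synchronize(), toy_rcv(), toy_teardown_demo()) is
hypothetical. It only shows that a delay inside the read-side section makes
the teardown wait longer; it does not let the teardown overtake the receiver.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket: only the receive memory counter. */
struct toy_sock {
	atomic_int rmem_alloc;
};

static atomic_int readers;               /* readers inside a read-side section */
static _Atomic(struct toy_sock *) hook;  /* published receive hook */
static atomic_int reader_inside;         /* set once the receiver holds sk */

/* Toy read-side markers: a counter, NOT the kernel's RCU implementation. */
static void toy_read_lock(void)   { atomic_fetch_add(&readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&readers, 1); }

/* Toy grace period: wait until no reader is left inside a section. */
static void toy_synchronize(void)
{
	while (atomic_load(&readers) != 0)
		sched_yield();
}

/* Receive path: looks up the socket and charges memory under the lock. */
static void *toy_rcv(void *arg)
{
	(void)arg;
	toy_read_lock();
	struct toy_sock *sk = atomic_load(&hook);
	if (sk) {
		atomic_store(&reader_inside, 1);
		/* Artificial delay, like the one added for the reproduction. */
		for (int i = 0; i < 1000; i++)
			sched_yield();
		atomic_fetch_add(&sk->rmem_alloc, 1);   /* charge receive memory */
	}
	toy_read_unlock();
	return NULL;
}

/* Teardown path: unpublish, wait for readers, purge, then check. */
int toy_teardown_demo(void)
{
	struct toy_sock sk;
	pthread_t t;

	atomic_init(&sk.rmem_alloc, 0);
	atomic_store(&readers, 0);
	atomic_store(&reader_inside, 0);
	atomic_store(&hook, &sk);

	if (pthread_create(&t, NULL, toy_rcv, NULL) != 0)
		return -1;

	/* Start the teardown only once the receiver is inside its section. */
	while (!atomic_load(&reader_inside))
		sched_yield();

	atomic_store(&hook, (struct toy_sock *)NULL);   /* unpublish the hook */
	toy_synchronize();                              /* wait for the receiver */

	/* The receiver has left its section, so its charge is already visible. */
	int charged = atomic_load(&sk.rmem_alloc);
	atomic_store(&sk.rmem_alloc, 0);                /* purge the queue */

	pthread_join(t, NULL);

	/* Destructor-style check: nothing may be charged after the purge. */
	int leftover = atomic_load(&sk.rmem_alloc);
	assert(leftover == 0);
	return charged == 1 && leftover == 0;
}
```

Calling toy_teardown_demo() from a main() and building with -pthread should
return 1: the charge lands before the purge, however long the delay is. If a
charge can show up after the purge in the real code, then something lets the
teardown proceed without waiting for the receiver, and that is what needs to
be found.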

Your patch does not address the root cause.