Re: [PATCH AUTOSEL 4.19 46/84] tcp/dccp: fix possible race __inet_lookup_established()

From: Michal Kubecek
Date: Thu Jan 09 2020 - 12:07:51 EST


On Thu, Jan 09, 2020 at 10:32:26AM -0500, Sasha Levin wrote:
> On Thu, Jan 02, 2020 at 01:31:22PM +0530, Naresh Kamboju wrote:
> > On Fri, 27 Dec 2019 at 23:17, Sasha Levin <sashal@xxxxxxxxxx> wrote:
> > >
> > > From: Eric Dumazet <edumazet@xxxxxxxxxx>
> > >
> > > [ Upstream commit 8dbd76e79a16b45b2ccb01d2f2e08dbf64e71e40 ]
> > >
> > > Michal Kubecek and Firo Yang did a very nice analysis of crashes
> > > happening in __inet_lookup_established().
> > >
> > > Since a TCP socket can go from TCP_ESTABLISH to TCP_LISTEN
> > > (via a close()/socket()/listen() cycle) without a RCU grace period,
> > > I should not have changed listeners linkage in their hash table.
> > >
> > > They must use the nulls protocol (Documentation/RCU/rculist_nulls.txt),
> > > so that a lookup can detect a socket in a hash list was moved in
> > > another one.
> > >
> > > Since we added code in commit d296ba60d8e2 ("soreuseport: Resolve
> > > merge conflict for v4/v6 ordering fix"), we have to add
> > > hlist_nulls_add_tail_rcu() helper.
> >
> > The kernel panic reported on all devices,
> > While running LTP syscalls accept* test cases on stable-rc-4.19 branch kernel.
> > This report log extracted from qemu_x86_64.
> >
> > Reverting this patch re-solved kernel crash.
>
> I'll drop it until we can look into what's happening here, thanks!

It was already discussed here:

http://lkml.kernel.org/r/CA+G9fYv3=oJSFodFp4wwF7G7_g5FWYRYbc4F0AMU6jyfLT689A@xxxxxxxxxxxxxx

and a fixed version should be in the 4.19, 4.14 and 4.9 stable branches now.
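
For background on the nulls protocol the quoted commit message refers to,
here is a minimal single-threaded userspace sketch in C. It is not the
kernel implementation: all names (struct node, make_nulls(), chain_end(),
lookup(), the fixed NBUCKETS table) are hypothetical, and RCU, memory
barriers and locking are deliberately left out. It only illustrates the
idea that each chain ends in a marker encoding its own bucket, so a walker
that ends on a different marker knows it was carried onto another chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 4

/* Hypothetical node: 'next' holds either a pointer to the next node or
 * an odd "nulls" marker that encodes the bucket the chain belongs to. */
struct node {
    uintptr_t next;
    int key;
};

static uintptr_t heads[NBUCKETS];

/* Low bit set marks end-of-chain; the remaining bits carry the bucket. */
static uintptr_t make_nulls(unsigned bucket) { return ((uintptr_t)bucket << 1) | 1; }
static int is_nulls(uintptr_t p) { return (int)(p & 1); }
static unsigned nulls_value(uintptr_t p) { return (unsigned)(p >> 1); }

/* Every empty chain consists only of its own nulls marker. */
static void table_init(void)
{
    for (unsigned i = 0; i < NBUCKETS; i++)
        heads[i] = make_nulls(i);
}

static void insert_head(unsigned bucket, struct node *n)
{
    /* Node addresses must be even so they never look like a marker. */
    assert(((uintptr_t)n & 1) == 0);
    n->next = heads[bucket];
    heads[bucket] = (uintptr_t)n;
}

/* Follow 'next' links from 'pos' and report which bucket's marker
 * terminates the walk. */
static unsigned chain_end(uintptr_t pos)
{
    while (!is_nulls(pos))
        pos = ((struct node *)pos)->next;
    return nulls_value(pos);
}

/* Lookup that restarts when the walk ends on a foreign marker, which
 * would mean a node was moved to another chain while it was traversed. */
static struct node *lookup(unsigned bucket, int key)
{
    for (;;) {
        uintptr_t pos = heads[bucket];

        while (!is_nulls(pos)) {
            struct node *n = (struct node *)pos;

            if (n->key == key)
                return n;
            pos = n->next;
        }
        if (nulls_value(pos) == bucket)
            return NULL;    /* ended on our own chain: a real miss */
        /* ended on another bucket's marker: retry from the head */
    }
}
```

In this sketch, a walker that still holds a pointer to a node which has
since been re-inserted into a different bucket reaches that other bucket's
marker, and the mismatch with the bucket it started from is what tells it
to restart.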

Michal Kubecek