Re: Multitude of dst obsolescence race conditions

From: dormando
Date: Wed May 14 2014 - 14:01:14 EST


> On Wed, May 14, 2014, at 2:57, dormando wrote:
> > Given a machine with frequently changing routes (ie; a router with an
> > active internet BGP table and multiple interfaces), there're at least
> > several places where obsolete dst's are handled improperly. If I pause the
> > route changes, the crashes appear to stop. This first one has a crash
> > utility we've made, so I was able to more quickly find a patch and test
> > it. The others take time to reproduce.
> >
> > I'm testing against 3.10.39, but I think if these were fixed they'd be
> > backported to stable? I've also had recent 3.12's running that have
> > crashed in the same spots. Anyway correct me if I'm wrong...
>
> Just a hunch:
> You use macvlan? Could you somehow try without?
> Maybe... some ref overflow? (You could add some testing code in dst_hold
> with atomic_inc_return and WARN_ON).
>
> dst_release already contains such a check, so I am not sure at all if
> that could happen.
>
> Bye,
>
> Hannes
>

We've seen the crashes with macvlan removed. I don't think I've explicitly
removed it recently, or for the UDP crash, but I'm sorta doubting that'd
make a difference.

And yeah, pretty weird, right? It's as if RCU isn't working.
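
For reference, the kind of check Hannes suggests for dst_hold could look
roughly like the userspace sketch below. It is not kernel code: struct
dst_sketch, warn_on(), dst_sketch_hold(), dst_sketch_release() and the
dst_sketch_warnings counter are hypothetical stand-ins made up for
illustration, and C11 atomic_fetch_add() (which returns the old value) stands
in for the kernel's atomic_inc_return() (which returns the new one). The
assumption is that a wrapped or negative reference count is the symptom worth
warning about.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the refcount in struct dst_entry. */
struct dst_sketch {
    atomic_int refcnt;
};

/* Counts how many suspicious refcount transitions were seen. */
static int dst_sketch_warnings;

/* Stand-in for WARN_ON(): report and count the condition, never abort. */
static bool warn_on(bool cond, const char *what)
{
    if (cond) {
        dst_sketch_warnings++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Take a reference and warn if the counter was already bad or wraps. */
static void dst_sketch_hold(struct dst_sketch *dst)
{
    assert(dst != NULL);
    /* C11 atomics wrap silently on signed overflow, so inspect the old value. */
    int old = atomic_fetch_add(&dst->refcnt, 1);
    warn_on(old < 0 || old == INT_MAX, "dst hold: refcount negative or overflowed");
}

/* Drop a reference and warn if the new count would be negative. */
static void dst_sketch_release(struct dst_sketch *dst)
{
    assert(dst != NULL);
    int old = atomic_fetch_sub(&dst->refcnt, 1);
    warn_on(old <= 0, "dst release: refcount went negative");
}
```

In a real kernel the equivalent would be a temporary debugging patch to
dst_hold(), not something to leave in the fast path.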