Re: general protection fault in fib6_purge_rt

From: Xin Long
Date: Wed Mar 20 2019 - 15:08:57 EST


On Thu, Mar 21, 2019 at 12:54 AM Jon Maloy <jon.maloy@xxxxxxxxxxxx> wrote:
>
>
>
> > -----Original Message-----
> > From: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
> > Sent: 20-Mar-19 17:41
> > To: Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> > Cc: syzbot <syzbot+a25307ad099309f1c2b9@xxxxxxxxxxxxxxxxxxxxxxxxx>;
> > davem@xxxxxxxxxxxxx; kuznet@xxxxxxxxxxxxx; linux-
> > kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; syzkaller-
> > bugs@xxxxxxxxxxxxxxxx; tipc-discussion@xxxxxxxxxxxxxxxxxxxxx;
> > ying.xue@xxxxxxxxxxxxx; yoshfuji@xxxxxxxxxxxxxx
> > Subject: Re: general protection fault in fib6_purge_rt
> >
> > On Wed, Mar 20, 2019 at 4:59 PM Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> > wrote:
> > >
> > > This one identifies the same culprit as
> > syzbot+9d4c12bfd45a58738d0a@xxxxxxxxxxxxxxxxxxxxxxxxx, but points to a
> > different bug.
> > > That bug has also been fixed, in commit adba75be0d23 ("tipc: fix lockdep
> > warning when reinitilaizing sockets"), applied in 4.20 but not present in 4.16, -
> > the source of the dump.
> > > Once again, a dump from 4.20/5.0 might be a help.
Hi, Jon,

I was running the reproducer against the net.git kernel which includes
commit adba75be0d23.

Another panic showed up:

[ 156.086487] ==================================================================
[ 156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc]
[ 156.089740] Read of size 8 at addr ffff88802fdb1be8 by task swapper/1/0
[ 156.091120]
[ 156.091471] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0.test.syz #257
[ 156.092873] Hardware name: Red Hat KVM, BIOS seabios-1.7.5-8.el7 04/01/2014
[ 156.094315] Call Trace:
[ 156.094844] <IRQ>
[ 156.095306] dump_stack+0x7c/0xc0
[ 156.096040] ? tipc_disc_timeout+0x9c9/0xb20 [tipc]
[ 156.097346] print_address_description+0x65/0x22e
[ 156.098360] ? tipc_disc_timeout+0x9c9/0xb20 [tipc]
[ 156.099408] ? tipc_disc_timeout+0x9c9/0xb20 [tipc]
[ 156.100445] kasan_report.cold.3+0x37/0x7a
[ 156.101348] ? tipc_disc_timeout+0x9c9/0xb20 [tipc]
[ 156.102402] tipc_disc_timeout+0x9c9/0xb20 [tipc]
[ 156.103641] ? tipc_disc_msg_xmit.isra.19+0x180/0x180 [tipc]
[ 156.104830] ? __lock_is_held+0xb4/0x140
[ 156.105669] ? call_timer_fn+0xd1/0x610
[ 156.106517] call_timer_fn+0x19a/0x610
[ 156.107342] ? tipc_disc_msg_xmit.isra.19+0x180/0x180 [tipc]
[ 156.108538] ? timer_fixup_init+0x30/0x30
[ 156.109411] ? _raw_spin_unlock_irq+0x29/0x40
[ 156.110343] ? tipc_disc_msg_xmit.isra.19+0x180/0x180 [tipc]
[ 156.111545] ? tipc_disc_msg_xmit.isra.19+0x180/0x180 [tipc]
[ 156.112749] run_timer_softirq+0xb51/0x1090
[ 156.113656] ? add_timer+0x8d0/0x8d0
[ 156.114433] ? kvm_sched_clock_read+0x14/0x30
[ 156.115355] ? sched_clock+0x5/0x10
[ 156.116124] __do_softirq+0x236/0xa1c
[ 156.116943] irq_exit+0x281/0x2d0
[ 156.117657] smp_apic_timer_interrupt+0x172/0x5d0
[ 156.118658] apic_timer_interrupt+0xf/0x20


I think it's caused by d->timer not being deleted before the netns was
destroyed, so tipc_disc_timeout() still used d->net after it had been freed.
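
To make the lifetime problem concrete, here is a minimal userspace sketch of
it. This is not kernel code: model_net, model_disc and the model_* functions
are hypothetical stand-ins for the netns, the discoverer and the timer calls,
and the "free" only sets a flag so the model itself stays memory-safe.

```c
/* Userspace model only: hypothetical stand-ins, not the TIPC or timer API. */
#include <stdbool.h>

struct model_net {
	bool freed;		/* set instead of really freeing the object */
};

struct model_disc {
	struct model_net *net;	/* plays the role of d->net */
	bool timer_pending;	/* plays the role of d->timer being armed */
};

/* Stand-in for del_timer_sync(&d->timer). */
static void model_del_timer(struct model_disc *d)
{
	d->timer_pending = false;
}

/* Stand-in for the netns being destroyed. */
static void model_free_net(struct model_net *net)
{
	net->freed = true;
}

/*
 * Stand-in for the timer firing.
 * Returns  0 if no timer was pending,
 *          1 if the callback ran against a live net,
 *         -1 if the callback would have touched a freed net.
 */
static int model_timer_fire(struct model_disc *d)
{
	if (!d->timer_pending)
		return 0;
	return d->net->freed ? -1 : 1;
}

/* Run the scenario with or without deleting the timer before the free. */
static int model_scenario(bool del_timer_first)
{
	struct model_net net = { .freed = false };
	struct model_disc d = { .net = &net, .timer_pending = true };

	if (del_timer_first)
		model_del_timer(&d);
	model_free_net(&net);
	return model_timer_fire(&d);
}
```

In this model, model_scenario(false) corresponds to the case I suspect here
(timer still armed when the net goes away), and model_scenario(true) to the
case where the timer is deleted first.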

I looked at the __net_exit path; the timer should have been deleted via:
tipc_exit_net() ->
tipc_net_stop()->
tipc_bearer_stop()->
bearer_disable()->
tipc_disc_delete()->
del_timer_sync(&d->timer)

but because of the if (!self) check, tipc_net_stop() returned early and
never got that far.

It seems to me that whether tipc_bearer_stop()/tipc_node_stop() are called
for a netns shouldn't depend on tipc_net(net)->node_addr.
Can we just remove that if (!self) check from tipc_net_stop() to fix it?
Also, it seems tipc_nametbl_stop() already cleans up the nametbl, so
should tipc_nametbl_withdraw() be removed from tipc_net_stop() as well?

diff --git a/net/tipc/net.c b/net/tipc/net.c
index f076edb..3647984 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -163,12 +163,6 @@ void tipc_sched_net_finalize(struct net *net, u32 addr)

 void tipc_net_stop(struct net *net)
 {
-	u32 self = tipc_own_addr(net);
-
-	if (!self)
-		return;
-
-	tipc_nametbl_withdraw(net, TIPC_CFG_SRV, self, self, self);
 	rtnl_lock();
 	tipc_bearer_stop(net);
 	tipc_node_stop(net);
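
As a rough illustration of what the change above is meant to achieve, here is
a small userspace model of tipc_net_stop() with and without the early return.
It is only a sketch under my assumptions: model_ns and the model_* functions
are made-up names, and the whole bearer teardown is reduced to clearing one
flag.

```c
/* Userspace model only: hypothetical names, not the real TIPC code. */
#include <stdbool.h>

struct model_ns {
	unsigned int node_addr;		/* plays the role of tipc_own_addr(net) */
	bool disc_timer_pending;	/* a discovery timer armed on a bearer */
};

/* Stand-in for tipc_bearer_stop() -> ... -> del_timer_sync(&d->timer). */
static void model_bearer_stop(struct model_ns *ns)
{
	ns->disc_timer_pending = false;
}

/* tipc_net_stop() before the diff: bails out when no node address is set. */
static void model_net_stop_old(struct model_ns *ns)
{
	if (!ns->node_addr)
		return;
	model_bearer_stop(ns);
}

/* tipc_net_stop() after the diff: always tears the bearers down. */
static void model_net_stop_new(struct model_ns *ns)
{
	model_bearer_stop(ns);
}

/* Returns true if the timer is still armed after the stop path has run. */
static bool model_timer_left_armed(unsigned int node_addr, bool use_new)
{
	struct model_ns ns = {
		.node_addr = node_addr,
		.disc_timer_pending = true,
	};

	if (use_new)
		model_net_stop_new(&ns);
	else
		model_net_stop_old(&ns);
	return ns.disc_timer_pending;
}
```

With node_addr == 0 the old variant leaves the timer armed while the new one
does not; with a non-zero address both clear it. Whether the real teardown is
safe to run unconditionally is exactly the question I'm asking above.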

> >
> >
> > Looking at the bisection log maybe this reproducer triggers multiple kernel
> > bugs.
>
> I think so.
>
> > All crashes including the latest ones and other info are always available on
> > the dashboard.
>
> Looking at the latest dashboard reports, I don't see anything that points to TIPC.
>
> ///jon
>
>
> >
> >
> > > ///jon
> > >
> > >
> > > > -----Original Message-----
> > > > From: syzbot
> > <syzbot+a25307ad099309f1c2b9@xxxxxxxxxxxxxxxxxxxxxxxxx>
> > > > Sent: 18-Mar-19 08:28
> > > > To: davem@xxxxxxxxxxxxx; Jon Maloy <jon.maloy@xxxxxxxxxxxx>;
> > > > kuznet@xxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > netdev@xxxxxxxxxxxxxxx; syzkaller-bugs@xxxxxxxxxxxxxxxx; tipc-
> > > > discussion@xxxxxxxxxxxxxxxxxxxxx; ying.xue@xxxxxxxxxxxxx;
> > > > yoshfuji@linux-ipv6.org
> > > > Subject: Re: general protection fault in fib6_purge_rt
> > > >
> > > > syzbot has bisected this bug to:
> > > >
> > > > commit 52dfae5c85a4c1078e9f1d5e8947d4a25f73dd81
> > > > Author: Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> > > > Date: Thu Mar 22 19:42:52 2018 +0000
> > > >
> > > > tipc: obtain node identity from interface by default
> > > >
> > > > bisection log:
> > https://syzkaller.appspot.com/x/bisect.txt?x=1116d2a3200000
> > > > start commit: 52dfae5c tipc: obtain node identity from interface by
> > defa..
> > > > git tree: linux-next
> > > > final crash:
> > https://syzkaller.appspot.com/x/report.txt?x=1316d2a3200000
> > > > console output:
> > > > https://syzkaller.appspot.com/x/log.txt?x=1516d2a3200000
> > > > kernel config:
> > > > https://syzkaller.appspot.com/x/.config?x=c8b6073d992e8217
> > > > dashboard link:
> > > > https://syzkaller.appspot.com/bug?extid=a25307ad099309f1c2b9
> > > > syz repro:
> > https://syzkaller.appspot.com/x/repro.syz?x=16b2c56f200000
> > > > C reproducer:
> > https://syzkaller.appspot.com/x/repro.c?x=13b8890b200000
> > > >
> > > > Reported-by: syzbot+a25307ad099309f1c2b9@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > Fixes: 52dfae5c ("tipc: obtain node identity from interface by
> > > > default")