Re: [PATCH] can: slcan: fix freed work crash

From: Marc Kleine-Budde
Date: Fri Dec 02 2022 - 10:27:35 EST


On 01.12.2022 08:34:26, Jiri Slaby (SUSE) wrote:
> The LTP test pty03 is causing a crash in slcan:
> BUG: kernel NULL pointer dereference, address: 0000000000000008
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 0 PID: 348 Comm: kworker/0:3 Not tainted 6.0.8-1-default #1 openSUSE Tumbleweed 9d20364b934f5aab0a9bdf84e8f45cfdfae39dab
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
> Workqueue: 0x0 (events)
> RIP: 0010:process_one_work (/home/rich/kernel/linux/kernel/workqueue.c:706 /home/rich/kernel/linux/kernel/workqueue.c:2185)
> Code: 49 89 ff 41 56 41 55 41 54 55 53 48 89 f3 48 83 ec 10 48 8b 06 48 8b 6f 48 49 89 c4 45 30 e4 a8 04 b8 00 00 00 00 4c 0f 44 e0 <49> 8b 44 24 08 44 8b a8 00 01 00 00 41 83 e5 20 f6 45 10 04 75 0e
> RSP: 0018:ffffaf7b40f47e98 EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffff9d644e1b8b48 RCX: ffff9d649e439968
> RDX: 00000000ffff8455 RSI: ffff9d644e1b8b48 RDI: ffff9d64764aa6c0
> RBP: ffff9d649e4335c0 R08: 0000000000000c00 R09: ffff9d64764aa734
> R10: 0000000000000007 R11: 0000000000000001 R12: 0000000000000000
> R13: ffff9d649e4335e8 R14: ffff9d64490da780 R15: ffff9d64764aa6c0
> FS: 0000000000000000(0000) GS:ffff9d649e400000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000008 CR3: 0000000036424000 CR4: 00000000000006f0
> Call Trace:
> <TASK>
> worker_thread (/home/rich/kernel/linux/kernel/workqueue.c:2436)
> kthread (/home/rich/kernel/linux/kernel/kthread.c:376)
> ret_from_fork (/home/rich/kernel/linux/arch/x86/entry/entry_64.S:312)
>
> Apparently, the slcan's tx_work is freed while being scheduled. While
> slcan_netdev_close() (netdev side) calls flush_work(&sl->tx_work),
> slcan_close() (tty side) does not. So when the netdev is never set UP,
> but the tty is stuffed with bytes and forced to wakeup write, the work
> is scheduled, but never flushed.
>
> So add an additional flush_work() to slcan_close() to be sure the work
> is flushed under all circumstances.
>
> The Fixes commit below moved flush_work() from slcan_close() to
> slcan_netdev_close(). What was the rationale behind it? Maybe we can
> drop the one in slcan_netdev_close()?
>
> I see the same pattern in can327. So it perhaps needs the very same fix.
>
> Fixes: cfcb4465e992 ("can: slcan: remove legacy infrastructure")
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1205597
> Reported-by: Richard Palethorpe <richard.palethorpe@xxxxxxxx>
> Tested-by: Petr Vorel <petr.vorel@xxxxxxxx>
> Cc: Dario Binacchi <dario.binacchi@xxxxxxxxxxxxxxxxxxxx>
> Cc: Wolfgang Grandegger <wg@xxxxxxxxxxxxxx>
> Cc: Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx>
> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> Cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> Cc: Paolo Abeni <pabeni@xxxxxxxxxx>
> Cc: linux-can@xxxxxxxxxxxxxxx
> Cc: netdev@xxxxxxxxxxxxxxx
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Max Staudt <max@xxxxxxxxx>
> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@xxxxxxxxxx>

Added to linux-can,

Thanks,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

Attachment: signature.asc
Description: PGP signature