AW: [syzbot] [ext4?] general protection fault in hrtimer_nanosleep

From: carsten.schmid@xxxxxxxxxxx
Date: Fri Nov 03 2023 - 07:18:03 EST


Hi,

> [ 125.919060][ C0] BUG: KASAN: stack-out-of-bounds in rb_next+0x10a/0x130
> [ 125.921169][ C0] Read of size 8 at addr ffffc900048e7c60 by task kworker/0:1/9
> [ 125.923235][ C0]
> [ 125.923243][ C0] CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.0-rc7-syzkaller-00142-g888cf78c29e2 #0
> [ 125.924546][ C0] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 125.926915][ C0] Workqueue: events nsim_dev_trap_report_work
> [ 125.929333][ C0]
> [ 125.929341][ C0] Call Trace:
> [ 125.929350][ C0] <IRQ>
> [ 125.929356][ C0] dump_stack_lvl+0xd9/0x1b0
> [ 125.931302][ C0] print_report+0xc4/0x620
> [ 125.932115][ C0] ? __virt_addr_valid+0x5e/0x2d0
> [ 125.933194][ C0] kasan_report+0xda/0x110
> [ 125.934814][ C0] ? rb_next+0x10a/0x130
> [ 125.936521][ C0] ? rb_next+0x10a/0x130
> [ 125.936544][ C0] rb_next+0x10a/0x130
> [ 125.936565][ C0] timerqueue_del+0xd4/0x140
> [ 125.936590][ C0] __remove_hrtimer+0x99/0x290
> [ 125.936613][ C0] __hrtimer_run_queues+0x55b/0xc10
> [ 125.936638][ C0] ? enqueue_hrtimer+0x310/0x310
> [ 125.936659][ C0] ? ktime_get_update_offsets_now+0x3bc/0x610
> [ 125.936688][ C0] hrtimer_interrupt+0x31b/0x800
> [ 125.936715][ C0] __sysvec_apic_timer_interrupt+0x105/0x3f0
> [ 125.936737][ C0] sysvec_apic_timer_interrupt+0x8e/0xc0
> [ 125.936755][ C0] </IRQ>
> [ 125.936759][ C0] <TASK>

I have seen similar, sporadic issues with 4.14 kernels (several patch levels: .147, .212, .247, .300) over the past 5 years, where the stack looked quite similar:

[ 432.041880] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
[ 432.048697] Modules linked in: intel_tfm_governor ecryptfs coretemp i2c_i801 sbi_apl snd_soc_skl sdw_cnl snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core snd_compress snd_soc_skl_ipc xhci_pci xhci_hcd sdw_bus crc8 ahci snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core libahci snd_hda_core libata snd_pcm usbcore mei_me snd_timer scsi_mod usb_common snd mei soundcore fuse 8021q inap560t(O) i915 video backlight intel_gtt i2c_algo_bit drm_kms_helper drm firmware_class igb_avb(O) ptp hwmon spi_pxa2xx_platform pps_core
[ 432.099672] CPU: 3 PID: 5729 Comm: dlt_segmented Tainted: G U O 4.14.244-apl #1
[ 432.108909] task: 00000000504d2561 task.stack: 000000007d0046fd
[ 432.115530] RIP: 0010:rb_erase_cached+0x31/0x3b0
[ 432.120683] RSP: 0018:ffffa31d84f77d40 EFLAGS: 00010006
[ 432.126517] RAX: 0000000000000001 RBX: ffffa31d84f77e30 RCX: 0000000000000000
[ 432.134485] RDX: 0000000000000000 RSI: ffff9ed077c1bb10 RDI: ffffa31d84f77e30
[ 432.142456] RBP: ffffa31d84f77d40 R08: ffffa31d84f77e30 R09: 0000a31d80a77c90
[ 432.150426] R10: ffff9ed077c1bee0 R11: 0000000000000400 R12: ffff9ed077c1bb10
[ 432.158394] R13: 0000000000000000 R14: ffff9ed077c1bac0 R15: 0000000000000000
[ 432.166366] FS: 00007ff718cce700(0000) GS:ffff9ed077d80000(0000) knlGS:0000000000000000
[ 432.175403] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 432.181819] CR2: 00007ff7182ca3e4 CR3: 000000026175c000 CR4: 00000000003406a0
[ 432.189790] Call Trace:
[ 432.192526] timerqueue_del+0x1d/0x40
[ 432.196617] __remove_hrtimer+0x37/0x70
[ 432.200898] hrtimer_try_to_cancel+0xa0/0x120
[ 432.205769] do_nanosleep+0xa9/0x180
[ 432.209765] ? kfree+0x169/0x180
[ 432.213370] hrtimer_nanosleep+0xbb/0x150
[ 432.217849] ? hrtimer_init+0x110/0x110
[ 432.222134] SyS_nanosleep+0x6d/0xa0
[ 432.226126] do_syscall_64+0x79/0x350
[ 432.230218] entry_SYSCALL_64_after_hwframe+0x41/0xa6
[ 432.235861] RIP: 0033:0x7ff7199b7240
[ 432.239850] RSP: 002b:00007ff718ccddf0 EFLAGS: 00000293 ORIG_RAX: 0000000000000023
[ 432.248309] RAX: ffffffffffffffda RBX: 00007ff718ccde20 RCX: 00007ff7199b7240
[ 432.256282] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007ff718ccde20
[ 432.264252] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 432.272222] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe333ec72e
[ 432.280190] R13: 00007ffe333ec72f R14: 0000000000802000 R15: 00007ffe333ec730
[ 432.288161] Code: 89 f8 4c 8b 4f 08 48 89 e5 4c 8b 57 10 74 0a 48 3b 7e 08 0f 84 a6 02 00 00 4d 85 d2 0f 84 28 02 00 00 4d 85 c9 0f 84 03 02 00 00 <49> 8b 51 10 4c 89 cf 4c 89 c8 48 85 d2 75 0b e9 65 02 00 00 48
[ 432.309346] RIP: rb_erase_cached+0x31/0x3b0 RSP: ffffa31d84f77d40

It looks like this is worth digging into.
Unfortunately I was never able to reproduce it, and I still can't. So I can't help with the digging, but I wanted to point out that this does not seem to be tied to a specific kernel version.

Thanks
Carsten
>>
>> Thanks,
>>
>> tglx
>>