On Thu, Nov 02 2023 at 18:49, Ben Greear wrote:
> And here is resulting splat from wireless-next tree I've been
> debugging.
> Note the subsequent splats from slub are due to some memory poisoning, for
> one reason or another. Maybe slub changes should not be included in this patch, not
> sure if it can provide useful info in other cases though.
> If I understand this correctly, then it appears the bug is related to
> the pps driver.
> 16140 Nov 02 17:28:25 ct523c-2103 kernel: ODEBUG: debugobjects: debug_obj allocated at:
> 16141 Nov 02 17:28:25 ct523c-2103 kernel: init_timer_key+0x24/0x160
> 16142 Nov 02 17:28:25 ct523c-2103 kernel: kobject_put+0x14f/0x190
> 16143 Nov 02 17:28:25 ct523c-2103 kernel: pps_device_destruct+0x26/0xb0
> 16144 Nov 02 17:28:25 ct523c-2103 kernel: device_release+0x57/0x100
> 16145 Nov 02 17:28:25 ct523c-2103 kernel: kobject_delayed_cleanup+0xdf/0x140
> 16146 Nov 02 17:28:25 ct523c-2103 kernel: process_one_work+0x475/0x920
> 16147 Nov 02 17:28:25 ct523c-2103 kernel: worker_thread+0x38a/0x680
Can you please provide proper kernel dmesg output next time instead of
this mess?
ODEBUG: free active (active state 0) object: ffff888181c029a0 object type: timer_list hint: kobject_delayed_cleanup+0x0/0x140
WARNING: CPU: 1 PID: 104 at lib/debugobjects.c:549 debug_print_object+0xf0/0x170
CPU: 1 PID: 104 Comm: kworker/1:10 Tainted: G W 6.6.0-rc7+ #17
Workqueue: events kobject_delayed_cleanup
RIP: 0010:debug_print_object+0xf0/0x170
debug_check_no_obj_freed+0x261/0x2b0
__kmem_cache_free+0x185/0x200
device_release+0x57/0x100
kobject_delayed_cleanup+0xdf/0x140
process_one_work+0x475/0x920
worker_thread+0x38a/0x680
So what happens is:
pps_unregister_cdev()
  device_destroy()
    put_device()
    device_unregister()
      device_del()
      put_device()      <- Drops final reference to dev->kobj
        schedule_delayed_work()

worker thread:
  kobject_delayed_cleanup()
    device_release()
      pps_device_destruct()
        cdev_del(&pps->cdev)
          kobject_put(&cdev->kobj)  <- Drops final reference
            schedule_delayed_work()
              init_timer(&cdev->kobj.release.timer);
              start_timer();
        ...
        kfree(dev);
        kfree(pps);     <- Debug object detects the active timer to be freed
                           because cdev and its kobject are embedded in
                           struct pps_device.
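
To make the sequence above concrete, here is a rough userspace sketch of
the same pattern. It is not kernel code: every fake_* name is made up for
illustration, the "timer" is just a flag, and the check on free only
stands in for what debugobjects does. It assumes nothing beyond the call
chain shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the timer behind a delayed release. */
struct fake_timer {
	bool active;
};

/* Hypothetical stand-in for a kobject with deferred cleanup. */
struct fake_kobject {
	int refcount;
	struct fake_timer release_timer;
};

/* The kobject is embedded, so freeing the container frees the timer too. */
struct fake_pps_device {
	struct fake_kobject cdev_kobj;
	int id;
};

/* Dropping the last reference only arms the timer; release is deferred. */
static void fake_kobject_put(struct fake_kobject *kobj)
{
	if (--kobj->refcount == 0)
		kobj->release_timer.active = true;
}

/*
 * Stand-in for the check done at free time: report whether the memory
 * being freed still contains an active timer, then free it.
 */
static bool fake_kfree_checked(struct fake_pps_device *pps)
{
	bool active = pps->cdev_kobj.release_timer.active;

	free(pps);
	return active;
}

/*
 * Replays the destruct path. extra_refs is the number of references held
 * on the embedded kobject besides the one dropped here.
 * Returns true if an active timer was freed.
 */
static bool fake_pps_device_destruct(int extra_refs)
{
	struct fake_pps_device *pps = calloc(1, sizeof(*pps));

	assert(pps != NULL);
	pps->cdev_kobj.refcount = 1 + extra_refs;

	fake_kobject_put(&pps->cdev_kobj);	/* cdev_del() step */
	return fake_kfree_checked(pps);		/* kfree(pps) step */
}
```

Calling fake_pps_device_destruct(0) drops the final reference, arms the
flag and then frees the container with the flag still set, which is the
situation the splat reports. With an extra reference held the flag is
never armed before the free.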
pps_device_destruct() is unfortunately no longer on the call trace of the
debug objects splat because kfree(pps) is a tail call.
So yes, that collected stacktrace is helpful.