Re: [PATCH RFC] RDMA/rtrs: Don't call kobject_del for srv_path->kobj

From: Guoqing Jiang
Date: Wed Feb 01 2023 - 23:09:54 EST




On 2/2/23 11:42, lizhijian@xxxxxxxxxxx wrote:
Thanks your reviewing @Guoqing

Thanks for the fix, it would be better to comment that why kobject_del can't
be called here.
commmit f7452a7e96c1 ("RDMA/rtrs-srv: fix memory leak by missing kobject free")
memtioned a free_sess(), but i didn't get it exactly(no such function ?). So i have
not sorted it out.

It was renamed to free_path I believe.

And hold on,with below change, kernel gets another WARNING.
I'm still working on it.

[ 125.034407] =========================
[ 125.035490] WARNING: held lock freed!
[ 125.036640] 6.2.0-rc6-roce-flush+ #53 Not tainted
[ 125.038088] -------------------------
[ 125.039143] kworker/0:3/63 is freeing memory ffff90bd01eaa000-ffff90bd01eaa7ff, with a lock still held there!
[ 125.041660] ffff90bd01eaa118 (&srv->paths_mutex){+.+.}-{3:3}, at: rtrs_srv_destroy_once_sysfs_root_folders.isra.0+0x24/0x60 [rtrs_server]
[ 125.044600] 3 locks held by kworker/0:3/63:
[ 125.046091] #0: ffff90bd05f8ed38 ((wq_completion)rtrs_server_wq){+.+.}-{0:0}, at: process_one_work+0x1f4/0x590
[ 125.048876] #1: ffff9ff4c085fe78 ((work_completion)(&srv_path->close_work)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x590
[ 125.051789] #2: ffff90bd01eaa118 (&srv->paths_mutex){+.+.}-{3:3}, at: rtrs_srv_destroy_once_sysfs_root_folders.isra.0+0x24/0x60 [rtrs_server]
[ 125.055167]
[ 125.055167] stack backtrace:
[ 125.057979] CPU: 0 PID: 63 Comm: kworker/0:3 Kdump: loaded Not tainted 6.2.0-rc6-roce-flush+ #53
[ 125.060201] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 125.063156] Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
[ 125.065031] Call Trace:
[ 125.065905] <TASK>
[ 125.066733] dump_stack_lvl+0x45/0x5d
[ 125.068003] debug_check_no_locks_freed.cold+0x77/0x7c
[ 125.069565] ? device_release+0x34/0x90
[ 125.070814] __kmem_cache_free+0x74/0x190
[ 125.072059] device_release+0x34/0x90
[ 125.073203] kobject_put+0x8b/0x1d0
[ 125.074385] rtrs_srv_destroy_once_sysfs_root_folders.isra.0+0x35/0x60 [rtrs_server]
[ 125.076456] rtrs_srv_close_work+0x21/0x270 [rtrs_server]
[ 125.078097] process_one_work+0x274/0x590
[ 125.079372] worker_thread+0x4f/0x3d0
[ 125.080508] ? __pfx_worker_thread+0x10/0x10
[ 125.081751] kthread+0xe7/0x110
[ 125.082778] ? __pfx_kthread+0x10/0x10
[ 125.083894] ret_from_fork+0x2c/0x50
[ 125.084991] </TASK>

Suppose it also happened during run your script, I guess it might be some
racy conditions.

$ while true;
do
echo "sessname=foo path=ip:<ip address> device_path=/dev/nvme0n1" > /sys/devices/virtual/rnbd-client/ctl/map_device
echo "normal" > /sys/block/rnbd0/rnbd/unmap_device
done


Or does it disappear after revert 6af4609c18b3? If so, we can revert it
first.

Thanks,
Guoqing