Re: [PATCH -next] block: fix blktrace debugfs entries leak

From: Christoph Hellwig
Date: Thu May 11 2023 - 11:28:17 EST


On Thu, May 11, 2023 at 02:56:33PM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> Commit 99d055b4fd4b ("block: remove per-disk debugfs files in
> blk_unregister_queue") moves blk_trace_shutdown() from
> blk_release_queue() to blk_unregister_queue(), this is safe if blktrace
> is created through sysfs, however, there are some regression in corner
> cases:
>
> 1) for scsi, passthrough io can still be issued after del_gendisk, and
> blktrace debugfs entries will be removed immediately after
> del_gendisk(), therefor passthrough io can't be tracked and blktrace
> will complain:
>
> failed read of /sys/kernel/debug/block/sdb/trace0: 5/Input/output error

But that is the right thing. The only thing that has a name is the
gendisk and it is gone at this point. Leaking the debugfs entries
that are named after, and ultimatively associated with the gendisk
(even if the code is still a bit confused about this) will create a lot
of trouble for us.

> 2) blktrace can still be enabled after del_gendisk() through ioctl if the
> disk is opened before del_gendisk(), and if blktrace is not shutdown
> through ioctl before closing the disk, debugfs entries will be
> leaked.

Yes.

> It seems 1) is not important, while 2) needs to be fixed apparently.
>
> Fix this problem by shutdown blktrace in blk_free_queue(),
> disk_release() is not used because scsi sg support blktrace without
> gendisk, and this is safe because queue is not freed yet, and
> blk_trace_shutdown() is reentrant.

I think disk_release is the right place for "normal" blktrace. The
odd cdev based blktrace for /dev/sg will need separate handling.
To be honest I'm not even sure how /dev/sg based passthrough is
even supposed to work in practice, but I'll need to spend some more
time to familarize myself with it.