Re: [PATCH v5 02/11] block: Block Device Filtering Mechanism

From: Sergei Shtepa
Date: Tue Jul 18 2023 - 12:34:21 EST




On 7/18/23 14:32, Yu Kuai wrote:
> Subject:
> Re: [PATCH v5 02/11] block: Block Device Filtering Mechanism
> From:
> Yu Kuai <yukuai1@xxxxxxxxxxxxxxx>
> Date:
> 7/18/23, 14:32
>
> To:
> Sergei Shtepa <sergei.shtepa@xxxxxxxxx>, Yu Kuai <yukuai1@xxxxxxxxxxxxxxx>, axboe@xxxxxxxxx, hch@xxxxxxxxxxxxx, corbet@xxxxxxx, snitzer@xxxxxxxxxx
> CC:
> viro@xxxxxxxxxxxxxxxxxx, brauner@xxxxxxxxxx, dchinner@xxxxxxxxxx, willy@xxxxxxxxxxxxx, dlemoal@xxxxxxxxxx, linux@xxxxxxxxxxxxxx, jack@xxxxxxx, ming.lei@xxxxxxxxxx, linux-block@xxxxxxxxxxxxxxx, linux-doc@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, Donald Buczek <buczek@xxxxxxxxxxxxx>, "yukuai (C)" <yukuai3@xxxxxxxxxx>
>
>
> Hi,
>
> On 2023/07/18 19:25, Sergei Shtepa wrote:
>> Hi.
>>
>> On 7/18/23 03:37, Yu Kuai wrote:
>>>
>>> Hi,
>>>
>>> On 2023/07/17 22:39, Sergei Shtepa wrote:
>>>>
>>>>
>>>> On 7/11/23 04:02, Yu Kuai wrote:
>>>>> bdev_disk_changed() is not handled. It calls delete_partition() and
>>>>> add_partition(), which means the blkfilter for a partition will
>>>>> be removed after a partition rescan. Am I missing something?
>>>>
>>>> Yes, when bdev_disk_changed() is called, all of the disk's block
>>>> devices are deleted and re-created. Therefore, the information
>>>> about the attached filters will be lost. This is equivalent to
>>>> removing the disk and adding it back.
>>>>
>>>> For the blksnap module, a partition rescan means the loss of the
>>>> change tracker's data. If a snapshot was created, then such
>>>> a partition rescan will cause the snapshot to be corrupted.
>>>>
>>>
>>> I haven't reviewed the blksnap code yet, but this sounds like a problem.
>>
>> I can't imagine a case where this could be a problem.
>> A partition rescan is possible only if no file system is mounted
>> on any of the disk's partitions. Otherwise, ioctl BLKRRPART will
>> return -EBUSY. Therefore, during normal operation of the system,
>> a rescan is not performed.
>> And if no file systems are mounted, it is possible that the disk
>> partition structure has changed or that the medium in a
>> removable-media device has changed. In this case, it is better to
>> detach the filter; otherwise the module may operate incorrectly.
>>
>> We can add prechange/postchange callback functions so that the
>> filter can track the rescan process. But at the moment, this is not
>> necessary for the blksnap module.
>
> So you mean that blkfilter is only used when the partition is
> mounted? (Or do you mean when the partition is opened?)
>
> Then I think you mean that a filter should only be used for a partition
> that is opened? Otherwise, the filter can be gone at any time, since
> a partition rescan can happen at any time.
>
> //user
> 1. attach filter
>         // other context rescan partition
> 2. mount fs
> // user will find that the filter is gone.

Mmm... At the moment the only user of the filter is the blksnap module;
there are no other filter users yet. The blksnap module creates
snapshots, primarily for backup purposes. Therefore, the main use case
is attaching a filter on an already running system, where all disks are
partitioned and the file systems are mounted.

If the disk is re-partitioned while the server is being serviced, then
losing the filter is normal. In this case, the change tracker will be
reset, and the filter will be attached again at the next backup.

If I were to preserve the filter across a rescan, it would be necessary
to take into account the UUID and name of the partition
(struct partition_meta_info). It is unacceptable for a change in the
partition structure to cause the filter to be attached to the wrong
partition. It would also be good to add a changed() callback so that
the filter is notified that the block device has been updated.
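
To illustrate the matching rule only, here is a rough userspace sketch.
It is not code from the patch set. struct part_meta is a stand-in
modeled on struct partition_meta_info, and struct saved_filter,
part_meta_matches() and find_saved_filter() are hypothetical names
invented for this example. The assumption is that a filter is
reattached only when both the UUID and the name are unchanged:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in modeled on the kernel's struct partition_meta_info. */
struct part_meta {
	char uuid[37];		/* 36-character UUID string plus NUL */
	char volname[64];	/* partition name */
};

/* Hypothetical record saved for each filter before the rescan. */
struct saved_filter {
	struct part_meta meta;
	void *filter;		/* opaque filter state */
};

/* Match only when the UUID is present and both UUID and name are equal. */
static bool part_meta_matches(const struct part_meta *saved,
			      const struct part_meta *found)
{
	if (saved->uuid[0] == '\0' || found->uuid[0] == '\0')
		return false;
	return strncmp(saved->uuid, found->uuid, sizeof(saved->uuid)) == 0 &&
	       strncmp(saved->volname, found->volname,
		       sizeof(saved->volname)) == 0;
}

/* Return the saved filter for a re-created partition, or NULL if none fits. */
static void *find_saved_filter(const struct saved_filter *saved, size_t n,
			       const struct part_meta *found)
{
	for (size_t i = 0; i < n; i++)
		if (part_meta_matches(&saved[i].meta, found))
			return saved[i].filter;
	return NULL;
}
```

A partition without a UUID never matches in this sketch, so it would
simply come back unfiltered, which is the safe outcome described above.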

But I'm not sure that this should be done: code that is not used in
the kernel should not be in the kernel.

>
> Thanks,
> Kuai
>
>>
>> Therefore, I will refrain from making changes for now.
>>
>>>
>>> Possible solutions I have in mind:
>>>
>>> 1. Save the blkfilter for each partition in bdev_disk_changed() before
>>> delete_partition(), and add the blkfilter back after add_partition().
>>>
>>> 2. Store the blkfilters in gendisk as an xarray, protected by
>>> 'open_mutex' like 'part_tbl'. block_device can keep a pointer to
>>> its blkfilter so that fast-path performance is OK, and the
>>> lifetime of the blkfilter can be managed separately.
>>>
>>>> There was an idea to do filtering at the disk level,
>>>> but I abandoned it.
>>>>
>>> I think it's better to do filtering at the partition level as well.
>>>
>>> Thanks,
>>> Kuai
>>>
>>
>