Re: [PATCH v3] f2fs: add cur_reserved_blocks to support soft block reservation

From: Chao Yu
Date: Wed Oct 25 2017 - 23:27:11 EST


On 2017/10/26 11:07, Yunlong Song wrote:
> Yes, I agree with the soft semantic you introduce, it is too slow to
> increase cur_reserved_blocks only in
> dec_valid_block(,node)_count, e.g. if users want to set
> cur_reserved_blocks to 10G.

Yup,

>
> Then how about fix the initialization of cur_reserved_blocks in
> fs/f2fs/super.c as following:
> sbi->current_reserved_blocks = 0
> change to
> sbi->current_reserved_blocks = min(sbi->reserved_blocks,
> sbi->user_block_count - valid_user_blocks(sbi));

It will be necessary only if reserved_blocks is initialized with a non-zero value,
but now it has value of zero, so it's redundant...

What about adding this check if we initialize reserved_blocks with a non-zero default
value or value which may be configured with mount_option?

Thanks,

>
> On 2017/10/25 23:46, Chao Yu wrote:
>> On 2017/10/25 22:06, Yunlong Song wrote:
>>> Hi, Chao,
>>> Please see my comments below.
>>>
>>> On 2017/10/25 20:26, Chao Yu wrote:
>>>> On 2017/10/25 18:02, Yunlong Song wrote:
>>>>> ping...
>>>> I've replied in this thread, check your email list please, or you can check the
>>>> comments in below link:
>>>>
>>>> https://patchwork.kernel.org/patch/9909407/
>>>>
>>>> Anyway, see comments below.
>>>>
>>>>> On 2017/8/18 23:09, Yunlong Song wrote:
>>>>>> This patch adds cur_reserved_blocks to extend reserved_blocks sysfs
>>>>>> interface to be soft threshold, which allows user configure it exceeding
>>>>>> current available user space. To ensure there is enough space for
>>>>>> supporting system's activation, this patch does not set the reserved space
>>>>>> to the configured reserved_blocks value at once, instead, it safely
>>>>>> increase cur_reserved_blocks in dev_valid_block(,node)_count to only take
>>>>>> up the blocks which are just obsoleted.
>>>>>>
>>>>>> Signed-off-by: Yunlong Song <yunlong.song@xxxxxxxxxx>
>>>>>> Signed-off-by: Chao Yu <yuchao0@xxxxxxxxxx>
>>>>>> ---
>>>>>> Documentation/ABI/testing/sysfs-fs-f2fs | 3 ++-
>>>>>> fs/f2fs/f2fs.h | 13 +++++++++++--
>>>>>> fs/f2fs/super.c | 3 ++-
>>>>>> fs/f2fs/sysfs.c | 15 +++++++++++++--
>>>>>> 4 files changed, 28 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
>>>>>> index 11b7f4e..ba282ca 100644
>>>>>> --- a/Documentation/ABI/testing/sysfs-fs-f2fs
>>>>>> +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
>>>>>> @@ -138,7 +138,8 @@ What: /sys/fs/f2fs/<disk>/reserved_blocks
>>>>>> Date: June 2017
>>>>>> Contact: "Chao Yu" <yuchao0@xxxxxxxxxx>
>>>>>> Description:
>>>>>> - Controls current reserved blocks in system.
>>>>>> + Controls current reserved blocks in system, the threshold
>>>>>> + is soft, it could exceed current available user space.
>>>>>> What: /sys/fs/f2fs/<disk>/gc_urgent
>>>>>> Date: August 2017
>>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>>> index 2f20b6b..84ccbdc 100644
>>>>>> --- a/fs/f2fs/f2fs.h
>>>>>> +++ b/fs/f2fs/f2fs.h
>>>>>> @@ -1041,6 +1041,7 @@ struct f2fs_sb_info {
>>>>>> block_t discard_blks; /* discard command candidats */
>>>>>> block_t last_valid_block_count; /* for recovery */
>>>>>> block_t reserved_blocks; /* configurable reserved blocks */
>>>>>> + block_t cur_reserved_blocks; /* current reserved blocks */
>>>>>> u32 s_next_generation; /* for NFS support */
>>>>>> @@ -1515,7 +1516,8 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi,
>>>>>> spin_lock(&sbi->stat_lock);
>>>>>> sbi->total_valid_block_count += (block_t)(*count);
>>>>>> - avail_user_block_count = sbi->user_block_count - sbi->reserved_blocks;
>>>>>> + avail_user_block_count = sbi->user_block_count -
>>>>>> + sbi->cur_reserved_blocks;
>>>>>> if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) {
>>>>>> diff = sbi->total_valid_block_count - avail_user_block_count;
>>>>>> *count -= diff;
>>>>>> @@ -1549,6 +1551,10 @@ static inline void dec_valid_block_count(struct f2fs_sb_info *sbi,
>>>>>> f2fs_bug_on(sbi, sbi->total_valid_block_count < (block_t) count);
>>>>>> f2fs_bug_on(sbi, inode->i_blocks < sectors);
>>>>>> sbi->total_valid_block_count -= (block_t)count;
>>>>>> + if (sbi->reserved_blocks &&
>>>>>> + sbi->reserved_blocks != sbi->cur_reserved_blocks)
>>>> It's redundent check here...
>>> I think in most cases, cur_reserved_blocks is equal to reserved_blocks, so we do not need to calculate min any more, otherwise,
>>> if reserved_blocks is not 0, it will calculate min and set current_reserved_blocks each time.
>> OK, IMO, in some condition, we can save dirtying cache line to decrease cache
>> line missing with that check.
>>
>>>>>> + sbi->cur_reserved_blocks = min(sbi->reserved_blocks,
>>>>>> + sbi->cur_reserved_blocks + count);
>>>>>> spin_unlock(&sbi->stat_lock);
>>>>>> f2fs_i_blocks_write(inode, count, false, true);
>>>>>> }
>>>>>> @@ -1695,7 +1701,7 @@ static inline int inc_valid_node_count(struct f2fs_sb_info *sbi,
>>>>>> spin_lock(&sbi->stat_lock);
>>>>>> valid_block_count = sbi->total_valid_block_count + 1;
>>>>>> - if (unlikely(valid_block_count + sbi->reserved_blocks >
>>>>>> + if (unlikely(valid_block_count + sbi->cur_reserved_blocks >
>>>>>> sbi->user_block_count)) {
>>>>>> spin_unlock(&sbi->stat_lock);
>>>>>> goto enospc;
>>>>>> @@ -1738,6 +1744,9 @@ static inline void dec_valid_node_count(struct f2fs_sb_info *sbi,
>>>>>> sbi->total_valid_node_count--;
>>>>>> sbi->total_valid_block_count--;
>>>>>> + if (sbi->reserved_blocks &&
>>>>>> + sbi->reserved_blocks != sbi->cur_reserved_blocks)
>>>> Checking low boundary is more safe here.
>>> I think cur_reserved_blocks can never be larger than reserved_blocks in any case. so min(reserved_blocks,
>>> cur_reserved_blocks +1) is same to cur_reserved_blocks++ when reserved_blocks != cur_reserved_blocks
>>> (which means reserved_blocks > cur_reserved_block )
>> Ditto.
>>
>>>>>> + sbi->cur_reserved_blocks++;
>>>>>> spin_unlock(&sbi->stat_lock);
>>>>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>>>> index 4c1bdcb..16a805f 100644
>>>>>> --- a/fs/f2fs/super.c
>>>>>> +++ b/fs/f2fs/super.c
>>>>>> @@ -957,7 +957,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>>>> buf->f_blocks = total_count - start_count;
>>>>>> buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
>>>>>> buf->f_bavail = user_block_count - valid_user_blocks(sbi) -
>>>>>> - sbi->reserved_blocks;
>>>>>> + sbi->cur_reserved_blocks;
>>>>>> avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
>>>>>> @@ -2411,6 +2411,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>>>> le64_to_cpu(sbi->ckpt->valid_block_count);
>>>>>> sbi->last_valid_block_count = sbi->total_valid_block_count;
>>>>>> sbi->reserved_blocks = 0;
>>>>>> + sbi->cur_reserved_blocks = 0;
>>>>>> for (i = 0; i < NR_INODE_TYPE; i++) {
>>>>>> INIT_LIST_HEAD(&sbi->inode_list[i]);
>>>>>> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
>>>>>> index a1be5ac..75c37bb 100644
>>>>>> --- a/fs/f2fs/sysfs.c
>>>>>> +++ b/fs/f2fs/sysfs.c
>>>>>> @@ -104,12 +104,22 @@ static ssize_t features_show(struct f2fs_attr *a,
>>>>>> return len;
>>>>>> }
>>>>>> +static ssize_t f2fs_reserved_blocks_show(struct f2fs_attr *a,
>>>>>> + struct f2fs_sb_info *sbi, char *buf)
>>>>>> +{
>>>>>> + return snprintf(buf, PAGE_SIZE, "expected: %u\ncurrent: %u\n",
>>>>>> + sbi->reserved_blocks, sbi->cur_reserved_blocks);
>>>>>> +}
>>>>>> +
>>>>>> static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
>>>>>> struct f2fs_sb_info *sbi, char *buf)
>>>>>> {
>>>>>> unsigned char *ptr = NULL;
>>>>>> unsigned int *ui;
>>>>>> + if (a->struct_type == RESERVED_BLOCKS)
>>>>>> + return f2fs_reserved_blocks_show(a, sbi, buf);
>>>>>> +
>>>>>> ptr = __struct_ptr(sbi, a->struct_type);
>>>>>> if (!ptr)
>>>>>> return -EINVAL;
>>>>>> @@ -143,12 +153,13 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
>>>>>> #endif
>>>>>> if (a->struct_type == RESERVED_BLOCKS) {
>>>>>> spin_lock(&sbi->stat_lock);
>>>>>> - if ((unsigned long)sbi->total_valid_block_count + t >
>>>>>> - (unsigned long)sbi->user_block_count) {
>>>>>> + if (t > (unsigned long)sbi->user_block_count) {
>>>>>> spin_unlock(&sbi->stat_lock);
>>>>>> return -EINVAL;
>>>>>> }
>>>>>> *ui = t;
>>>>>> + if (t < (unsigned long)sbi->cur_reserved_blocks)
>>>>>> + sbi->cur_reserved_blocks = t;
>>>> No, for 't < cur_reserved_blocks' case, cur_reserved_blocks will out of update
>>>> even if there is enough free space. You know, for soft block resevation, we need
>>>> to reserve blocks as many as possible, making free space being zero suddenly is
>>>> possible.
>>> I do not understand why it is not safe to decrease cur_reserved_blocks, for example,
>>> if current cur_reserved_blocks is 100, now decrease it to 80, is there any problem?
>>> If 80 will make free space zero, how does 100 exist?
>>> And I do not think it is safe as following:
>>> *ui = t;
>>> + sbi->current_reserved_blocks = min(sbi->reserved_blocks,
>>> + sbi->user_block_count - valid_user_blocks(sbi));
>>>
>>> If user_block_count = 200, valid_user_blocks=150ï reserved_blocks = 100,
>>> then current_reserved_block = min(100,200-50) = 50, in this case, free space
>>> is suddenly becoming zero.
>> Free space becomes zero suddenly is safe, as I said before, I don't expect this
>> feature can be used in android, instead, it may be used in distributed storage
>> scenario, in where, once we configure soft_reserved_block making one server out
>> of free space, it's not critical issue to this system since we can move current
>> copy to another server which has enough free space.
>>
>> Secondly, as an global configuration, it's due to usage of administrator with
>> it, if there is critical application which is sensitive with free space,
>> administrator should make sure our reservation should not overload consuming free
>> space, which means soft reservation is not suitable.
>>
>>> To avoid this, I change the code to:
>>>
>>> + if (t < (unsigned long)sbi->cur_reserved_blocks)
>>> + sbi->cur_reserved_blocks = t;
>>>
>>> I think it is only safe to decrease the value of cur_reserved_blocks, and leave increase operation to
>>> dec_valid_block(,node)_count, it is safe to increase cur_reserved_blocks there.
>> For initialization of reserved_blocks, cur_reserved_blocks will always be zero
>> due to this check, and will be updated to close to reserved_blocks after block
>> allocation and deallocation of user, IMO, it's not looks reasonable to user.
>>
>> Anyway, it's due to how you define semantics of soft reservation, so what is your
>> understanding of it?
>>
>> Thanks,
>>
>>>> Thanks,
>>>>
>>>>>> spin_unlock(&sbi->stat_lock);
>>>>>> return count;
>>>>>> }
>>>> .
>>>>
>> .
>>
>