Re: [PATCH 3/4] iomap: don't increase i_size if it's not a write operation

From: Zhang Yi
Date: Wed Mar 13 2024 - 03:09:34 EST


On 2024/3/13 0:24, Darrick J. Wong wrote:
> On Tue, Mar 12, 2024 at 08:59:15PM +0800, Zhang Yi wrote:
>> On 2024/3/11 23:48, Darrick J. Wong wrote:
>>> On Mon, Mar 11, 2024 at 08:22:54PM +0800, Zhang Yi wrote:
>>>> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
>>>>
[...]
>>>> @@ -927,6 +908,24 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
>>>> copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
>>>> status = iomap_write_end(iter, pos, bytes, copied, folio);
>>>>
>>>> + /*
>>>> + * Update the in-memory inode size after copying the data into
>>>> + * the page cache. It's up to the file system to write the
>>>> + * updated size to disk, preferably after I/O completion so that
>>>> + * no stale data is exposed.
>>>> + */
>>>> + old_size = iter->inode->i_size;
>>>> + if (pos + status > old_size) {
>>>> + i_size_write(iter->inode, pos + status);
>>>> + iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
>>>> + }
>>>> + __iomap_put_folio(iter, pos, status, folio);
>>>
>>> Why is it necessary to hoist the __iomap_put_folio calls from
>>> iomap_write_end into iomap_write_iter, iomap_unshare_iter, and
>>> iomap_zero_iter? None of those functions seem to use it, and it makes
>>> more sense to me that iomap_write_end releases the folio that
>>> iomap_write_begin returned.
>>>
>>
>> Because we have to update i_size before __iomap_put_folio() in
>> iomap_write_iter(). If not, once we unlock folio, it could be raced
>> by the backgroud write back which could start writing back and call
>> folio_zero_segment() (please see iomap_writepage_handle_eof()) to
>> zero out the valid data beyond the not updated i_size. So we
>> have to move out __iomap_put_folio() out together with the i_size
>> updating.
>
> Ahah. Please make a note of that in the comment for dunces like me.
>
> /*
> * Update the in-memory inode size after copying the data into
> * the page cache. It's up to the file system to write the
> * updated size to disk, preferably after I/O completion so that
> * no stale data is exposed. Only once that's done can we
> * unlock and release the folio.
> */
>

Sure.