Re: [PATCH RFC 5/6] fs: xfs: iomap atomic write support

From: John Garry
Date: Wed Feb 14 2024 - 07:14:18 EST

Not sure why we care about the file position, it's br_startblock that
gets passed into the bio, not br_startoff.

We just want to ensure that the length of the write is valid w.r.t. the
offset within the extent, and br_startoff would be the offset within the
aligned extent.

Yes, I understand what br_startoff is, but this doesn't help me
understand why this code is necessary. Let's say you have a device that
supports untorn writes of 16k in length provided the LBA of the write
command is also aligned to 16k, and the fs has 4k blocks.

Userspace issues a 16k untorn write at offset 13k in the file, and gets
this mapping:

[startoff: 13k, startblock: 16k, blockcount: 16k]

Why should this IO be rejected?

It's rejected as it does not follow the rules.

The physical space extent satisfies the
alignment requirements of the underlying device, and the logical file
space extent does not need aligning at all.
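
For illustration only, the device-side condition described in the example
above could be modelled as follows. This is a simplified sketch, not code
from the patch: the helper name device_can_write_untorn() and the
UNTORN_UNIT constant are hypothetical, and the model assumes the device
accepts exactly one 16k untorn unit whose starting physical address is
16k-aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device limit taken from the example: 16 KiB untorn writes,
 * with the start address of the write also aligned to 16 KiB. */
#define UNTORN_UNIT (16u * 1024u)

/* Simplified model: true if a write of len bytes starting at physical byte
 * address phys meets the device's untorn-write constraint. */
static bool device_can_write_untorn(uint64_t phys, uint64_t len)
{
	return len == UNTORN_UNIT && (phys % UNTORN_UNIT) == 0;
}
```

Under this model, the mapping above (startblock 16k, length 16k) passes the
device check regardless of the file offset.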

Sure. In this case, we can produce a single BIO and the underlying HW
may be able to handle this atomically.

The point really is that we want a consistent userspace experience. We
say that the write 'must' be naturally aligned, not 'should' be.

It's not really useful to the user if a write sometimes passes and
sometimes fails depending on how the extents happen to be laid out.
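
As a rough sketch of the 'must be naturally aligned' rule, and not the
actual check in the patch: the function names below are hypothetical, and
the sketch assumes that "naturally aligned" means the length is a power of
two and the file position is a multiple of that length. The result depends
only on the file position and length, never on the extent layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x has exactly one bit set. */
static bool is_power_of_two(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Assumed rule: an atomic write of len bytes at file position pos is
 * accepted only if len is a power of two and pos is a multiple of len. */
static bool atomic_write_naturally_aligned(uint64_t pos, uint64_t len)
{
	if (!is_power_of_two(len))
		return false;
	return (pos & (len - 1)) == 0;
}
```

With this rule, a 16k write at file offset 13k is always rejected, while
a 16k write at offset 16k is always a candidate, whatever mapping the
filesystem returns.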

Furthermore, in this case, what should the user do if this write at 13K
offset fails because the 16K of data straddles two extents? They asked
for 16K written at offset 13K and they want it done atomically - there
is nothing which the FS can do to help. If they don't really need 16K
written atomically, then better just do a regular write, or write
individual chunks atomically.