Re: [PATCH 3/6] f2fs: do not expose unwritten blocks to user by DIO

From: Jaegeuk Kim
Date: Fri Jan 07 2022 - 20:52:56 EST


On 01/07, Eric Biggers wrote:
> Hi Jaegeuk,
>
> On Tue, Jan 04, 2022 at 01:24:16PM -0800, Jaegeuk Kim wrote:
> > DIO preallocates physical blocks before writing data, but if an error occurrs
> > or power-cut happens, we can see block contents from the disk. This patch tries
> > to fix it by 1) turning to buffered writes for DIO into holes, 2) truncating
> > unwritten blocks from error or power-cut.
> >
> > Signed-off-by: Jaegeuk Kim <jaegeuk@xxxxxxxxxx>
> > ---
> > fs/f2fs/data.c | 5 ++++-
> > fs/f2fs/f2fs.h | 5 +++++
> > fs/f2fs/file.c | 27 ++++++++++++++++++---------
> > fs/f2fs/inode.c | 8 ++++++++
> > 4 files changed, 35 insertions(+), 10 deletions(-)
>
> Unfortunately, this patch doesn't completely fix the uninitialized data
> exposure. The problem is that it only makes DIO writes fall back to buffered
> writes for holes, and not for reserved blocks (NEW_ADDR). f2fs's reserved
> blocks are *not* the same as the unwritten extents that other filesystems have;
> f2fs's reserved blocks have to be turned into regular blocks before DIO can
> write to them. That immediately exposes them to concurrent reads (at least
> buffered reads, but I think DIO reads too).

Isn't it resolved by i_size which gives the written blocks only?

>
> This can be reproduced using the aiodio_sparse program from LTP, as follows:
>
> aiodio_sparse -i 4 -a 8k -w 1024k -s 4096k -n 6
>
> This was reported at
> https://lore.kernel.org/r/20211226132851.GC34518@xsang-OptiPlex-9020 as a
> regression from the commit "f2fs: use iomap for direct I/O", but it actually
> failed before too. Note that it's nondeterministic; writing random data to the
> block device before creating the filesystem helps with reproduction.
>
> I see only three possible solutions:
>
> * Make DIO writes to reserved blocks fall back to buffered writes, just like
> writes to holes. This would mean that a file would have to be written to
> before direct writes would work; fallocate() wouldn't be enough. Note that my
> patch https://lore.kernel.org/r/20210728015154.171507-1-ebiggers@xxxxxxxxxx
> would have done this.
>
> * Or, change the f2fs on-disk format to support unwritten extents.
>
> * Or, split up block allocation into two parts, so that blocks could be
> preliminarily allocated and not exposed to reads yet. This would be like
> Ted's suggestion here: https://lore.kernel.org/r/YQS5eBljtztWwOFE@xxxxxxx
>
> - Eric