Re: [PATCH 1/3] ext3: Fix data / filesystem corruption when write fails to copy data

From: saeed bishara
Date: Wed Dec 09 2009 - 10:42:19 EST


Hi,
I came across a data corruption bug when using ext3; this patch fixed
it. The bug exists in 2.6.31 and 2.6.32.
saeed


On Wed, Dec 2, 2009 at 9:16 PM, Jan Kara <jack@xxxxxxx> wrote:
> When ext3_write_begin fails after allocating some blocks or
> generic_perform_write fails to copy data to write, we truncate blocks already
> instantiated beyond i_size. Although these blocks were never inside i_size, we
> have to truncate pagecache of these blocks so that corresponding buffers get
> unmapped. Otherwise subsequent __block_prepare_write (called because we are
> retrying the write) will find the buffers mapped, not call ->get_block, and
> thus the page will be backed by already freed blocks leading to filesystem and
> data corruption.
>
> CC: linux-ext4@xxxxxxxxxxxxxxx
> Reported-by: James Y Knight <foom@xxxxxxxx>
> Signed-off-by: Jan Kara <jack@xxxxxxx>
> ---
>  fs/ext3/inode.c |   18 ++++++++++++++----
>  1 files changed, 14 insertions(+), 4 deletions(-)
>
> I will take care of merging this patch. I'm just sending it for completeness...
>
> diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> index 354ed3b..f9d6937 100644
> --- a/fs/ext3/inode.c
> +++ b/fs/ext3/inode.c
> @@ -1151,6 +1151,16 @@ static int do_journal_get_write_access(handle_t *handle,
>  	return ext3_journal_get_write_access(handle, bh);
>  }
>
> +/*
> + * Truncate blocks that were not used by write. We have to truncate the
> + * pagecache as well so that corresponding buffers get properly unmapped.
> + */
> +static void ext3_truncate_failed_write(struct inode *inode)
> +{
> +	truncate_inode_pages(inode->i_mapping, inode->i_size);
> +	ext3_truncate(inode);
> +}
> +
>  static int ext3_write_begin(struct file *file, struct address_space *mapping,
>  				loff_t pos, unsigned len, unsigned flags,
>  				struct page **pagep, void **fsdata)
> @@ -1209,7 +1219,7 @@ write_begin_failed:
>  		unlock_page(page);
>  		page_cache_release(page);
>  		if (pos + len > inode->i_size)
> -			ext3_truncate(inode);
> +			ext3_truncate_failed_write(inode);
>  	}
>  	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
>  		goto retry;
> @@ -1304,7 +1314,7 @@ static int ext3_ordered_write_end(struct file *file,
>  	page_cache_release(page);
>
>  	if (pos + len > inode->i_size)
> -		ext3_truncate(inode);
> +		ext3_truncate_failed_write(inode);
>  	return ret ? ret : copied;
>  }
>
> @@ -1330,7 +1340,7 @@ static int ext3_writeback_write_end(struct file *file,
>  	page_cache_release(page);
>
>  	if (pos + len > inode->i_size)
> -		ext3_truncate(inode);
> +		ext3_truncate_failed_write(inode);
>  	return ret ? ret : copied;
>  }
>
> @@ -1383,7 +1393,7 @@ static int ext3_journalled_write_end(struct file *file,
>  	page_cache_release(page);
>
>  	if (pos + len > inode->i_size)
> -		ext3_truncate(inode);
> +		ext3_truncate_failed_write(inode);
>  	return ret ? ret : copied;
>  }
>
> --
> 1.6.4.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
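
For anyone trying to follow the failure mode described in the quoted changelog, here is a minimal userspace sketch of it. It is only a toy model under stated assumptions, not kernel code: every name in it (toy_buffer, toy_get_block, toy_prepare_write, toy_truncate_blocks, toy_truncate_failed_write, toy_retry_is_consistent) is hypothetical and merely stands in for the buffer state, ->get_block, __block_prepare_write, ext3_truncate and the new helper respectively. A single buffer and a tiny allocation bitmap are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS 8

/* Toy allocation bitmap; stands in for the on-disk block bitmap. */
static bool block_in_use[NBLOCKS];

/* Toy buffer; 'mapped' models buffer_mapped(bh), 'blocknr' models b_blocknr. */
struct toy_buffer {
	bool mapped;
	int blocknr;
};

/* Models ->get_block: allocate the first free block, or -1 if none. */
static int toy_get_block(void)
{
	for (int i = 0; i < NBLOCKS; i++) {
		if (!block_in_use[i]) {
			block_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Models the prepare-write step: get_block is skipped for mapped buffers. */
static int toy_prepare_write(struct toy_buffer *bh)
{
	if (!bh->mapped) {
		int b = toy_get_block();

		if (b < 0)
			return -1;
		bh->blocknr = b;
		bh->mapped = true;
	}
	return 0;
}

/* Models truncating blocks only: the block is freed, the buffer stays mapped. */
static void toy_truncate_blocks(struct toy_buffer *bh)
{
	assert(bh->blocknr >= 0 && bh->blocknr < NBLOCKS);
	block_in_use[bh->blocknr] = false;
}

/* Models the patched helper: unmap the cached buffer, then free the block. */
static void toy_truncate_failed_write(struct toy_buffer *bh)
{
	bh->mapped = false;
	toy_truncate_blocks(bh);
}

/*
 * Run one failed write followed by a retry. Returns true if the block
 * backing the buffer after the retry is still marked allocated.
 */
static bool toy_retry_is_consistent(bool unmap_on_failure)
{
	struct toy_buffer bh = { .mapped = false, .blocknr = -1 };

	memset(block_in_use, 0, sizeof(block_in_use));

	if (toy_prepare_write(&bh) < 0)	/* first attempt allocates a block */
		return false;

	/* The data copy fails here, so the blocks beyond i_size are dropped. */
	if (unmap_on_failure)
		toy_truncate_failed_write(&bh);
	else
		toy_truncate_blocks(&bh);

	if (toy_prepare_write(&bh) < 0)	/* the retried write */
		return false;

	return block_in_use[bh.blocknr];
}
```

In this model toy_retry_is_consistent(false) returns false, because the retry finds the buffer still mapped and keeps using a block that was already freed, while toy_retry_is_consistent(true) returns true, because the unmapped buffer forces a fresh allocation. The real kernel paths involve journalling, page locking and multiple buffers per page, none of which is modelled here.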