Re: [v3 PATCH 4/5] mm: shmem: don't truncate page if memory failure happens

From: Naoya Horiguchi
Date: Fri Oct 01 2021 - 03:05:54 EST


On Thu, Sep 30, 2021 at 02:53:10PM -0700, Yang Shi wrote:
> The current behavior of memory failure is to truncate the page cache
> regardless of dirty or clean. If the page is dirty the later access
> will get the obsolete data from disk without any notification to the
> users. This may cause silent data loss. It is even worse for shmem
> since shmem is in-memory filesystem, truncating page cache means
> discarding data blocks. The later read would return all zero.
>
> The right approach is to keep the corrupted page in page cache, any
> later access would return error for syscalls or SIGBUS for page fault,
> until the file is truncated, hole punched or removed. The regular
> storage backed filesystems would be more complicated so this patch
> is focused on shmem. This also unblock the support for soft
> offlining shmem THP.
>
> Signed-off-by: Yang Shi <shy828301@xxxxxxxxx>
> ---
...
> @@ -894,6 +896,12 @@ static int me_pagecache_clean(struct page_state *ps, struct page *p)
> goto out;
> }
>
> + /*
> + * The shmem page is kept in page cache instead of truncating
> + * so need decrement the refcount from page cache.
> + */

This comment seems to me confusing because no refcount is decremented here.
What the variable dec tries to do is to give the expected value of the
refcount of the error page after successfull erorr handling, which differs
according to the page state before error handling, so dec adjusts it.

How about the below?

+ /*
+ * The shmem page is kept in page cache instead of truncating
+ * so is expected to have an extra refcount after error-handling.
+ */

> + dec = shmem_mapping(mapping);
> +
> /*
> * Truncation is a bit tricky. Enable it per file system for now.
> *
...
> @@ -2466,7 +2467,17 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
> return -EPERM;
> }
>
> - return shmem_getpage(inode, index, pagep, SGP_WRITE);
> + ret = shmem_getpage(inode, index, pagep, SGP_WRITE);
> +
> + if (*pagep) {
> + if (PageHWPoison(*pagep)) {

Unless you plan to add some code in the near future, how about merging
these two if sentences?

if (*pagep && PageHWPoison(*pagep)) {

Thanks,
Naoya Horiguchi

> + unlock_page(*pagep);
> + put_page(*pagep);
> + ret = -EIO;
> + }
> + }
> +
> + return ret;
> }
>
> static int