Re: [PATCH] mm/hwpoison: fix wrong num_poisoned_pages account

From: Minchan Kim
Date: Mon Apr 04 2016 - 05:07:45 EST


On Mon, Apr 04, 2016 at 03:06:32PM +0900, Minchan Kim wrote:
> Currently, when migration is attempted for memory-failure, the
> migration code increases num_poisoned_pages for a page that failed
> to migrate as well as for one that migrated successfully. That
> makes the stat wrong.
>
> It also marks the page as PG_HWPoison even if the migration attempt
> failed, which means we cannot recover the corrupted page using the
> memory-failure facility.
>
> This patch fixes it.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Reported-by: Vlastimil Babka <vbabka@xxxxxxx>
> Acked-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>

Hello Andrew,

This patch will conflict with the current mmotm, which has
my non-lru page migration work.
It's okay to drop my non-lru page migration work from the current
mmotm in order to apply this bug fix patch, because I will try to
support the userspace-mapped driver non-lru pages that Vlastimil
pointed out in that thread.

Thanks.

> ---
> mm/migrate.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 6c822a7b27e0..f9dfb18a4eba 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -975,7 +975,13 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  		dec_zone_page_state(page, NR_ISOLATED_ANON +
>  				page_is_file_cache(page));
>  		/* Soft-offlined page shouldn't go through lru cache list */
> -		if (reason == MR_MEMORY_FAILURE) {
> +		if (reason == MR_MEMORY_FAILURE && rc == MIGRATEPAGE_SUCCESS) {
> +			/*
> +			 * With this release, we free successfully migrated
> +			 * page and set PG_HWPoison on just freed page
> +			 * intentionally. Although it's rather weird, it's how
> +			 * HWPoison flag works at the moment.
> +			 */
>  			put_page(page);
>  			if (!test_set_page_hwpoison(page))
>  				num_poisoned_pages_inc();
> --
> 1.9.1
>
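
For anyone who wants to see the accounting difference without reading
the kernel source, here is a rough userspace sketch of the condition
the hunk above changes. It is only a model under stated assumptions:
every model_* name, struct model_page and MODEL_MIGRATE_FAILED are
hypothetical stand-ins made up for illustration, the enum values are
arbitrary, and put_page() is left out entirely. Only the shape of the
if () condition is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; names mirror the patch, values are made up. */
enum migrate_reason { MR_COMPACTION, MR_MEMORY_FAILURE };
enum { MIGRATEPAGE_SUCCESS = 0, MODEL_MIGRATE_FAILED = -1 };

struct model_page {
	bool hwpoison;	/* models PG_HWPoison */
};

static long model_num_poisoned;	/* models num_poisoned_pages */

/* Set the flag and return its previous value. */
static bool model_test_set_hwpoison(struct model_page *page)
{
	bool was_set = page->hwpoison;

	page->hwpoison = true;
	return was_set;
}

/* Before the patch: only the migration reason is checked. */
static void model_finish_old(struct model_page *page,
			     enum migrate_reason reason, int rc)
{
	(void)rc;	/* result ignored, which is the bug */
	if (reason == MR_MEMORY_FAILURE && !model_test_set_hwpoison(page))
		model_num_poisoned++;
}

/* After the patch: the migration must also have succeeded. */
static void model_finish_new(struct model_page *page,
			     enum migrate_reason reason, int rc)
{
	if (reason == MR_MEMORY_FAILURE && rc == MIGRATEPAGE_SUCCESS &&
	    !model_test_set_hwpoison(page))
		model_num_poisoned++;
}

/* Run one failed and one successful soft-offline; return the counter. */
static long model_run(bool patched)
{
	struct model_page failed = { false }, migrated = { false };
	void (*finish)(struct model_page *, enum migrate_reason, int) =
		patched ? model_finish_new : model_finish_old;

	model_num_poisoned = 0;
	finish(&failed, MR_MEMORY_FAILURE, MODEL_MIGRATE_FAILED);
	finish(&migrated, MR_MEMORY_FAILURE, MIGRATEPAGE_SUCCESS);

	/* Only the unpatched path may poison a page that failed to migrate. */
	assert(failed.hwpoison == !patched);
	assert(migrated.hwpoison);
	return model_num_poisoned;
}
```

In this model, model_run(false) counts both pages, while
model_run(true) counts only the page that actually migrated and
leaves the failed one unpoisoned.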