Re: [PATCH v1] mm, pagemap: expose hwpoison entry

From: Naoya Horiguchi
Date: Mon Oct 04 2021 - 10:32:40 EST


On Mon, Oct 04, 2021 at 01:55:30PM +0200, David Hildenbrand wrote:
> On 04.10.21 13:50, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> >
> > A hwpoison entry is a non-present page table entry used to report
> > memory error events to userspace. An easy way to find out which
> > processes have hwpoison entries would help user processes take
> > proper action, but no such interface exists today.
> > So make the pagemap interface expose hwpoison entries to userspace.
>
> Note that this is only a way to inspect hwpoison state for private anonymous
> memory. You cannot really identify anything related to shared memory.
>
> Do you also handle private hugetlb pages?

I think yes. As long as hugepages are mmap()ed, we should be able to
identify them by their hwpoison entries, whether the mapping is private
or shared.

>
> >
> > Hwpoison entries for hugepages are also exposed by this patch. The
> > example below shows what pagemap reports when a memory error has
> > hit a hugepage mapped into a process.
> >
> > $ ./page-types --no-summary --pid $PID --raw --list --addr 0x700000000+0x400
> > voffset offset len flags
> > 700000000 12fa00 1 ___U_______Ma__H_G_________________f_______1
> > 700000001 12fa01 1ff ___________Ma___TG_________________f_______1
> > 700000200 12f800 1 __________B________X_______________f______w_
> > 700000201 12f801 1 ___________________X_______________f______w_ // memory failure hit this page
> > 700000202 12f802 1fe __________B________X_______________f______w_
> >
> > Entries with both the "X" flag (hwpoison flag) and the "w" flag (swap
> > flag) are considered hwpoison entries. So all pages in the 2MB range
> > are inaccessible from the process. We can get the actual error location
> > by running page-types in physical address mode.
> >
> > $ ./page-types --no-summary --addr 0x12f800+0x200 --raw --list
> > offset len flags
> > 12f800 1 __________B_________________________________
> > 12f801 1 ___________________X________________________
> > 12f802 1fe __________B_________________________________
> >
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> > ---
> > fs/proc/task_mmu.c | 41 ++++++++++++++++++++++++++++++++---------
> > include/linux/swapops.h | 13 +++++++++++++
> > tools/vm/page-types.c | 7 ++++++-
> > 3 files changed, 51 insertions(+), 10 deletions(-)
>
>
> Please also update the documentation located at
>
> Documentation/admin-guide/mm/pagemap.rst

I will do this in the next post.

Thanks,
Naoya Horiguchi