RE: [PATCH v6 1/2] mm,hwpoison: fix race with hugetlb page allocation

From: Luck, Tony
Date: Fri Aug 13 2021 - 11:07:24 EST


I'm running the default case from my einj_mem_uc test. That just:

1) allocates a page using:

mmap(NULL, pagesize, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANON, -1, 0);

2) fills the page with random data (to make sure it has been allocated, and that the kernel can't
do KSM tricks to share this physical page with some other user).

3) injects the error at a 1KB offset within the page.

4) does a memory read of the poison address.
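
For readers without the test source to hand, the user-space side of steps 1, 2 and 4 can be sketched roughly as below. This is a minimal illustration, not the einj_mem_uc code: the helper names are hypothetical, rand() stands in for whatever random source the real test uses, and the error injection in step 3 is left out entirely because it is platform specific.

```c
#define _GNU_SOURCE             /* make sure MAP_ANON is visible */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Step 1: map one shared anonymous page (hypothetical helper name). */
unsigned char *alloc_test_page(size_t pagesize)
{
    void *p = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANON, -1, 0);

    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/*
 * Step 2: write random bytes so the page is actually backed by
 * physical memory and its contents are unlikely to match any other page.
 */
void fill_random(unsigned char *page, size_t pagesize)
{
    for (size_t i = 0; i < pagesize; i++)
        page[i] = (unsigned char)rand();
}

/*
 * Step 4: read one byte at the given offset within the page.
 * Step 3 (injecting the error at that address) is NOT done here.
 * The volatile qualifier keeps the compiler from dropping the load.
 */
unsigned char read_at_offset(const volatile unsigned char *page,
                             size_t pagesize, size_t offset)
{
    assert(offset < pagesize);
    return page[offset];
}
```

A caller would take pagesize from sysconf(_SC_PAGESIZE), call alloc_test_page() and fill_random(), and then read at offset 1024 once the error has been injected by other means.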


>      action_result(pfn, MF_MSG_UNKNOWN, MF_IGNORED);
> +    dump_page(p, "hwpoison unknown page");
>      res = -EBUSY;
>      goto unlock_mutex;
>  }

I added that patch against upstream (v5.14-rc5). Here's the dump. The "pfn" matches the physical address where I injected,
and it has the hwpoison flag bit that was set early in memory_failure() --- so this is the right page.

[ 79.368212] Memory failure: 0x623889: recovery action for unknown page: Ignored
[ 79.375525] page:0000000065ad9479 refcount:3 mapcount:1 mapping:00000000a4ac843b index:0x0 pfn:0x623889
[ 79.384909] memcg:ff40a569f2966000
[ 79.388313] aops:shmem_aops ino:4c00 dentry name:"dev/zero"
[ 79.393896] flags: 0x17ffffc088000c(uptodate|dirty|swapbacked|hwpoison|node=0|zone=2|lastcpupid=0x1fffff)
[ 79.403455] raw: 0017ffffc088000c 0000000000000000 dead000000000122 ff40a569f45a7160
[ 79.411191] raw: 0000000000000000 0000000000000000 0000000300000000 ff40a569f2966000
[ 79.418931] page dumped because: hwpoison unknown page
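
The pfn comparison mentioned above is just a shift by the page size. A small sketch of that arithmetic, assuming 4KB base pages (a page shift of 12, which this mail does not state); the macro and helper names are illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4KB base pages, so the page shift is 12. */
#define ASSUMED_PAGE_SHIFT 12

/* Physical address -> page frame number. */
uint64_t phys_to_pfn(uint64_t phys)
{
    return phys >> ASSUMED_PAGE_SHIFT;
}

/* Page frame number -> physical address of the start of that page. */
uint64_t pfn_to_phys(uint64_t pfn)
{
    return pfn << ASSUMED_PAGE_SHIFT;
}
```

Under that assumption, pfn 0x623889 covers physical addresses 0x623889000 through 0x623889fff, so an injection at a 1KB offset into the page would land at 0x623889400. That address is derived here from the pfn in the dump and is not taken from the log.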


-Tony