Re: [PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0

From: Michal Hocko
Date: Wed Oct 16 2019 - 10:10:03 EST


On Wed 16-10-19 15:55:00, David Hildenbrand wrote:
> On 16.10.19 15:45, Michal Hocko wrote:
[...]
> > There is state stored in the struct page. In other words this shouldn't
> > be really different from HWPoison pages. I cannot find the code that is
> > doing that and maybe we don't handle that. But we cannot simply online
> > a hwpoisoned page. Offlining the range will not make broken memory OK
> > all of a sudden. And your use case sounds similar to me.
>
> Sorry to say, but whenever we online memory the memmap is overwritten,
> because there is no way you could tell whether it contains garbage or not.
> You have to assume it is garbage. (my recent patch even poisons the memmap
> when offlining, which helped to find a lot of these "garbage memmap" BUGs)
>
> online_pages()
> ...
> move_pfn_range_to_zone(zone, pfn, nr_pages, NULL);
> ...
> memmap_init_zone()
> -> memmap initialized
>
> So yes, offlining memory with HWPoison and re-onlining it effectively drops
> HWPoison markers. On the next access, you will trigger a new HWPoison.

Right you are! I need to sit on this much more and think about it with a
clear head.
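
To spell out the sequence described above, here is a toy user-space model,
not kernel code. struct toy_page, TOY_PG_HWPOISON and the toy_*() helpers
are all made-up names for illustration; the 0xff fill only stands in for
the memmap poisoning on offline, and the memset to zero stands in for the
reinitialization that memmap_init_zone() does on the online path. Treat it
as a sketch of the state loss, assuming nothing else records the poisoned
pfn.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct page: a flags word only, not the kernel layout. */
struct toy_page {
	unsigned long flags;
};

/* Hypothetical bit that models the HWPoison marker in page->flags. */
#define TOY_PG_HWPOISON (1UL << 0)

/* Models memmap initialization: every entry is set up from scratch. */
void toy_memmap_init(struct toy_page *memmap, size_t nr_pages)
{
	memset(memmap, 0, nr_pages * sizeof(*memmap));
}

/* Models offlining with memmap poisoning: the old contents become garbage. */
void toy_offline(struct toy_page *memmap, size_t nr_pages)
{
	memset(memmap, 0xff, nr_pages * sizeof(*memmap));
}

/*
 * Models onlining: the memmap cannot be trusted, so it is overwritten
 * unconditionally. Any per-page state from before the offline is lost.
 */
void toy_online(struct toy_page *memmap, size_t nr_pages)
{
	toy_memmap_init(memmap, nr_pages);
}

/* True if the modelled HWPoison marker is set on this page. */
bool toy_page_hwpoison(const struct toy_page *page)
{
	return (page->flags & TOY_PG_HWPOISON) != 0;
}
```

Marking one entry with TOY_PG_HWPOISON, then calling toy_offline() followed
by toy_online() on the range, leaves toy_page_hwpoison() false for that
entry. In this model the marker does not survive the offline/online cycle.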
--
Michal Hocko
SUSE Labs