Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree

From: Michal Hocko
Date: Tue Nov 09 2021 - 06:00:58 EST


On Tue 09-11-21 09:42:56, David Hildenbrand wrote:
> On 09.11.21 09:37, Michal Hocko wrote:
> > I have opposed this patch http://lkml.kernel.org/r/YYj91Mkt4m8ySIWt@xxxxxxxxxxxxxx
> > There was no response to that feedback. I will not go as far as to nack
> > it explicitly because pcp allocator is not an area I would nack patches
> > but seriously, this issue needs a deeper look rather than a paper over
> > patch. I hope we do not want to do a similar thing to all callers of
> > cpu_to_mem.
>
> While we could move it into the !HOLES version of cpu_to_mem(), calling
> cpu_to_mem() on an offline (and eventually not even present) CPU (with
> an offline node) is really a corner case.
>
> Instead of additional runtime overhead for all cpu_to_mem(), my take
> would be to just do it for the random special cases. Sure, we can
> document that people should be careful when calling cpu_to_mem() on
> offline CPUs. But IMHO it's really a corner case.

I suspect I haven't made myself clear enough. I do not think we should
be touching cpu_to_mem/cpu_to_node to handle this corner case at all. We
should be looking at the underlying problem instead. We cannot really
rely on a CPU being online for it to have a proper node association. We
should really look at the initialization code and handle this situation
properly there. Memoryless nodes are something we have been dealing with
already. This particular instance of the problem is new and we should
understand why.
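
To make the distinction concrete, here is a toy userspace model, not
kernel code. Every name in it (toy_cpu_to_mem, init_online_only,
init_all_possible, the fixed 4-CPU/2-node topology) is hypothetical, and
setting up the mapping for all possible CPUs is only one possible reading
of "handle this in the initialization code", not a proposed fix. It
shows that the lookup can stay a plain array read with no extra checks
once initialization does not depend on the CPU being online.

```c
#include <assert.h>

#define NR_CPUS  4
#define NR_NODES 2
#define NO_NODE  (-1)

/* Assumed toy topology: CPUs 0-1 sit on node 0, CPUs 2-3 on node 1. */
static const int cpu_node[NR_CPUS] = {0, 0, 1, 1};
/* Node 0 has memory, node 1 is memoryless. */
static const int node_has_memory[NR_NODES] = {1, 0};
/* Per-CPU "nearest node with memory", filled in by an init function. */
static int cpu_mem_node[NR_CPUS];

/* Toy fallback: the node itself if it has memory, else the first node
 * that does. A real system would pick by distance. */
static int nearest_memory_node(int node)
{
	if (node_has_memory[node])
		return node;
	for (int n = 0; n < NR_NODES; n++)
		if (node_has_memory[n])
			return n;
	return NO_NODE;
}

/* Toy lookup: a plain array read, no per-call online/offline checks. */
static int toy_cpu_to_mem(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu_mem_node[cpu];
}

/* Problematic init: only CPUs that are online get a usable mapping. */
static void init_online_only(const int *online)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_mem_node[cpu] = online[cpu] ?
			nearest_memory_node(cpu_node[cpu]) : NO_NODE;
}

/* Init that does not depend on online state: every possible CPU gets
 * a node with memory, including CPUs on a memoryless node. */
static void init_all_possible(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_mem_node[cpu] = nearest_memory_node(cpu_node[cpu]);
}
```

With CPUs 2 and 3 offline, init_online_only() leaves toy_cpu_to_mem(3)
returning NO_NODE, so every caller would need its own guard. After
init_all_possible() the same call returns node 0 and no caller has to
special-case offline CPUs.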
--
Michal Hocko
SUSE Labs