Re: [PATCH v3] mm: fix panic in __alloc_pages

From: Michal Hocko
Date: Tue Dec 14 2021 - 03:38:37 EST


On Mon 13-12-21 16:07:18, David Hildenbrand wrote:
> On 13.12.21 16:06, Michal Hocko wrote:
> > On Thu 09-12-21 11:48:42, Michal Hocko wrote:
> > [...]
> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> >> index 852041f6be41..2d38a431f62f 100644
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -1161,19 +1161,21 @@ static void reset_node_present_pages(pg_data_t *pgdat)
> >> }
> >>
> >> /* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
> >> -static pg_data_t __ref *hotadd_new_pgdat(int nid)
> >> +static pg_data_t __ref *hotadd_init_pgdat(int nid)
> >> {
> >> struct pglist_data *pgdat;
> >>
> >> pgdat = NODE_DATA(nid);
> >> - if (!pgdat) {
> >> - pgdat = arch_alloc_nodedata(nid);
> >> - if (!pgdat)
> >> - return NULL;
> >>
> >> + /*
> >> + * NODE_DATA is preallocated (free_area_init) but its internal
> >> + * state is not allocated completely. Add missing pieces.
> >> + * Completely offline nodes stay around and they just need
> >> + * reinitialization.
> >> + */
> >> + if (!pgdat->per_cpu_nodestats) {
> >> pgdat->per_cpu_nodestats =
> >> alloc_percpu(struct per_cpu_nodestat);
> >> - arch_refresh_nodedata(nid, pgdat);
> >
> > This should really be
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 42211485bcf3..2daa88ce8c80 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1173,7 +1173,7 @@ static pg_data_t __ref *hotadd_init_pgdat(int nid)
> > * Completely offline nodes stay around and they just need
> > * reinitialization.
> > */
> > - if (!pgdat->per_cpu_nodestats) {
> > + if (pgdat->per_cpu_nodestats == &boot_nodestats) {
> > pgdat->per_cpu_nodestats =
> > alloc_percpu(struct per_cpu_nodestat);
> > } else {
> >
>
> I'll try giving this some churn later this week -- busy with other stuff.

Please hang on, this still needs to be done slightly differently. I will
post something closer to a final patch later today. For testing
purposes, the change above should be sufficient for now.
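
To illustrate why the check changes from a NULL test to a comparison
against &boot_nodestats, here is a minimal userspace sketch. It is not
kernel code. It assumes, as the corrected hunk implies, that early init
leaves the pointer at a static boot-time placeholder rather than NULL.
All names (node_data, boot_stats, early_init, hotadd_init) are
hypothetical stand-ins, and the reset in the else branch is an assumption
based only on the "just need reinitialization" comment.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for per_cpu_nodestat and pglist_data. */
struct nodestat { long counters[4]; };
struct node_data { struct nodestat *stats; };

/* Stand-in for boot_nodestats: a static placeholder shared at boot. */
static struct nodestat boot_stats;

/* Assumed early init: the pointer is set to the placeholder, never NULL. */
static void early_init(struct node_data *nd)
{
	nd->stats = &boot_stats;
}

/* The original check: never true once early_init() has run. */
static int needs_alloc_null_check(const struct node_data *nd)
{
	return nd->stats == NULL;
}

/* The corrected check: true until a real allocation replaces the placeholder. */
static int needs_alloc_sentinel_check(const struct node_data *nd)
{
	return nd->stats == &boot_stats;
}

/* Sketch of the hot-add path: allocate on first use, otherwise reset. */
static int hotadd_init(struct node_data *nd)
{
	if (needs_alloc_sentinel_check(nd)) {
		struct nodestat *s = calloc(1, sizeof(*s));

		if (!s)
			return -1;
		nd->stats = s;
	} else {
		/* Assumption: a previously used node only needs its state cleared. */
		memset(nd->stats, 0, sizeof(*nd->stats));
	}
	return 0;
}
```

With the NULL check, a node that has only been through early_init() would
skip the allocation and keep using the shared placeholder. The sentinel
comparison detects that case and allocates private storage on the first
hot-add.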
--
Michal Hocko
SUSE Labs