Re: [PATCH] mm: don't warn if the node is offlined

From: Yang Shi
Date: Wed Nov 02 2022 - 14:18:51 EST


On Wed, Nov 2, 2022 at 10:47 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Wed 02-11-22 10:36:07, Yang Shi wrote:
> > On Wed, Nov 2, 2022 at 9:15 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Wed 02-11-22 09:03:57, Yang Shi wrote:
> > > > On Wed, Nov 2, 2022 at 12:39 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > > >
> > > > > On Tue 01-11-22 12:13:35, Zach O'Keefe wrote:
> > > > > [...]
> > > > > > This is slightly tangential - but I don't want to send a new mail
> > > > > > about it -- but I wonder if we should be doing __GFP_THISNODE +
> > > > > > explicit node vs having hpage_collapse_find_target_node() set a
> > > > > > nodemask. We could then provide fallback nodes for ties, or if some
> > > > > > node contained > some threshold number of pages.
> > > > >
> > > > > I would simply go with something like this (not even compile tested):
> > > >
> > > > Thanks, Michal. It is definitely an option. As I discussed with Zach, I'm
> > > > not sure whether it is worth making the code more complicated for such
> > > > a micro-optimization. Removing __GFP_THISNODE or even removing
> > > > the node balance code should be fine too IMHO. TBH I doubt there would
> > > > be any noticeable difference.
> > >
> > > I do agree that an explicit (quasi) round robin over nodes sounds
> > > over-engineered. It makes some sense to try to target the prevalent node
> > > though, because this code can be executed from khugepaged and would
> > > therefore allocate with a completely different affinity than the original fault.
> >
> > Yeah, the corner case comes from the node balance code, which just tries
> > to balance between multiple prevalent nodes. So you agree to remove it,
> > IIRC?
>
> Yeah, let's just collect all good nodes into a nodemask and keep
> __GFP_THISNODE in place. You can consider having the nodemask per collapse_control
> so that you allocate it only once in the struct lifetime.

Actually my intention is more aggressive: just remove the node balance code
entirely.
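
To make the two options concrete, here is a rough userspace sketch (not
kernel code, not compile tested against any tree). All names are
hypothetical: node_load stands in for a per-node page count array,
SKETCH_MAX_NODES is an arbitrary limit, and a plain 64-bit integer stands
in for a nodemask. The first function is the "no balance code" variant,
which returns the first node with the highest count. The second is the
nodemask variant, which collects every node tied for the highest count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_NODES 64	/* hypothetical limit, not MAX_NUMNODES */

/* Balance code removed: return the first node holding the most pages. */
static int sketch_find_target_node(const unsigned int *node_load,
				   size_t nr_nodes)
{
	unsigned int max_value = 0;
	int target = 0;

	assert(nr_nodes > 0 && nr_nodes <= SKETCH_MAX_NODES);

	for (size_t nid = 0; nid < nr_nodes; nid++) {
		/* Strict '>' means ties always resolve to the lowest node id. */
		if (node_load[nid] > max_value) {
			max_value = node_load[nid];
			target = (int)nid;
		}
	}
	return target;
}

/* Nodemask variant: set one bit per node tied for the highest count. */
static uint64_t sketch_collect_target_mask(const unsigned int *node_load,
					   size_t nr_nodes)
{
	unsigned int max_value = 0;
	uint64_t mask = 0;

	assert(nr_nodes > 0 && nr_nodes <= SKETCH_MAX_NODES);

	/* First pass: find the highest per-node count. */
	for (size_t nid = 0; nid < nr_nodes; nid++)
		if (node_load[nid] > max_value)
			max_value = node_load[nid];

	/* Second pass: every node matching that count is a candidate. */
	for (size_t nid = 0; nid < nr_nodes; nid++)
		if (node_load[nid] == max_value)
			mask |= UINT64_C(1) << nid;

	return mask;
}
```

With counts of {3, 7, 7, 1}, the first variant picks node 1, while the
second marks nodes 1 and 2 as candidates. Note that in this sketch an
all-zero array makes every node a candidate in the mask variant.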

>
> And as mentioned in the other reply, it would be really nice to hide this
> under CONFIG_NUMA (in a standalone follow-up, of course).

The hpage_collapse_find_target_node() function itself is defined under
CONFIG_NUMA.

>
> --
> Michal Hocko
> SUSE Labs