Re: [PATCH v2 1/3] mm,page_alloc: Use {get,put}_online_mems() to get stable zone's values

From: Michal Hocko
Date: Thu Jun 03 2021 - 08:45:16 EST


On Thu 03-06-21 10:38:24, Oscar Salvador wrote:
> On Wed, Jun 02, 2021 at 09:45:58PM +0200, Oscar Salvador wrote:
> > It was too nice and easy to be true I guess.
> > Yeah, I missed the point that blocking might be an issue here, and hotplug
> > operations can take really long, so not an option.
> > I must have switched my brain off back there, because now it is just too
> > obvious.
> >
> > Then I guwmess that seqlock must stay and the only thing than can go is the
> > pgdat resize lock from the hotplug code.
>
> So, I have been looking into this again.
> Of course, the approach taken here is outrageously wrong, but there are
> some other things that are a bit confusing.
>
> As pointed out, bad_range() (the function that ends up calling
> page_outside_zone_boundaries) is called from different functions via VM_BUG_ON_*.
> page_outside_zone_boundaries() takes care of taking the seqlock to avoid
> reading stale values that might happen if we race with memhotplug
> operations.
> page_outside_zone_boundaries() calls zone_spans_pfn() to check that.
>
> Now on the funny thing.
>
> We do have several places happily calling zone_spans_pfn() without
> holding zone's seqlock, e.g:
>
> set_pageblock_migratetype
> set_pfnblock_flags_mask
> zone_spans_pfn
>
> move_freepages_block
> zone_spans_pfn
>
> alloc_contig_pages
> zone_spans_last_pfn
> zone_spans_pfn
>
> Those places hold zone->lock, while move_pfn_range_to_zone() and
> remove_pfn_range_from_zone() hold zone->seqlock, so AFAICS, those places
> could read a stale value and proceed thinking the range is within the
> zone while it is not.
>
> So I guess my question is, should we force those places to take the
> seqlock reader as we do in page_outside_zone_boundaries(), (or maybe
> just move the seqlock handling to zone_spans_pfn())?

I believe we need to define the purpose of the locking first. The
existing locking doesn't serve much purpose, does it? The state might
change right after the lock is released and the caller cannot really
rely on the result. So aside of the current implementation, I would
argue that any locking has to be be done on the caller layer.

But the primary question is whether anybody actually cares about
potential races in the first place.
--
Michal Hocko
SUSE Labs