Re: [PATCH 00/14] per memcg lru_lock

From: Hugh Dickins
Date: Fri Aug 23 2019 - 22:01:44 EST


On Wed, 21 Aug 2019, Alex Shi wrote:
> On 2019/8/21 2:24 PM, Hugh Dickins wrote:
> > I'll set aside what I'm doing, and switch to rebasing ours to v5.3-rc
> > and/or mmotm. Then compare with what Alex has, to see if there's any
> > good reason to prefer one to the other: if no good reason to prefer ours,
> > I doubt we shall bother to repost, but just use it as basis for helping
> > to review or improve Alex's.
>
> For your review, my patchset is pretty straight and simple.
> It just uses a per-lruvec lru_lock to replace the pgdat lru_lock where
> necessary, just this. We could talk more after I am back at work. :)

Sorry to be the bearer of bad news, Alex, but when you said "straight and
simple", I feared that your patchset would turn out to be fundamentally
too simple.

And that is so. I have only to see the
lruvec = mem_cgroup_page_lruvec(page, pgdat);
line in isolate_migratepages_block() in mm/compaction.c, and check
that mem_cgroup_page_lruvec() is little changed in mm/memcontrol.c.

The central problem with per-memcg lru_lock is that you do not know
for sure what lock to take (which memcg a page belongs to) until you
have taken the expected lock, and then checked whether page->memcg
is still the same - backing out and trying again if not.
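
To make that concrete, here is a minimal sketch of the kind of
lock-recheck-retry helper that is needed (the name and the lru_lock
field in struct lruvec are hypothetical, assumed to be added by the
series; this is not code lifted from either patchset):

	static struct lruvec *lock_page_lruvec(struct page *page,
					       struct pglist_data *pgdat)
	{
		struct lruvec *lruvec;
	again:
		lruvec = mem_cgroup_page_lruvec(page, pgdat);
		spin_lock_irq(&lruvec->lru_lock);
		/*
		 * The page's memcg may have changed while we waited for
		 * the lock (charge moving, for instance): if so, back out
		 * and retry against the lruvec it belongs to now.
		 */
		if (unlikely(lruvec != mem_cgroup_page_lruvec(page, pgdat))) {
			spin_unlock_irq(&lruvec->lru_lock);
			goto again;
		}
		return lruvec;
	}

Open-coded or wrapped in a helper, it is that recheck and retry which
the straight mem_cgroup_page_lruvec()-then-lock sequence in your
isolate_migratepages_block() is missing.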

Fix that central problem, and you end up with a more complicated
patchset, much like ours. It's true that when ours was first developed,
the memcg situation was more complicated in several ways, and perhaps
some aspects of our patchset could be simplified now (though I've not
identified any). Johannes in particular has done a great deal of
simplifying work in memcg over the last few years, but there are still
situations in which a page's memcg can change (move_charge_at_immigrate
and swapin readahead spring to mind - or perhaps the latter is only an
issue when MEMCG_SWAP is not enabled, I forget; and I often wonder if
reparenting will be brought back one day).

I did not review your patchset in detail, and wasn't able to get very
far in testing it. At first I was put off by set_task_reclaim_state
warnings from mm/vmscan.c, but those turned out to be in v5.3-rc5
itself, not from your patchset or mine (but I've not yet investigated
what's responsible for them). Within a minute of starting swapping
load, kcompactd's compact_lock_irqsave() in isolate_migratepages_block()
would deadlock, and I did not get further. (Though I did also notice
that booting the CONFIG_MEMCG=y kernel with "cgroup_disable=memory"
froze while booting - tiresomely, one has to keep both the memcg and
no-memcg locking to cope with that case, and I guess you had not.)
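
For illustration only (the helper is made up, and it again assumes a
lru_lock spinlock in struct lruvec; it is not how either patchset
spells it), the shape of that dual path is roughly:

	static spinlock_t *page_lru_lock(struct page *page,
					 struct pglist_data *pgdat)
	{
		/* "cgroup_disable=memory": no per-memcg lruvecs to lock */
		if (mem_cgroup_disabled())
			return &pgdat->lru_lock;
		/* memcg case: callers still need the recheck-and-retry above */
		return &mem_cgroup_page_lruvec(page, pgdat)->lru_lock;
	}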

Rather than duplicating effort, I would advise you to give our patchset
a try, and if it works for you, help towards getting that one merged:
but of course, it's up to you.

I've attached a tarfile of it rebased to v5.3-rc5: I do not want to
spam the list with patches yet, because I do not have any stats or
argument in support of the series, as Andrew asked for years ago and
Michal asks again now. But aside from that I consider it ready, and
will let Shakeel take it over from here, while I get back to what I
diverted from (but of course I'll try to answer questions on it).

Thanks,
Hugh

Attachment: lrulock503.tar
Description: per-memcg lru_lock on v5.3-rc5