Re: [PATCH v2 2/9] mm/vmscan: protect the workingset on anonymous LRU

From: Johannes Weiner
Date: Mon Mar 16 2020 - 12:12:16 EST


On Mon, Mar 16, 2020 at 04:05:30PM +0900, Joonsoo Kim wrote:
> 2020ë 3ì 14ì (í) ìì 4:55, Johannes Weiner <hannes@xxxxxxxxxxx>ëì ìì:
> > The problem with executables is that when they are referenced, they
> > get a *lot* of references compared to data pages. Think about an
> > instruction stream and how many of those instructions result in data
> > references. So when you see an executable page that is being accessed,
> > it's likely being accessed at a high rate. They're much hotter, and
> > that's why reference bits from VM_EXEC mappings carry more weight.
>
> Sound reasonable.
>
> But, now, I have another thought about it. I think that the root of the reason
> of this check is that there are two different reference tracking models
> on file LRU. If we assume that all file pages are mapped ones, this special
> check isn't needed. If executable pages are accessed more frequently than
> others, reference can be found more for them at LRU cycle. No need for this
> special check.
>
> However, file pages includes syscall-ed pages and mapped pages, and,
> reference tracking models for mapped page has a disadvantage to
> one for syscall-ed page. Several references for mapped page would be
> considered as one at LRU cycle. I think that this check is to mitigate
> this disadvantage, at least, for executable file mapped page.

Hm, I don't quite understand this reasoning. Yes, there are two models
for unmapped and mapped file pages. But both types of pages get
activated with two or more references: for unmapped it's tracked
through mark_page_accessed(), and for mapped it's the two LRU cycles
with the referenced bit set (unmapped pages don't get that extra trip
around the LRU with one reference). With that alone, mapped pages and
unmapped pages should have equal chances, no?

> > IMO this applies to executable file and anon equally.
>
> anon LRU consist of all mapped pages, so, IMHO, there is no need for
> special handling for executable pages on anon LRU. Although reference
> is just checked at LRU cycle, it would correctly approximate reference
> frequency.
>
> Moreover, there is one comment in shrink_active_list() that explains one
> reason about omission of this check for anon pages.
>
> "Anon pages are not likely to be evicted by use-once streaming IO, plus JVM
> can create lots of anon VM_EXEC pages, so we ignore them here."
>
> Lastly, as I said before, at least, it is done separately with appropriate
> investigation.

You do have a point here, though.

The VM_EXEC protection is a heuristic that I think was added for one
specific case. Instead of adopting it straight-away for anon pages, it
may be good to wait until a usecase materializes. If it never does,
all the better - one less heuristic to carry around.

> Now, I plan to make a next version applied all your comments except
> VM_EXEC case. As I said above, fundamental difference between
> file mapped page and anon mapped page is that file LRU where file mapped
> pages are managed uses two reference tracking model but anon LRU uses
> just one. File LRU needs some heuristic to adjust the difference of two models,
> but, anon LRU doesn't need at all. And, I think VM_EXEC check is for this case.
>
> Please let me know your opinion about VM_EXEC case. I will start to rework
> if you agree with my thought.

That sounds like a good plan. I'm looking forward to the next version!

Johannes