RE: mm/DAMON: Profiling enhancements for DAMON

From: Prasad, Aravinda
Date: Fri Dec 15 2023 - 05:08:09 EST

> -----Original Message-----
> From: Yu Zhao <yuzhao@xxxxxxxxxx>
> Sent: Friday, December 15, 2023 2:03 PM
> To: Prasad, Aravinda <aravinda.prasad@xxxxxxxxx>
> Cc: damon@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; sj@xxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; s2322819@xxxxxxxx; Kumar, Sandeep4
> <sandeep4.kumar@xxxxxxxxx>; Huang, Ying <ying.huang@xxxxxxxxx>;
> Hansen, Dave <dave.hansen@xxxxxxxxx>; Williams, Dan J
> <dan.j.williams@xxxxxxxxx>; Subramoney, Sreenivas
> <sreenivas.subramoney@xxxxxxxxx>; Kervinen, Antti
> <antti.kervinen@xxxxxxxxx>; Kanevskiy, Alexander
> <alexander.kanevskiy@xxxxxxxxx>; Alan Nair <alan.nair@xxxxxxxxx>; Juergen
> Gross <jgross@xxxxxxxx>; Ryan Roberts <ryan.roberts@xxxxxxx>
> Subject: Re: mm/DAMON: Profiling enhancements for DAMON
>
> On Fri, Dec 15, 2023 at 12:42 AM Aravinda Prasad
> <aravinda.prasad@xxxxxxxxx> wrote:
> ...
>
> > This patch proposes profiling different levels of the application’s
> > page table tree to detect whether a region is accessed or not. This
> > patch is based on the observation that, when the accessed bit for a
> > page is set, the accessed bits at the higher levels of the page table
> > tree (PMD/PUD/PGD) corresponding to the path of the page table walk
> > are also set. Hence, it is efficient to check the accessed bits at
> > the higher levels of the page table tree to detect whether a region is
> > accessed or not.
>
> This patch can crash on Xen. See commit 4aaf269c768d ("mm: introduce
> arch_has_hw_nonleaf_pmd_young()")

Will fix as suggested in the commit.

>
> MGLRU already does this in the correct way. See mm/vmscan.c.

I don't see the access bits at the PUD or PGD level being checked for the
4K page size. Can you point me to the code where they are checked?

>
> This patch also can cause USER DATA CORRUPTION. See commit
> c11d34fa139e ("mm/damon/ops-common: atomically test and clear young
> on ptes and pmds").

OK. Will atomically test and clear the access bits, as done in that commit.

>
> The quality of your patch makes me very much doubt the quality of your
> paper, especially your results on Google's kstaled and MGLRU in table 6.2.

The results are reproducible. We did not use kstaled/MGLRU for the data in
Figure 3; instead, we implemented a kernel thread that linearly scans pages,
similar to kstaled.

Our argument for kstaled/MGLRU is that scanning individual pages at 4K
granularity may not be efficient for large-footprint applications. Instead,
the access bits at the higher levels of the page table tree can be used. In
the paper we have demonstrated this with DAMON, but the concept can be
applied to kstaled/MGLRU as well.
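
To illustrate the argument, here is a rough user-space toy model in Python.
It is not the patch code and not kernel code: Node, walk(), scan() and the
4096-page layout are hypothetical names and numbers chosen only for this
sketch. It assumes, as the patch description does, that the hardware sets the
accessed bit on every entry along the page table walk; whether that holds is
architecture dependent. The sketch compares how many entries a flat 4K scan
reads against a scan that skips subtrees whose higher-level accessed bit is
clear.

```python
# Toy model of a 4-level page table; all names here are hypothetical.
from dataclasses import dataclass, field

FANOUT = 512   # entries per table (x86-64, 4K pages)
LEVELS = 4     # PGD, PUD, PMD, PTE


@dataclass
class Node:
    accessed: bool = False
    children: dict = field(default_factory=dict)  # table index -> Node


def indices(vpn):
    """Split a virtual page number into per-level indices, top level first."""
    return [(vpn >> (9 * s)) & (FANOUT - 1) for s in range(LEVELS - 1, -1, -1)]


def walk(root, vpn, set_accessed):
    """Create the path to vpn; optionally set the accessed bit on each entry."""
    node = root
    for idx in indices(vpn):
        node = node.children.setdefault(idx, Node())
        if set_accessed:
            node.accessed = True


def scan(node, prune, depth=0):
    """Return (accessed leaf entries, accessed bits read) below node.

    prune=False models a flat scan that reads every PTE's accessed bit.
    prune=True reads the bit at every level and skips cold subtrees.
    """
    hits = checks = 0
    is_leaf = depth == LEVELS - 1
    for child in node.children.values():
        if is_leaf or prune:
            checks += 1
        if prune and not child.accessed:
            continue  # nothing below this entry was touched
        if is_leaf:
            hits += child.accessed
        else:
            h, c = scan(child, prune, depth + 1)
            hits += h
            checks += c
    return hits, checks


root = Node()
for vpn in range(8 * FANOUT):          # map 4096 pages (8 PTE tables)
    walk(root, vpn, set_accessed=False)
walk(root, 5, set_accessed=True)       # the application touches one page

print(scan(root, prune=False))  # (1, 4096)
print(scan(root, prune=True))   # (1, 522)
```

In this toy case both scans find the same single accessed page, but the
pruned scan reads 1 + 1 + 8 + 512 = 522 entries instead of 4096. The model
ignores clearing the bits between sampling intervals, TLB effects and huge
pages, so it only shows the shape of the saving, not its real magnitude.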

Regards,
Aravinda