Re: [PATCH v4] vmscan: add trace events for lru_gen

From: Jaewon Kim
Date: Sun Oct 01 2023 - 19:41:45 EST


Hello Yu Zhao

Could you give me your comment? I am waiting for your opinion on the
reuse method. I'm planning to resend it as a complete patch with it.
Thank you.


On Tue, Sep 26, 2023 at 11:15 PM 김재원 <jaewon31.kim@xxxxxxxxxxx> wrote:
>
> >>>On Mon, Sep 25, 2023 at 10:20?PM Jaewon Kim <jaewon31.kim@xxxxxxxxxxx> wrote:
> >>>>
> >>>> As the legacy lru provides, the lru_gen needs some trace events for
> >>>> debugging.
> >>>>
> >>>> This commit introduces 2 trace events.
> >>>> trace_mm_vmscan_lru_gen_scan
> >>>> trace_mm_vmscan_lru_gen_evict
> >>>>
> >>>> Each event is similar to the following legacy events.
> >>>> trace_mm_vmscan_lru_isolate,
> >>>> trace_mm_vmscan_lru_shrink_[in]active
> >>>
> >>>We should just reuse trace_mm_vmscan_lru_isolate and
> >>>trace_mm_vmscan_lru_shrink_inactive instead of adding new tracepoints.
> >>>
> >>>To reuse trace_mm_vmscan_lru_isolate, we'd just need to append two new
> >>>names to LRU_NAMES.
> >>>
> >>>The naming of trace_mm_vmscan_lru_shrink_inactive might seem confusing
> >>>but it's how MGLRU maintains the compatibility, e.g., the existing
> >>>active/inactive counters in /proc/vmstat.
> >>
> >>
> >>Hello
> >>
> >>Actually I had tried to reuse them. But some value was not that compatible.
> >>Let me try that way again.
> >>
> >>>
> >
> >Hello Yu Zhao
> >
> >Could you look into what I tried below? I reused the legacy trace events as you recommened.
> >
> >For the nr_scanned for trace_mm_vmscan_lru_shrink_inactive, I just used the scanned returned from isolate_folios.
> >I thought this is right as scan_folios also uses its isolated.
> > __count_vm_events(PGSCAN_ANON + type, isolated);
> >But I guess the scanned in scan_folios is actually the one used in shrink_inactive_list
>
> please ignore nr_scanned thing above I just misread the code.
>
> This is an example, I think it works well.
>
> mm_vmscan_lru_isolate: isolate_mode=0 classzone=2 order=0 nr_requested=4096 nr_scanned=64 nr_skipped=0 nr_taken=64 lru=inactive_file
> mm_vmscan_lru_shrink_inactive: nid=0 nr_scanned=64 nr_reclaimed=63 nr_dirty=0 nr_writeback=0 nr_congested=0 nr_immediate=0 nr_activate_anon=0 nr_activate_file=1 nr_ref_keep=0 nr_unmap_fail=0 priority=2 flags=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
>
> >
> >I tested this on both 0 and 7 of /sys/kernel/mm/lru_gen/enabled
> >
> >
> >diff --git a/mm/vmscan.c b/mm/vmscan.c
> >index a4e44f1c97c1..b61a0156559c 100644
> >--- a/mm/vmscan.c
> >+++ b/mm/vmscan.c
> >@@ -4328,6 +4328,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
> > int sorted = 0;
> > int scanned = 0;
> > int isolated = 0;
> >+ int skipped = 0;
> > int remaining = MAX_LRU_BATCH;
> > struct lru_gen_folio *lrugen = &lruvec->lrugen;
> > struct mem_cgroup *memcg = lruvec_memcg(lruvec);
> >@@ -4341,7 +4342,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
> >
> > for (i = MAX_NR_ZONES; i > 0; i--) {
> > LIST_HEAD(moved);
> >- int skipped = 0;
> >+ int skipped_zone = 0;
> > int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
> > struct list_head *head = &lrugen->folios[gen][type][zone];
> >
> >@@ -4363,16 +4364,17 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
> > isolated += delta;
> > } else {
> > list_move(&folio->lru, &moved);
> >- skipped += delta;
> >+ skipped_zone += delta;
> > }
> >
> >- if (!--remaining || max(isolated, skipped) >= MIN_LRU_BATCH)
> >+ if (!--remaining || max(isolated, skipped_zone) >= MIN_LRU_BATCH)
> > break;
> > }
> >
> >- if (skipped) {
> >+ if (skipped_zone) {
> > list_splice(&moved, head);
> >- __count_zid_vm_events(PGSCAN_SKIP, zone, skipped);
> >+ __count_zid_vm_events(PGSCAN_SKIP, zone, skipped_zone);
> >+ skipped += skipped_zone;
> > }
> >
> > if (!remaining || isolated >= MIN_LRU_BATCH)
> >@@ -4387,6 +4389,9 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
> > __count_memcg_events(memcg, item, isolated);
> > __count_memcg_events(memcg, PGREFILL, sorted);
> > __count_vm_events(PGSCAN_ANON + type, isolated);
> >+ trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, MAX_LRU_BATCH,
> >+ scanned, skipped, isolated,
> >+ type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
> >
> > /*
> > * There might not be eligible folios due to reclaim_idx. Check the
> >@@ -4517,6 +4522,9 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
> > retry:
> > reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);
> > sc->nr_reclaimed += reclaimed;
> >+ trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
> >+ scanned, reclaimed, &stat, sc->priority,
> >+ type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
> >
> > list_for_each_entry_safe_reverse(folio, next, &list, lru) {
> > if (!folio_evictable(folio)) {
> >