Re: [PATCH 2/7] mm, vmscan: add active list aging tracepoint

From: Michal Hocko
Date: Fri Dec 30 2016 - 04:26:48 EST


On Fri 30-12-16 10:48:53, Minchan Kim wrote:
> On Thu, Dec 29, 2016 at 08:52:46AM +0100, Michal Hocko wrote:
> > On Thu 29-12-16 14:33:59, Minchan Kim wrote:
> > > On Wed, Dec 28, 2016 at 04:30:27PM +0100, Michal Hocko wrote:
[...]
> > > > +TRACE_EVENT(mm_vmscan_lru_shrink_active,
> > > > +
> > > > + TP_PROTO(int nid, unsigned long nr_scanned, unsigned long nr_freed,
> > > > + unsigned long nr_unevictable, unsigned long nr_deactivated,
> > > > + unsigned long nr_rotated, int priority, int file),
> > > > +
> > > > + TP_ARGS(nid, nr_scanned, nr_freed, nr_unevictable, nr_deactivated, nr_rotated, priority, file),
> > >
> > > I agree it is helpful. And it was when I investigated aging problem of 32bit
> > > when node-lru was introduced. However, the question is we really need all those
> > > kinds of information? just enough with nr_taken, nr_deactivated, priority, file?
> >
> > Dunno. Is it harmful to add this information? I like it more when the
> > numbers just add up and you have a clear picture. You never know what
> > might be useful when debugging a weird behavior.
>
> Michal, I'm not huge fan of "might be useful" although it's a small piece of code.

But these are tracepoints. One of their primary reasons to exist is
to help debug things. And it is really hard to predict what might be
useful in advance. It is not like the patch would export numbers which
would be irrelevant to the reclaim.

> It adds just all of kinds overheads (memory footprint, runtime performance,
> maintainance) without any proved benefit.

Does it really add any measurable overhead or the maintenance burden? I
think the only place we could argue about is free_hot_cold_page_list
which is used in hot paths.

I think we can sacrifice it. The same for culled unevictable
pages. We wouldn't know what is the missing part
nr_taken-(nr_activate+nr_deactivate) because it could be either freed or
moved to the unevictable list but that could be handled in a separate
tracepoint in putback_lru_page which sounds like a useful thing I guess.

> If we allow such things, people would start adding more things with just "why not,
> it might be useful. you never know the future" and it ends up making linux fiction
> novel mess.

I agree with this concern in general, but is this the case in this
particular case?

> If it's necessary, someday, someone will catch up and will send or ask patch with
> detailed description "why the stat is important and how it is good for us to solve
> some problem".

I can certainly enhance the changelog. See below.

> From that, we can learn workload, way to solve the problem and git
> history has the valuable description so new comers can keep the community up easily.
> So, finally, overheads are justified and get merged.
>
> Please add must-have for your goal described.

My primary point is that tracepoints which do not give us a good picture
are quite useless and force us to add trace_printk or other means to
give us further information. Then I wonder why to have an incomplete
tracepoint at all.

Anyway, what do you think about this updated patch? I have kept Hillf's
A-b so please let me know if it is no longer valid.
---