Re: [PATCH 4/8] proc: export more page flags in /proc/kpageflags

From: Ingo Molnar
Date: Mon May 11 2009 - 18:09:29 EST



* Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, 11 May 2009 13:45:54 +0200
> Ingo Molnar <mingo@xxxxxxx> wrote:
>
> > > Yes, we could place pagemap's two auxiliary files into debugfs but
> > > it would be rather stupid to split the feature's control files
> > > across two pseudo filesystems, one of which may not even exist.
> > > Plus pagemap is not a kernel debugging feature.
> >
> > That's not what i'm suggesting though.
> >
> > What i'm suggesting is that there's a zillion ways to enumerate
> > and index various kernel objects, doing that in /proc is
> > fundamentally wrong. And there's no need to create a per PID/TID
> > directory structure in /debug either, to be able to list and
> > access objects by their PID.
>
> The problem with procfs was that it was growing a lot of random
> non-process-related stuff. We never deprecated procfs - we
> decided that it should be retained for its original purpose and
> that non-process-related things shouldn't go in there.
>
> The /proc/<pid>/pagemap file clearly _is_ process-related, and
> /proc/<pid> is the natural and correct place for it to live.
>
> Yes, sure, there are any number of ways in which that data could
> be presented to userspace in other locations and via other means.
> But there would need to be an extraordinarily good reason for
> violating the existing paradigm/expectation/etc.

It has also been clearly demonstrated in this thread that people
want more enumeration than just the process dimension.

_Especially_ for an object like pages. Often most of the memory in a
Linux system is _not mapped to any process_. It is in the page
cache. Still, /proc enumeration does not capture it. Why? Because
IMO it has been done at the wrong layer, at the wrong abstraction
level.

Yes, /proc is for process enumeration (as the name tells us
already), but it is not really suitable as a general object
enumerator for kernel debugging or kernel instrumentation purposes.

By putting kernel instrumentation into /proc, we limit all _future_
enumeration greatly. Instead of adding just another iterator
(walker), we now have to move the whole thing across into another
domain (which is being resisted, and /proc is an ABI anyway).

It's all doable, but a lot harder if it's not being realized why it's
important to do it.

Ingo