Re: [PATCH] fs/proc: introduce /proc/stat2 file

From: Vito Caputo
Date: Wed Nov 07 2018 - 15:32:41 EST


On Wed, Nov 07, 2018 at 11:03:06AM +0100, Miklos Szeredi wrote:
> On Wed, Nov 7, 2018 at 12:48 AM, Andrew Morton
> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Mon, 29 Oct 2018 23:04:45 +0000 Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> >
> >> On Mon, Oct 29, 2018 at 7:25 PM, Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
> >> > This patch introduces a new /proc/stat2 file that is identical to the
> >> > regular 'stat' except that it zeroes all hard irq statistics. The new
> >> > file is a drop in replacement to stat for users that need performance.
> >>
> >> For a while now, I've been thinking over ways to improve the
> >> performance of collecting various bits of kernel information. I don't
> >> think that a proliferation of special-purpose named bag-of-fields file
> >> variants is the right answer, because even if you add a few info-file
> >> variants, you're still left with a situation where a given file
> >> provides a particular caller with too little or too much information.
> >> I'd much rather move to a model in which userspace *explicitly* tells
> >> the kernel which fields it wants, with the kernel replying with just
> >> those particular fields, maybe in their raw binary representations.
> >> The ASCII-text bag-of-everything files would remain available for
> >> ad-hoc and non-performance critical use, but programs that cared about
> >> performance would have an efficient bypass. One concrete approach is
> >> to let users open up today's proc files and, instead of read(2)ing a
> >> text blob, use an ioctl to retrieve specified and targeted information
> >> of the sort that would normally be encoded in the text blob. Because
> >> callers would open the same file when using either the text or binary
> >> interfaces, little would have to change, and it'd be easy to implement
> >> fallbacks when a particular system doesn't support a particular
> >> fast-path ioctl.
>
> Please. Sysfs, with the one value per file rule, was created exactly
> for the purpose of eliminating these sort of problems with procfs. So
> instead of inventing special purpose interfaces for proc, just make
> the info available in sysfs, if not already available.
>

I like the sysfs approach to organizing the data, and have wanted
fd-batching IO syscalls in other circumstances anyway, so I think
there's a good chance of something along those lines getting added
eventually.

At a past employer I wrote some backup software which had to
reassemble versioned files from chains of reverse differentials (think
rdiff-backup). When servicing versioned reads from a FUSE mount that
involved a potentially long chain of diffs, I had all the information
needed to quickly construct a multi-fd iovec and supply it to a single
batched readv syscall, but no such syscall exists. The more
differentials, the more fragmented the operation tended to be, requiring
increasing numbers of smaller reads across more files to reconstruct the
buffer.

The same thing would be useful for making reads from large numbers of
sysfs files less costly. I presume proposing such a generally
applicable VFS API addition would meet less resistance than specialized
proc interfaces, perhaps naively :).
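
For illustration only, here is a rough C sketch of what userspace has to
do today in place of such a batched call: walk a list of (fd, offset,
buffer, length) segments and issue at least one pread(2) per segment.
The names struct fd_seg, read_segments() and demo_two_files() are made
up for this example and are not an existing or proposed kernel
interface; a real multi-fd readv would presumably collapse the loop into
a single syscall.

```c
#define _XOPEN_SOURCE 700   /* for pread(2) and mkstemp(3) */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical descriptor: one piece of the output buffer and where it lives. */
struct fd_seg {
    int    fd;    /* file holding this piece */
    off_t  off;   /* offset of the piece within that file */
    void  *buf;   /* destination inside the reassembled buffer */
    size_t len;   /* number of bytes wanted */
};

/*
 * Userspace stand-in for the missing batched call: one pread(2) per
 * segment (more on short reads). Returns the total bytes read, or -1
 * with errno set. Stops early if any segment hits EOF.
 */
static ssize_t read_segments(const struct fd_seg *segs, size_t nsegs)
{
    ssize_t total = 0;

    assert(segs != NULL || nsegs == 0);
    for (size_t i = 0; i < nsegs; i++) {
        size_t done = 0;

        while (done < segs[i].len) {
            ssize_t n = pread(segs[i].fd, (char *)segs[i].buf + done,
                              segs[i].len - done, segs[i].off + (off_t)done);
            if (n < 0) {
                if (errno == EINTR)
                    continue;       /* interrupted, retry this read */
                return -1;
            }
            if (n == 0)
                return total;       /* EOF: report what we got */
            done += (size_t)n;
            total += n;
        }
    }
    return total;
}

/* Usage sketch: stitch "world" and "234" together from two temp files. */
static int demo_two_files(char out[9])
{
    char p1[] = "/tmp/segdemoXXXXXX", p2[] = "/tmp/segdemoXXXXXX";
    int fd1 = mkstemp(p1), fd2 = mkstemp(p2);
    int ok = 0;

    if (fd1 >= 0 && fd2 >= 0 &&
        write(fd1, "hello world", 11) == 11 &&
        write(fd2, "0123456789", 10) == 10) {
        struct fd_seg segs[] = {
            { fd1, 6, out,     5 },   /* bytes 6..10 of file 1 */
            { fd2, 2, out + 5, 3 },   /* bytes 2..4 of file 2 */
        };
        memset(out, 0, 9);
        ok = read_segments(segs, 2) == 8;
    }
    if (fd1 >= 0) { close(fd1); unlink(p1); }
    if (fd2 >= 0) { close(fd2); unlink(p2); }
    return ok ? 0 : -1;
}
```

The syscall count here grows with the number of segments, which is
exactly the cost that gets worse with longer diff chains or with many
small sysfs files.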

Regards,
Vito Caputo