Re: [PATCH] vmstat: don't auto expand the sysfs files

From: Pasha Tatashin
Date: Thu Dec 14 2023 - 13:58:10 EST


Hi David,

Thank you for taking a look at this patch, my replies below.

On Thu, Dec 14, 2023 at 12:52 PM David Rientjes <rientjes@xxxxxxxxxx> wrote:
>
> On Mon, 11 Dec 2023, Pasha Tatashin wrote:
>
> > Whenever a new fields are added one of the following: node_stat_item
> > numa_stat_item zone_stat_item, the /sys/devices/system/node/nodeX/vmstat
> > files are auto expanded.
> >
> > This is a problem, as sysfs files should be only one value per file.
>
> Does this patch address the one-value-per-file issue? (I think that ship
> has sailed for vmstat.)

That ship has sailed for vmstat, this patch addresses what was asked
by GregKH: not to add new values to vmstat, as not to make the
existing problem even worse. The sysfs file system has a one page
limit per file. The developers will decide how to export the new items
added to node_stat, numa_stat, zone_stat individually. Each new item
can be exported in its own files, and must have its own documentation
about interface stability, value meaning, and expectations when the
stat file is absent.

> /sys/devices/system/node/nodeX/vmstat has been documented as a stable ABI
> in Linux for 13 years.
>
> That said, the contents of the file has not been documented so I assume
> it's "whatever stats make sense for the current implementation of the
> Linux VM".
>
> > Also, once a field is exported via vmstat it is hard to remove it as
> > there could be user applications that rely on this field. This is why
> > we still cary "nr_unstable 0" in /proc/vmstat that is not used.
> >
>
> Implementations change over time, so this would be expected.
>
> I'm assuming, but perhaps incorrectly, that userspace won't crash if
> nr_unstable just don't appear anymore. That whoever is using it would
> just assume that it's zero if it doesn't appear.
>
> So I think we need to answer the question of: are the *contents* of files
> like vmstat that are heavily dependent on implementation level details
> really part of a stable ABI that people can expect will carry on forever?

I agree, but that is outside of the scope of this patch. The intent of
this patch is to keep the existing interfaces, and only prevents
future auto expansion of vmstat files. In the future, we work on
documenting the existing vmstat interfaces, and perhaps cleaning-up
them when possible.

> > Also, since vmstat is auto-expanded the fields are not documented, so
> > users do not know whether they are counted in bytes/kilobytes/pages,
> > and the exact meaning of these fields.
> >
>
> I think that's actually intended since there can also be ones that are
> event counters. I don't think any fields in vmstat are intended to be
> long-term sacred stable ABIs.

Right, but we already carry fields i.e nr_unstable that are hardcoded,
but were removed from the kernel otherwise.

Thank you,
Pasha