Re: [PATCH v4] RISC-V: Show accurate per-hart isa in /proc/cpuinfo

From: Evan Green
Date: Fri Aug 25 2023 - 18:53:01 EST


On Fri, Aug 25, 2023 at 1:16 AM Andrew Jones <ajones@xxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Aug 24, 2023 at 03:06:53PM -0700, Evan Green wrote:
> > On Thu, Aug 24, 2023 at 10:29 AM Conor Dooley <conor@xxxxxxxxxx> wrote:
> ...
> > > Do you want to have this new thing in cpuinfo tell the user "this hart
> > > has xyz extensions that are supported by a kernel, but maybe not this
> > > kernel" or to tell the user "this hart has xyz extensions that are
> > > supported by this kernel"? Your text above says "understood by the
> > > kernel", but I think that's a poor definition that needs to be improved
> > > to spell out exactly what you mean. IOW does "understood" mean the
> > > kernel will parse them into a structure, or does it mean "yes you can
> > > use this extension on this particular hart".
> >
> > I'm imagining /proc/cpuinfo being closer to "the CPU has it and the
> > kernel at least vaguely understands it, but may not have full support
> > for it enabled". I'm assuming /proc/cpuinfo is mostly used by 1)
> > humans wanting to know if they have hardware support for a feature,
> > and 2) administrators collecting telemetry to manage fleets (ie do I
> > have any hardware deployed that supports X).
>
> Is (2) a special case of (1)? (I want to make sure I understand all the
> cases.)

More or less, yes. In bucket two are also folks wondering things like
"are all these crash reports I'm getting specific to machines with X".

>
> > Programmers looking to
> > see "is the kernel support for this feature ready right now" would
> > ideally not be parsing /proc/cpuinfo text, as more direct mechanisms
> > like specific hwprobe bits for "am I fully ready to go" would be
> > easier to work with. Feel free to yell at me if this overall vision
> > seems flawed.
> >
> > I tried to look to see if there was consensus among the other
> > architectures. Aarch64 seems to go with "supported and fully enabled",
> > as their cpu_has_feature() directly tests elf_hwcap, and elements in
> > arm64_elf_hwcaps[] are Kconfig gated. X86 is complicated, but IIRC is
> > more along the lines of "hardware has it". They have two macros,
> > cpu_has() for "raw capability" and cpu_feature_enabled() for "kernel
> > can do it too", and they use cpu_has() for /proc/cpuinfo flags.
> >
>
> I'd lean more towards the way AArch64 goes, because, unless /proc/cpuinfo
> is just a blind regurgitation of an isa string from DT / ACPI, then the
> kernel must at least know something about it. Advertising a feature which
> is known, but also known not to work, seems odd to me. So my vote is that
> only features which are present and enabled in the kernel or present and
> not necessary to be enabled in the kernel in order for userspace or
> virtual machines to use be advertised in /proc/cpuinfo.
>
> We still have SMBIOS (dmidecode) to blindly dump what the hardware
> supports for cases (1) and (2) above.

Yeah, there's an argument to be made for that. My worry is it's a
difficult line to hold. The bar you're really trying to describe (or
at least what people might take away from it) is "if it's listed here
then it's fully ready to be used in userspace". But then things get
squishy when there are additional ways to control the use of the
feature (prctls() in init to turn it on, usermode policy to turn it
off, security doodads that disable it, etc). I'm assuming nobody wants
a version of /proc/cpuinfo that changes depending on which process is
asking. So then the line would have to be more carefully described as
"well, the hardware can do it, and the kernel COULD do it under some
circumstances, but YMMV", which ends up falling somewhat short of the
original goal. In my mind keeping /proc/cpuinfo as close to "here's
what the hardware can do" seems like a more defensible position.
-Evan