Re: [PATCH v4] RISC-V: Show accurate per-hart isa in /proc/cpuinfo

From: Andrew Jones
Date: Sat Aug 26 2023 - 04:02:17 EST


On Fri, Aug 25, 2023 at 03:51:28PM -0700, Evan Green wrote:
> On Fri, Aug 25, 2023 at 1:16 AM Andrew Jones <ajones@xxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Aug 24, 2023 at 03:06:53PM -0700, Evan Green wrote:
> > > On Thu, Aug 24, 2023 at 10:29 AM Conor Dooley <conor@xxxxxxxxxx> wrote:
> > ...
> > > > Do you want to have this new thing in cpuinfo tell the user "this hart
> > > > has xyz extensions that are supported by a kernel, but maybe not this
> > > > kernel" or to tell the user "this hart has xyz extensions that are
> > > > supported by this kernel"? Your text above says "understood by the
> > > > kernel", but I think that's a poor definition that needs to be improved
> > > > to spell out exactly what you mean. IOW does "understood" mean the
> > > > kernel will parse them into a structure, or does it mean "yes you can
> > > > use this extension on this particular hart".
> > >
> > > I'm imagining /proc/cpuinfo being closer to "the CPU has it and the
> > > kernel at least vaguely understands it, but may not have full support
> > > for it enabled". I'm assuming /proc/cpuinfo is mostly used by 1)
> > > humans wanting to know if they have hardware support for a feature,
> > > and 2) administrators collecting telemetry to manage fleets (ie do I
> > > have any hardware deployed that supports X).
> >
> > Is (2) a special case of (1)? (I want to make sure I understand all the
> > cases.)
>
> More or less, yes. In bucket two are also folks wondering things like
> "are all these crash reports I'm getting specific to machines with X".
>
> >
> > > Programmers looking to
> > > see "is the kernel support for this feature ready right now" would
> > > ideally not be parsing /proc/cpuinfo text, as more direct mechanisms
> > > like specific hwprobe bits for "am I fully ready to go" would be
> > > easier to work with. Feel free to yell at me if this overall vision
> > > seems flawed.
> > >
> > > I tried to look to see if there was consensus among the other
> > > architectures. Aarch64 seems to go with "supported and fully enabled",
> > > as their cpu_has_feature() directly tests elf_hwcap, and elements in
> > > arm64_elf_hwcaps[] are Kconfig gated. X86 is complicated, but IIRC is
> > > more along the lines of "hardware has it". They have two macros,
> > > cpu_has() for "raw capability" and cpu_feature_enabled() for "kernel
> > > can do it too", and they use cpu_has() for /proc/cpuinfo flags.
> > >
> >
> > I'd lean more towards the way AArch64 goes, because, unless /proc/cpuinfo
> > is just a blind regurgitation of an isa string from DT / ACPI, then the
> > kernel must at least know something about it. Advertising a feature which
> > is known, but also known not to work, seems odd to me. So my vote is that
> > only features which are present and enabled in the kernel or present and
> > not necessary to be enabled in the kernel in order for userspace or
> > virtual machines to use be advertised in /proc/cpuinfo.
> >
> > We still have SMBIOS (dmidecode) to blindly dump what the hardware
> > supports for cases (1) and (2) above.
>
> Yeah, there's an argument to be made for that. My worry is it's a
> difficult line to hold. The bar you're really trying to describe (or
> at least what people might take away from it) is "if it's listed here
> then it's fully ready to be used in userspace". But then things get
> squishy when there are additional ways to control the use of the
> feature (prctls() in init to turn it on, usermode policy to turn it
> off, security doodads that disable it, etc). I'm assuming nobody wants
> a version of /proc/cpuinfo that changes depending on which process is
> asking. So then the line would have to be more carefully described as
> "well, the hardware can do it, and the kernel COULD do it under some
> circumstances, but YMMV", which ends up falling somewhat short of the
> original goal. In my mind keeping /proc/cpuinfo as close to "here's
> what the hardware can do" seems like a more defensible position.
> -Evan

I agree with that. I was actually even trying to say the same thing,
but only by bringing up virtual machines. Once we decide we'll expose
extensions to VMs, whether or not the host kernel enables them, then
none of the other host kernel configurations matter with respect to
advertising the feature, since the guest kernel may have a completely
different set of configurations.

So I think we should only be filtering out extensions that are disabled
because they're broken (have a detected erratum), have been "hidden"
(have a kernel command line allowing them to be treated as if not
present), or cannot be used at all due to missing accompanying hardware
descriptions (such as block size info for CBO extensions). In all cases,
I presume we'd wire checks up in riscv_isa_extension_check() and no
checks would be gated on Kconfigs or anything else. And, since
/proc/cpuinfo gets its list from the bitmap that's already filtered by
riscv_isa_extension_check(), then, long story short, we're good to go :-)

But maybe we can try to spell that policy out a bit more in
Documentation/riscv/uabi.rst.

Thanks,
drew