Re: [PATCH 4/4] RISC-V: add support for vendor-extensions via AT_BASE_PLATFORM and xthead

From: Conor Dooley
Date: Wed Apr 26 2023 - 08:30:21 EST


Hey Heiko,

On Mon, Apr 24, 2023 at 09:49:11PM +0200, Heiko Stuebner wrote:
> From: Heiko Stuebner <heiko.stuebner@xxxxxxxx>
>
> T-Head cores support a number of own ISA extensions that also include
> optimized instructions which could benefit userspace to improve
> performance.
>
> Extensions supported by current T-Head cores are:
> * XTheadBa - bitmanipulation instructions for address calculation
> * XTheadBb - conditional basic bit-manipulation instructions
> * XTheadBs - instructions to access a single bit in a register
> * XTheadCmo - cache management operations
> * XTheadCondMov - conditional move instructions
> * XTheadFMemIdx - indexed memory operations for floating-point registers
> * XTheadFmv - double-precision floating-point high-bit data transmission
> intructions for RV32
> * XTheadInt - instructions to reduce the code size of ISRs and/or the
> interrupt latencies that are caused by ISR entry/exit code
> * XTheadMac - multiply-accumulate instructions
> * XTheadMemIdx - indexed memory operations for GP registers
> * XTheadMemPair - two-GPR memory operations
> * XTheadSync - multi-core synchronization instructions
>
> In-depth descriptions of these extensions can be found on
> https://github.com/T-head-Semi/thead-extension-spec
>
> Support for those extensions was merged into the relevant toolchains
> so userspace programs can select necessary optimizations when needed.
>
> So a mechanism to the isa-string generation to export vendor-extension
> lists via the errata mechanism and implement it for T-Head C9xx cores.
>
> This exposes these vendor extensions then both in AT_BASE_PLATFORM
> and /proc/cpuinfo.

I'm not entirely sure if this patch is meant to be a demo, but I don't
like the idea of using these registers to determine what extensions are
reported.
riscv,isa in a devicetree (for as much as I might dislike it at this
point in time), or the ACPI equivalent, should be the mechanism for
enabling/disabling these kinds of things.
Otherwise, we are just going to end up causing problems for ourselves
with various lists of this that and the other extension for different
combinations of hardware.
The open source c906 has the same archid/impid too right? Assuming this is
a serious proposal, how would you intend dealing with modified versions
of those cores?

I am pretty sure that you intended this to be a demo though, particularly
given the wording of the below quote from your cover, but in case you did
not:

NAK to this way of sourcing the information.

Anyways, since your cover's considerations section seems partly aimed at
me, given my discussion with head-the-ball last week:

> Things to still consider:
> -------------------------
> Right now both hwprobe and this approach will only pass through
> extensions the kernel actually knows about itself. This should not
> necessarily be needed (but could be an optional feature for e.g. virtualization).

What do you mean by virtualisation here? It's the job of the hypervisor
etc to make sure that what it passes to its guest contains only what it
wants the guest to see, right?
IIUC, that's another point against doing what this patch does.

> Most extensions don’t introduce new user-mode state that the kernel needs to
> manage (e.g. new registers). Extension that do introduce new user-mode state
> are usually disabled by default and have to be enabled by S mode or M mode
> (e.g. FS[1:0] for the +floating-point extension). So there should not be a
> reason to filter any extensions that are unknown.

I think in general this can be safely assumed, but I don't think it is
unreasonable to expect someone may make, for example, XConorGigaVector
that gets turned on by the same bits as regular old vector but has some
extra registers.
Not saying that I think that that is a good idea, but it is a distinct
possibility that this will happen, and I don't think forwarding it to
userspace is a good idea.

> So it might make more sense to just pass through a curated list (common
> set) created from the core's isa strings and remove state-handling
> extensions when they are not enabled in the kernel-side (sort of
> blacklisting extensions that need actual kernel support).

Yeah, as discussed with Christoph the other day I don't think we can
really avoid such a blacklist. I don't think it'd require any sort of
vendor specific handling, since, as you point out, a vendor may well
implement extensions that were created by other companies.

It's easy, right?? "Just" parse the dt, tokenise the extensions & delete
whatever is in the blacklist, right?

Hyperbole aside, I think that doing something like this increases the
need for a system like Evan's, as userspace may need a way to
differentiate between what the hardware is capable of (as reported by
the isa string in /proc/cpuinfo or the content of 3/4) and what the
kernel actually supports.

> However, this is a very related, but still independent discussion.

Aye, this discussion and the first two patches are relevant whether 3/4
is accepted or not IMO.

Cheers,
Conor.

>
> Signed-off-by: Heiko Stuebner <heiko.stuebner@xxxxxxxx>
> ---
> arch/riscv/errata/thead/errata.c | 43 ++++++++++++++++++++++++++++
> arch/riscv/include/asm/alternative.h | 4 +++
> arch/riscv/kernel/alternative.c | 21 ++++++++++++++
> arch/riscv/kernel/cpu.c | 12 ++++++++
> 4 files changed, 80 insertions(+)
>
> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> index 1036b8f933ec..eb635bf80737 100644
> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c
> @@ -15,6 +15,7 @@
> #include <asm/errata_list.h>
> #include <asm/hwprobe.h>
> #include <asm/patch.h>
> +#include <asm/switch_to.h>
> #include <asm/vendorid_list.h>
>
> static bool errata_probe_pbmt(unsigned int stage,
> @@ -125,3 +126,45 @@ void __init_or_module thead_feature_probe_func(unsigned int cpu,
> if ((archid == 0) && (impid == 0))
> per_cpu(misaligned_access_speed, cpu) = RISCV_HWPROBE_MISALIGNED_FAST;
> }
> +
> +
> +char *thead_extension_list_func(unsigned long archid,
> + unsigned long impid)
> +{
> + if ((archid == 0) && (impid == 0)) {
> + const char *xbase1 = "xtheadba_xtheadbb_xtheadbs_xtheadcmo_xtheadcondmov";
> + const char *xbase2 = "_xtheadint_xtheadmac_xtheadmemidx_xtheadmempair_xtheadsync";
> + const char *xfpu = "_xtheadfmemIdx";
> +#ifdef CONFIG_32BIT
> + const char *xfpu32 = "_xtheadfmv";
> +#endif
> + int len = strlen(xbase1) + strlen(xbase2);
> + char *str;
> +
> + if (has_fpu()) {
> + len += strlen(xfpu);
> +#ifdef CONFIG_32BIT
> + len+= strlen(xfpu32);
> +#endif
> + }
> +
> + str = kzalloc(len, GFP_KERNEL);
> + if (!str)
> + return str;
> +
> + strcpy(str, xbase1);
> +
> + if (has_fpu()) {
> + strcat(str, xfpu);
> +#ifdef CONFIG_32BIT
> + strcat(str, xfpu32);
> +#endif
> + }
> +
> + strcat(str, xbase2);
> +
> + return str;
> + }
> +
> + return NULL;
> +}
> diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> index a8f5cf6694a1..8c9aec196649 100644
> --- a/arch/riscv/include/asm/alternative.h
> +++ b/arch/riscv/include/asm/alternative.h
> @@ -31,6 +31,7 @@
> #define ALT_ALT_PTR(a) __ALT_PTR(a, alt_offset)
>
> void __init probe_vendor_features(unsigned int cpu);
> +char *list_vendor_extensions(void);
> void __init apply_boot_alternatives(void);
> void __init apply_early_boot_alternatives(void);
> void apply_module_alternatives(void *start, size_t length);
> @@ -55,6 +56,8 @@ void thead_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
>
> void thead_feature_probe_func(unsigned int cpu, unsigned long archid,
> unsigned long impid);
> +char *thead_extension_list_func(unsigned long archid,
> + unsigned long impid);
>
> void riscv_cpufeature_patch_func(struct alt_entry *begin, struct alt_entry *end,
> unsigned int stage);
> @@ -62,6 +65,7 @@ void riscv_cpufeature_patch_func(struct alt_entry *begin, struct alt_entry *end,
> #else /* CONFIG_RISCV_ALTERNATIVE */
>
> static inline void probe_vendor_features(unsigned int cpu) { }
> +static inline char *list_vendor_extensions(void) { return NULL; }
> static inline void apply_boot_alternatives(void) { }
> static inline void apply_early_boot_alternatives(void) { }
> static inline void apply_module_alternatives(void *start, size_t length) { }
> diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> index fc65c9293ac5..18913fd1809f 100644
> --- a/arch/riscv/kernel/alternative.c
> +++ b/arch/riscv/kernel/alternative.c
> @@ -29,6 +29,8 @@ struct cpu_manufacturer_info_t {
> unsigned int stage);
> void (*feature_probe_func)(unsigned int cpu, unsigned long archid,
> unsigned long impid);
> + char *(*extension_list_func)(unsigned long archid,
> + unsigned long impid);
> };
>
> static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_info_t *cpu_mfr_info)
> @@ -54,6 +56,7 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
> case THEAD_VENDOR_ID:
> cpu_mfr_info->patch_func = thead_errata_patch_func;
> cpu_mfr_info->feature_probe_func = thead_feature_probe_func;
> + cpu_mfr_info->extension_list_func = thead_extension_list_func;
> break;
> #endif
> default:
> @@ -157,6 +160,24 @@ void __init_or_module probe_vendor_features(unsigned int cpu)
> cpu_mfr_info.imp_id);
> }
>
> +/*
> + * Lists the vendor-specific extensions common to all cores.
> + * Returns a new underscore "_" concatenated string that the
> + * caller is supposed to free after use.
> + */
> +char *list_vendor_extensions(void)
> +{
> + struct cpu_manufacturer_info_t cpu_mfr_info;
> +
> + riscv_fill_cpu_mfr_info(&cpu_mfr_info);
> + if (!cpu_mfr_info.extension_list_func)
> + return NULL;
> +
> + return cpu_mfr_info.extension_list_func(cpu_mfr_info.arch_id,
> + cpu_mfr_info.imp_id);
> +
> +}
> +
> /*
> * This is called very early in the boot process (directly after we run
> * a feature detect on the boot CPU). No need to worry about other CPUs
> diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
> index 71770563199f..6a0a45b2eb20 100644
> --- a/arch/riscv/kernel/cpu.c
> +++ b/arch/riscv/kernel/cpu.c
> @@ -7,6 +7,7 @@
> #include <linux/init.h>
> #include <linux/seq_file.h>
> #include <linux/of.h>
> +#include <asm/alternative.h>
> #include <asm/cpufeature.h>
> #include <asm/csr.h>
> #include <asm/hwcap.h>
> @@ -260,6 +261,7 @@ static char *riscv_create_isa_string(void)
> {
> int maxlen = 4;
> char *isa_str;
> + char *vendor_isa;
> int i;
>
> /* calculate the needed string length */
> @@ -268,6 +270,10 @@ static char *riscv_create_isa_string(void)
> maxlen++;
> maxlen += strlen_isa_ext();
>
> + vendor_isa = list_vendor_extensions();
> + if (vendor_isa)
> + maxlen += strlen(vendor_isa) + 1;
> +
> isa_str = kzalloc(maxlen, GFP_KERNEL);
> if (!isa_str)
> return ERR_PTR(-ENOMEM);
> @@ -287,6 +293,12 @@ static char *riscv_create_isa_string(void)
>
> strcat_isa_ext(isa_str);
>
> + if(vendor_isa) {
> + strcat(isa_str, "_");
> + strcat(isa_str, vendor_isa);
> + kfree(vendor_isa);
> + }
> +
> return isa_str;
> }
>
> --
> 2.39.0
>

Attachment: signature.asc
Description: PGP signature