Re: [PATCH v2] RISC-V: Add mvendorid, marchid, and mimpid to /proc/cpuinfo output

From: Palmer Dabbelt
Date: Thu Oct 13 2022 - 17:25:29 EST


On Mon, 03 Oct 2022 21:35:25 PDT (-0700), Palmer Dabbelt wrote:
On Mon, 03 Oct 2022 20:44:54 PDT (-0700), anup@xxxxxxxxxxxxxx wrote:
On Tue, Oct 4, 2022 at 8:24 AM Palmer Dabbelt <palmer@xxxxxxxxxxx> wrote:

On Tue, 26 Jul 2022 21:38:29 PDT (-0700), apatel@xxxxxxxxxxxxxxxx wrote:
> Identifying the underlying RISC-V implementation can be important
> for some of the user space applications. For example, the perf tool
> uses arch specific CPU implementation id (i.e. CPUID) to select a
> JSON file describing custom perf events on a CPU.
>
> Currently, there is no way to identify RISC-V implementation so we
> add mvendorid, marchid, and mimpid to /proc/cpuinfo output.
>
> Signed-off-by: Anup Patel <apatel@xxxxxxxxxxxxxxxx>
> Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@xxxxxxxxxxxxx>
> Tested-by: Nikita Shubin <n.shubin@xxxxxxxxx>
> ---
> Changes since v1:
> - Use IS_ENABLED() to check CONFIG defines
> - Added RB and TB tags in commit description
> ---
> arch/riscv/kernel/cpu.c | 51 +++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 51 insertions(+)

Sorry for being slow on this one. I'd been worried about sticking more
stuff into /proc/cpuinfo, as having the only user interface for these
involve parsing /proc/cpuinfo is really just a recipe for disaster. I
didn't get around do doing something better, though, and waiting for
another release seems kind of silly.

In addition to being useful for perf tool, printing vendor and
implementation details in /proc/cpuinfo is a very useful info.

Users can easily know the underlying CPU implementer using
"cat /proc/cpuinfo".


I was going to go put this on for-next, but it looks like it's causing
kasan boot failures. I'm not actually quite sure how this is triggering
these, at least based on the backtrace, but there's a bunch of them and
boot hangs.

Here's the first of them:

[ 3.830416] Hardware name: riscv-virtio,qemu (DT)
[ 3.830828] epc : kasan_check_range+0x116/0x14c
[ 3.832377] ra : memset+0x1e/0x4c
[ 3.832699] epc : ffffffff8026468e ra : ffffffff80264d6e sp : ff600000832afa60
[ 3.833051] gp : ffffffff81d800a0 tp : ff600000831d2400 t0 : ff60000083224360
[ 3.833397] t1 : ffebfffefffef000 t2 : 0000000000000000 s0 : ff600000832afa90
[ 3.833811] s1 : 0000000000000004 a0 : 0000000000000010 a1 : 0000000000000004
[ 3.834210] a2 : 0000000000000001 a3 : ffffffff801fa22c a4 : ff5ffffffff78000
[ 3.834558] a5 : ffebfffefffef000 a6 : ffebfffefffef001 a7 : ff5ffffffff78003
[ 3.834931] s2 : ff5ffffffff78000 s3 : 0000000000000000 s4 : ffffffff815b59b0
[ 3.835292] s5 : ffffffff81d815e0 s6 : 0000000000000000 s7 : 0000000000000008
[ 3.836400] s8 : ffffffff81d8b0e0 s9 : 0000000000000000 s10: 0000000000000008
[ 3.836817] s11: ff5ffffffff78000 t3 : 0000000000000000 t4 : ffebfffefffef000
[ 3.837154] t5 : ffebfffefffef001 t6 : 0000000000000002
[ 3.837471] status: 0000000000000120 badaddr: ffebfffefffef000 cause: 000000000000000d
[ 3.837933] [<ffffffff801fa22c>] pcpu_alloc+0x494/0x838
[ 3.838324] [<ffffffff801fa5fc>] __alloc_percpu+0x14/0x1c
[ 3.838623] [<ffffffff800930fa>] __percpu_init_rwsem+0x1a/0x98
[ 3.838995] [<ffffffff80286c5a>] alloc_super+0x148/0x3da
[ 3.839631] [<ffffffff80287af2>] sget_fc+0x90/0x2c4
[ 3.840240] [<ffffffff80288498>] get_tree_nodev+0x24/0xa4
[ 3.840598] [<ffffffff801e919a>] shmem_get_tree+0x14/0x1c
[ 3.840891] [<ffffffff80286436>] vfs_get_tree+0x3a/0x11a
[ 3.841172] [<ffffffff802ba746>] path_mount+0x2ea/0xb44
[ 3.841468] [<ffffffff802bb69a>] sys_mount+0x1b2/0x270
[ 3.841790] [<ffffffff80003d1c>] ret_from_syscall+0x0/0x2
[ 3.842480] ---[ end trace 0000000000000000 ]---

Sure, I will try to fix this ASAP.

Are you willing to take this if the above issue is fixed ?

I still think the other interface is better, but it's taking too long
and some stuff is blocked on getting these to userspace so I'm fine with
it.

Replying to my own email here as I'm a bit mixed up right now from trying to convert over and I can't find Anup's response saying he can't reproduce the issues:

It looks like this was just bad luck, for some reason it happens these random failures started triggering for me right when I was merging the patch (I even went back and they went away). So I think these were just spurious failures. https://lore.kernel.org/all/20221009083050.3814850-1-panqinglin2020@xxxxxxxxxxx/ is out as a proposed fix, I've got it in my testing repo as it's making the crashes stop -- it smells a bit fishy, though, so I want to get some more time to look before merging it.

This one, though, is on for-next.

Thanks!



Regards,
Anup


>
> diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
> index fba9e9f46a8c..04bcc91c91ea 100644
> --- a/arch/riscv/kernel/cpu.c
> +++ b/arch/riscv/kernel/cpu.c
> @@ -3,10 +3,13 @@
> * Copyright (C) 2012 Regents of the University of California
> */
>
> +#include <linux/cpu.h>
> #include <linux/init.h>
> #include <linux/seq_file.h>
> #include <linux/of.h>
> +#include <asm/csr.h>
> #include <asm/hwcap.h>
> +#include <asm/sbi.h>
> #include <asm/smp.h>
> #include <asm/pgtable.h>
>
> @@ -64,6 +67,50 @@ int riscv_of_parent_hartid(struct device_node *node)
> }
>
> #ifdef CONFIG_PROC_FS
> +
> +struct riscv_cpuinfo {
> + unsigned long mvendorid;
> + unsigned long marchid;
> + unsigned long mimpid;
> +};
> +static DEFINE_PER_CPU(struct riscv_cpuinfo, riscv_cpuinfo);
> +
> +static int riscv_cpuinfo_starting(unsigned int cpu)
> +{
> + struct riscv_cpuinfo *ci = this_cpu_ptr(&riscv_cpuinfo);
> +
> +#if IS_ENABLED(CONFIG_RISCV_SBI)
> + ci->mvendorid = sbi_spec_is_0_1() ? 0 : sbi_get_mvendorid();
> + ci->marchid = sbi_spec_is_0_1() ? 0 : sbi_get_marchid();
> + ci->mimpid = sbi_spec_is_0_1() ? 0 : sbi_get_mimpid();
> +#elif IS_ENABLED(CONFIG_RISCV_M_MODE)
> + ci->mvendorid = csr_read(CSR_MVENDORID);
> + ci->marchid = csr_read(CSR_MARCHID);
> + ci->mimpid = csr_read(CSR_MIMPID);
> +#else
> + ci->mvendorid = 0;
> + ci->marchid = 0;
> + ci->mimpid = 0;
> +#endif
> +
> + return 0;
> +}
> +
> +static int __init riscv_cpuinfo_init(void)
> +{
> + int ret;
> +
> + ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "riscv/cpuinfo:starting",
> + riscv_cpuinfo_starting, NULL);
> + if (ret < 0) {
> + pr_err("cpuinfo: failed to register hotplug callbacks.\n");
> + return ret;
> + }
> +
> + return 0;
> +}
> +device_initcall(riscv_cpuinfo_init);
> +
> #define __RISCV_ISA_EXT_DATA(UPROP, EXTID) \
> { \
> .uprop = #UPROP, \
> @@ -178,6 +225,7 @@ static int c_show(struct seq_file *m, void *v)
> {
> unsigned long cpu_id = (unsigned long)v - 1;
> struct device_node *node = of_get_cpu_node(cpu_id, NULL);
> + struct riscv_cpuinfo *ci = per_cpu_ptr(&riscv_cpuinfo, cpu_id);
> const char *compat, *isa;
>
> seq_printf(m, "processor\t: %lu\n", cpu_id);
> @@ -188,6 +236,9 @@ static int c_show(struct seq_file *m, void *v)
> if (!of_property_read_string(node, "compatible", &compat)
> && strcmp(compat, "riscv"))
> seq_printf(m, "uarch\t\t: %s\n", compat);
> + seq_printf(m, "mvendorid\t: 0x%lx\n", ci->mvendorid);
> + seq_printf(m, "marchid\t\t: 0x%lx\n", ci->marchid);
> + seq_printf(m, "mimpid\t\t: 0x%lx\n", ci->mimpid);
> seq_puts(m, "\n");
> of_node_put(node);