Re: [v3 PATCH 8/8] RISC-V: Assign hwcap only according to boot cpu.

From: Marc Zyngier
Date: Sat Feb 09 2019 - 11:11:58 EST


On Sat, 09 Feb 2019 04:26:07 +0000,
David Abdurachmanov <david.abdurachmanov@xxxxxxxxx> wrote:
>
> On Sat, Feb 9, 2019 at 12:03 AM Atish Patra <atish.patra@xxxxxxx> wrote:
> >
> > On 2/8/19 1:11 AM, Christoph Hellwig wrote:
> > >> + * We don't support running Linux on heterogeneous ISA systems.
> > >> + * But first "okay" processor might not be the boot cpu.
> > >> + * Check the ISA of boot cpu.
> > >
> > > Please use up your available 80 characters per line in comments.
> > >
> > I will fix it.
> >
> > >> + /*
> > >> + * All "okay" hart should have same isa. We don't know how to
> > >> + * handle if they don't. Throw a warning for now.
> > >> + */
> > >> + if (elf_hwcap && temp_hwcap != elf_hwcap)
> > >> + pr_warn("isa mismatch: 0x%lx != 0x%lx\n",
> > >> + elf_hwcap, temp_hwcap);
> > >> +
> > >> + if (hartid == boot_cpu_hartid)
> > >> + boot_hwcap = temp_hwcap;
> > >> + elf_hwcap = temp_hwcap;
> > >
> > > So we always set elf_hwcap to the capabilities of the previous cpu.
> > >
> > >> + temp_hwcap = 0;
> > >
> > > I think temp_hwcap should be declared and initialized inside the outer loop
> > > instead of having to manually reset it like this.
> > >
> > >> + }
> > >>
> > >> + elf_hwcap = boot_hwcap;
> > >
> > > And then reset it here to the boot cpu.
> > >
> > > Shouldn't we only report the features supported by all cores? Otherwise
> > > we'll still have problems if the boot cpu supports a feature, but not
> > > others.
> > >
> >
> > Hmm. The other side of the argument is boot cpu does have a feature that
> > is not supported by other hart that didn't even boot.
> > The user space may execute something based on boot cpu capability but
> > that won't be enabled.
> >
> > At least, in this way we know that we are compatible completely with
> > boot cpu capabilities. Thoughts ?
>
> There is one example on the market, e.g., Samsung Exynos 9810.
>
> Mongoose 3 (big cores) only support ARMv8.0, while Cortex-A55
> (little ones) support ARMv8.2 (and that brings atomics support).
> I think, it's the only ARM SOC that supports different ISA extensions
> between cores on the same package.
>
> Kernel scheduler doesn't know that big cores are missing atomics
> support or that applications need it, and moves the thread,
> resulting in an illegal instruction.

Not quite. The scheduler doesn't have to know (thankfully).

The problem is that the Samsung folks tampered with the detection
logic in the kernel, and ended up advertising the LSE atomics to
userspace (despite only being available on half the cores).

If you run a mainline kernel on this thing, it will just work, as the
LSE atomics are not advertised to userspace at all.

>
> E.g., see Golang issue: https://github.com/golang/go/issues/28431
>
> I also recall Jon Masters (Computer Architect at Red Hat) advocating
> against having cores with mismatched capabilities on the server
> market.

Well, nobody recommends that, server or not. That being said, it is
possible to handle it, and the arm64 kernel has been dealing with such
things from day 1. We can have CPUs with different PMUs, implemented
page sizes, VA and PA spaces... What it takes is some work in the
kernel to sanitize it, and be careful in what you expose to userspace.
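
To make the "sanitize it" part a bit more concrete for the hwcap case,
here is a minimal sketch of the idea: only advertise the bits that every
CPU has, and separately detect that a mismatch exists so it can be
reported. This is not the arm64 cpufeature code, nor the patch under
review; the function names and the flat array of per-CPU capability
words are made up for the illustration.

```c
#include <stddef.h>

/*
 * Illustration only: hypothetical names, not taken from any kernel tree.
 *
 * Return the capability bits common to every CPU in caps[0..n-1], i.e.
 * the only bits that are safe to advertise to userspace when a task may
 * be migrated to any of those CPUs.
 */
static unsigned long sanitized_hwcap(const unsigned long *caps, size_t n)
{
	unsigned long common = ~0UL;	/* identity element for AND */
	size_t i;

	if (n == 0)
		return 0;	/* no CPU described: advertise nothing */

	for (i = 0; i < n; i++)
		common &= caps[i];

	return common;
}

/* Return non-zero if at least one CPU differs from the first one. */
static int hwcap_mismatch(const unsigned long *caps, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (caps[i] != caps[0])
			return 1;

	return 0;
}
```

With something along these lines, a feature present on only half the
cores simply never shows up in what userspace sees, and the mismatch
check gives you a place to hang a warning.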

The thing to realise is that people will build stupid systems, no
matter how loud you shout. You can either pretend they don't exist, or
try to deal with them.

Thanks,

M.

--
Jazz is not dead, it just smell funny.