Re: [RFC PATCH v5 5/5] riscv/cmpxchg: Implement xchg for variables of size 1 and 2

From: Guo Ren
Date: Thu Aug 10 2023 - 21:40:48 EST


On Fri, Aug 11, 2023 at 12:23 AM Palmer Dabbelt <palmer@xxxxxxxxxxxx> wrote:
>
> On Thu, 10 Aug 2023 09:04:04 PDT (-0700), leobras@xxxxxxxxxx wrote:
> > On Thu, 2023-08-10 at 08:51 +0200, Arnd Bergmann wrote:
> >> On Thu, Aug 10, 2023, at 06:03, Leonardo Bras wrote:
> >> > xchg for variables of size 1 byte and 2 bytes is not yet available for
> >> > riscv, even though it's present in other architectures such as arm64 and
> >> > x86. This could mean that some locking mechanisms cannot be implemented,
> >> > or that they would require rework to function properly.
> >> >
> >> > Implement 1-byte and 2-byte xchg in order to achieve parity with other
> >> > architectures.
> >> >
> >> > Signed-off-by: Leonardo Bras <leobras@xxxxxxxxxx>
> >>
> >
> > Hello Arnd Bergmann, thanks for reviewing!
> >
> >> Parity with other architectures by itself is not a reason to do this,
> >> in particular the other architectures you listed have the instructions
> >> in hardware while riscv does not.
> >
> > Sure, I understand RISC-V doesn't have native support for xchg on variables
> > of size < 4B. My argument is that it's nice to have even an emulated version
> > in case any future mechanism wants to use it.
> >
> > Not having it may mean we won't be able to enable a given mechanism on RISC-V.
>
> IIUC the ask is to have an in-kernel user for these functions.  That's
> the general thing to do, and last time this came up there was no
> in-kernel use of it -- the qspinlock stuff would use them, but we
> haven't enabled it yet because we're worried about the
> performance/fairness issues that other ports have seen, and nobody has
> concrete benchmarks yet (though there's another patch set out that I
> haven't had time to look through, so that may have changed).
Conor doesn't agree with using the alternatives mechanism as the switch
between qspinlock and ticket lock, so I'm preparing V11 in a static_key
(jump_label) style instead. In the next version I will split
paravirt_qspinlock & CNA_qspinlock out of V10, which should make the
qspinlock patch series easier to review. You can review the next
version, V11. Right now I'm debugging a static_key initialization
problem during module loading, which is triggered by our
combo_qspinlock.
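
Roughly, the combo idea looks like the sketch below. This is only an
illustration of the static_key approach, not the actual V11 code: the
key name, the init hook, the combo_spin_lock() wrapper and the
CPU-count policy are all hypothetical, and it assumes both the generic
qspinlock and a ticket-lock slow path are built in.

/*
 * Illustrative sketch only -- not the actual V11 patch. Names like
 * combo_qspinlock_key, riscv_spinlock_init() and combo_spin_lock()
 * are hypothetical; ticket_spin_lock() stands in for the generic
 * ticket-lock slow path.
 */
#include <linux/jump_label.h>
#include <linux/cpumask.h>
#include <linux/init.h>

DEFINE_STATIC_KEY_TRUE(combo_qspinlock_key);

void __init riscv_spinlock_init(void)
{
	/* Made-up policy: fall back to the ticket lock on small machines. */
	if (num_possible_cpus() < 16)
		static_branch_disable(&combo_qspinlock_key);
}

static __always_inline void combo_spin_lock(arch_spinlock_t *lock)
{
	/* The key is patched to a plain branch, so the fast path stays cheap. */
	if (static_branch_likely(&combo_qspinlock_key))
		queued_spin_lock(lock);
	else
		ticket_spin_lock(lock);
}

The point of using a jump_label rather than alternatives is that the
selection is a single patched branch decided once at boot, before the
lock is used from module code.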

The qspinlock is being tested on a riscv platform [1] with 128 cores
across 8 NUMA nodes; next, I will post updated comparison results for
qspinlock vs. ticket lock.

[1]: https://www.sophon.ai/

>
> So if something uses these I'm happy to go look closer.
>
> >> Emulating the small xchg() through cmpxchg() is particularly tricky
> >> since it's easy to run into a case where this does not guarantee
> >> forward progress.
> >>
> >
> > I didn't get this part:
> > By "emulating small xchg() through cmpxchg()", did you mean emulating an
> > xchg (usually a single instruction) with lr & sc (the same primitives used
> > in cmpxchg)?
> >
> > If so, yeah, it's a fair point: in some extreme case we could have multiple
> > threads accessing a given cacheline and have the sc always failing. On the
> > other hand, there are two arguments on that:
> >
> > 1 - Other architectures (such as powerpc, arm, and arm64 without LSE atomics)
> > also seem to rely on this mechanism for every xchg size. Other archs, like
> > csky and loongarch, use asm similar to mine to handle xchg for sizes < 4B.
> >
> >
> >> This is also something that almost no architecture-specific
> >> code relies on (generic qspinlock being a notable exception).
> >>
> >
> > 2 - As you mentioned, there should be very little code that actually makes
> > use of xchg for vars < 4B, so it should be safe to assume it's fine not to
> > guarantee forward progress for those rare usages (as is already the case on
> > some of the above-mentioned archs).
> >
> >> I would recommend just dropping this patch from the series, at least
> >> until there is a need for it.
> >
> > While I agree this is a valid point, I believe it's more useful to have it
> > implemented in case any future mechanism wants to make use of it.
> >
> >
> > Thanks!
> > Leo
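
For reference, emulating a 1-byte xchg on top of a word-sized atomic
generally looks something like the sketch below. This is generic
illustration code, not Leonardo's patch: cas32() is a hypothetical
stand-in for whatever 32-bit atomic the architecture provides (on
riscv, an lr.w/sc.w loop), and a little-endian layout is assumed.

#include <stdint.h>

/* Hypothetical word-sized compare-and-swap: returns the old value. */
extern uint32_t cas32(volatile uint32_t *addr, uint32_t old, uint32_t newval);

static uint8_t xchg8_emulated(volatile uint8_t *ptr, uint8_t newval)
{
	/* Locate the aligned 32-bit word that contains *ptr. */
	volatile uint32_t *word = (volatile uint32_t *)((uintptr_t)ptr & ~3UL);
	unsigned int shift = ((uintptr_t)ptr & 3UL) * 8;   /* little-endian */
	uint32_t mask = 0xffU << shift;
	uint32_t old, tmp;

	do {
		old = *word;
		/* Replace only our byte; preserve the neighbouring bytes. */
		tmp = (old & ~mask) | ((uint32_t)newval << shift);
		/*
		 * If another CPU modified *any* byte of the word in the
		 * meantime, the CAS fails and we retry -- which is exactly
		 * the forward-progress concern Arnd raises above.
		 */
	} while (cas32(word, old, tmp) != old);

	return (old & mask) >> shift;
}

In the kernel the same idea is usually written directly as a masked
lr/sc loop in inline asm, as csky and loongarch do, but the retry
behaviour under contention is the same either way.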



--
Best Regards
Guo Ren