Re: [PATCH] arm64: Add the arm64.nolse_atomics command line option

From: Aiqun(Maria) Yu
Date: Wed Jul 12 2023 - 04:04:14 EST


On 7/12/2023 3:29 PM, Marc Zyngier wrote:
On Wed, 12 Jul 2023 03:47:55 +0100,
"Aiqun(Maria) Yu" <quic_aiquny@xxxxxxxxxxx> wrote:

On 7/11/2023 6:38 PM, Marc Zyngier wrote:
On Tue, 11 Jul 2023 11:12:48 +0100,
"Aiqun(Maria) Yu" <quic_aiquny@xxxxxxxxxxx> wrote:

For the KVM part, per my understanding, as long as the current feature
ID is overridden, the KVM guest also gets its vcpus without the LSE
atomics feature enabled.
A KVM vcpu reads the sys reg from the host arm64_ftr_regs, which is
already controlled by the idreg_overrides.
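
To make the override described above concrete, here is a minimal
user-space sketch, not the kernel's code. It assumes only that the
Atomic field of ID_AA64ISAR0_EL1 sits in bits [23:20]; the struct and
function names are hypothetical and chosen for illustration. It shows
how a (val, mask) pair can force the advertised field to 0 while
leaving the other fields alone.

```c
#include <stdint.h>

/* ID_AA64ISAR0_EL1.Atomic is bits [23:20]; 0 means "not advertised". */
#define ATOMIC_SHIFT 20
#define ATOMIC_MASK  (UINT64_C(0xf) << ATOMIC_SHIFT)

/*
 * Hypothetical stand-in for an override entry: "mask" selects the bits
 * to force and "val" holds the values forced into those bits.
 */
struct ftr_override_sketch {
	uint64_t val;
	uint64_t mask;
};

/* Replace the masked bits of the HW value with the override value. */
static uint64_t apply_override(uint64_t hw_reg, struct ftr_override_sketch o)
{
	return (hw_reg & ~o.mask) | (o.val & o.mask);
}

/* Extract the 4-bit Atomic field from a register value. */
static unsigned int atomic_field(uint64_t reg)
{
	return (unsigned int)((reg & ATOMIC_MASK) >> ATOMIC_SHIFT);
}
```

With val = 0 and mask = ATOMIC_MASK, a HW value whose Atomic field is 2
reads back as 0 after apply_override(). This only changes what is
advertised; it does not change which instructions the CPU will execute.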

You're completely missing the point.

The guest is free to map memory as non-cacheable *and* to use LSE
atomics even if the idregs pretend this is not available. At which
The guest can also use the current Linux kernel mechanism for LSE
atomics.

[snip useless diagrams]

Yes, the guest can do the right thing. The guest, a totally
unprivileged piece of SW, can also ignore the idregs and take the
whole machine down because your HW is broken.

If the guest ignores the idregs, that is not supported by the current
Linux KVM ID register emulation either. A similar rule applies to other
CPU features as well.

So the machine going down in that case would be expected.

We want to support and make use of the current HW with the existing
inline runtime patching for LSE atomic ops.
Just like other KVM vcpu CPU features, LSE atomics can be a feature
inherited from the physical CPU features for the KVM vcpus.

See above. Your reasoning applies to a well behaved guest, which is
the *wrong* way to reason about these things.

Feature support is not always that *free*, even for existing CPU
features.
Our current target is for software to make the best use of the HW that
it can.

The current HW may run a generic common Image that is shared with other
CPUs which do support LSE atomics. So the Image can have inline runtime
patching for LSE atomic operations, and the software side can have an
option to support this.

For example, newer memory controllers support far LSE atomic operations,
and the atomic operations are not limited to non-cached memory mappings
either.

Also, the performance of LSE atomics versus FWB in specific scenarios
can differ with the current hardware design.

We are trying to make possible improvements along with the HW design
change rather than ruin it.
Feel free to comment if your understanding differs.

M.


--
Thx and BRs,
Aiqun(Maria) Yu