Re: [PATCH v4 11/11] perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}

From: Leo Yan
Date: Tue Jul 13 2021 - 11:46:12 EST


Hi Russell,

On Mon, Jul 12, 2021 at 03:44:11PM +0100, Russell King (Oracle) wrote:
> On Sun, Jul 11, 2021 at 06:41:05PM +0800, Leo Yan wrote:
> > When perf runs in compat mode (kernel in 64-bit mode and the perf is in
> > 32-bit mode), the 64-bit value atomicity in the user space cannot be
> > assured, E.g. on some architectures, the 64-bit value accessing is split
> > into two instructions, one is for the low 32-bit word accessing and
> > another is for the high 32-bit word.
>
> Does this apply to 32-bit ARM code on aarch64? I would not have thought
> it would, as the structure member is a __u64 and
> compat_auxtrace_mmap__read_head() doesn't seem to be marking anything
> as packed, so the compiler _should_ be able to use a LDRD instruction
> to load the value.

I think your question essentially comes down to the memory model.
For a 32-bit Arm application running on aarch64, the Armv8 Architecture
Reference Manual (ARM DDI 0487F.c), section "E2.2.1 Requirements for
single-copy atomicity", states:

"LDM, LDC, LDRD, STM, STC, STRD, PUSH, POP, RFE, SRS, VLDM, VLDR, VSTM,
and VSTR instructions are executed as a sequence of word-aligned word
accesses. Each 32-bit word access is guaranteed to be single-copy
atomic. The architecture does not require subsequences of two or more
word accesses from the sequence to be single-copy atomic."

So I think the LDRD/STRD instructions cannot guarantee atomicity when
loading or storing two words in 32-bit Arm: each word access is atomic
on its own, but the pair of accesses is not.

Another consideration is that compat_auxtrace_mmap__read_head() is a
generic function, so I avoided writing it with any architecture-specific
instructions.
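
To illustrate what "generic" means here, below is a minimal standalone
sketch of one portable way to read a 64-bit head without relying on
LDRD being atomic. It is only an illustration under assumptions, not
the code from the patch: struct demo_page and demo_read_head() are
hypothetical names, the C11 fence stands in for whatever read barrier
the real code uses, and it assumes the writer only ever increases the
head. The idea is to read the value three times and retry if the high
word changed between the first and the last read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared page that holds the head. */
struct demo_page {
	volatile uint64_t aux_head;
};

/*
 * Read the 64-bit head using plain loads that may each be split into
 * two 32-bit accesses. Assumption: the writer only increases the head.
 */
static uint64_t demo_read_head(const struct demo_page *pg)
{
	const uint64_t hi_mask = (uint64_t)UINT32_MAX << 32;
	uint64_t first, second, last;

	assert(pg != NULL);

	do {
		first = pg->aux_head;
		/* Keep the three reads ordered with respect to each other. */
		atomic_thread_fence(memory_order_acquire);
		second = pg->aux_head;
		atomic_thread_fence(memory_order_acquire);
		last = pg->aux_head;
		/* Retry if the high word moved while we were reading. */
	} while ((first & hi_mask) != (last & hi_mask));

	return second;
}
```

The loop trades an extra two loads for not needing any
architecture-specific instruction; whether the fences are strong enough
on a given architecture would still need to be checked there.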

> Is this a problem noticed on non-ARM architectures?

No. We only inferred the potential issue from an analysis of the weak
memory model.

Thanks,
Leo