Re: [WIP 0/3] Memory model and atomic API in Rust

From: Alan Stern
Date: Sun Mar 24 2024 - 11:22:53 EST

Next message: Andrew Lunn: "Re: [EXTERNAL] Re: [PATCH v2 0/5] Fix prestera driver fail to probe twice"
Previous message: Regzbot (on behalf of Thorsten Leemhuis): "Linux regressions report for mainline [2024-03-24]"
Next in thread: comex: "Re: [WIP 0/3] Memory model and atomic API in Rust"

On Sat, Mar 23, 2024 at 05:40:23PM -0400, comex wrote:
> That may be true, but the LLVM issue you cited isn’t a good example.
> In that issue, the function being miscompiled doesn’t actually use any
> barriers or atomics itself; only the scaffolding around it does. The
> same issue would happen even if the scaffolding used LKMM atomics.
>
> For anyone curious: The problematic optimization involves an
> allocation (‘p’) that is initially private to the function, but is
> returned at the end of the function. LLVM moves a non-atomic store to
> that allocation across an external function call (to ‘foo’). This
> reordering would be blatantly invalid if any other code could observe
> the contents of the allocation, but is valid if the allocation is
> private to the function. LLVM assumes the latter: after all, the
> pointer to it hasn’t escaped. Yet. Except that in a weak memory
> model, the escape can ‘time travel’...

It's hard to understand exactly what you mean, but consider the
following example:

int *globalptr;
int x;

int *f() {
	int *p = kzalloc(sizeof(int), GFP_KERNEL);

L1:	*p = 1;
L2:	foo();
	return p;
}

void foo() {
	smp_store_release(&x, 2);
}

void thread0() {
	WRITE_ONCE(globalptr, f());
}

void thread1() {
	int m, n;
	int *q;

	m = smp_load_acquire(&x);
	q = READ_ONCE(globalptr);
	if (m && q)
		n = *q;
}

(If you like, pretend each of these function definitions lives in a
different source file -- it doesn't matter.)

With no optimization, whenever thread1() reads *q it will always obtain
1, thanks to the store-release in foo() and the load-acquire in
thread1(). But if the compiler swaps L1 and L2 in f() then this is not
guaranteed. On a weakly ordered architecture, thread1() could then get
0 from *q.
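
For anyone who wants to experiment outside the kernel, here is a rough
userspace sketch of the same example. It is only an approximation:
C11 atomics stand in for smp_store_release(), smp_load_acquire(),
WRITE_ONCE() and READ_ONCE(), calloc() stands in for kzalloc(), and
count_violations() is a hypothetical helper that is not part of the
example above. C11 atomics are not the LKMM, so treat the mapping as an
assumption. With a correct compiler the helper should report no
violations; a passing run cannot prove that the compiler is correct.

```c
/* Userspace approximation only: C11 atomics stand in for the LKMM
 * primitives. This is an illustrative sketch, not kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

static _Atomic(int *) globalptr;
static atomic_int x;

static void foo(void)
{
	/* stands in for smp_store_release(&x, 2) */
	atomic_store_explicit(&x, 2, memory_order_release);
}

static int *f(void)
{
	int *p = calloc(1, sizeof(int));	/* stands in for kzalloc() */

	assert(p != NULL);
	*p = 1;		/* L1: plain store to the still-private allocation */
	foo();		/* L2: contains the release store; must follow L1 */
	return p;
}

static void *thread0(void *unused)
{
	(void)unused;
	/* stands in for WRITE_ONCE(globalptr, f()) */
	atomic_store_explicit(&globalptr, f(), memory_order_relaxed);
	return NULL;
}

static void *thread1(void *result)
{
	/* stand-ins for smp_load_acquire(&x) and READ_ONCE(globalptr) */
	int m = atomic_load_explicit(&x, memory_order_acquire);
	int *q = atomic_load_explicit(&globalptr, memory_order_relaxed);

	/* -1 means "*q was not read in this run" */
	*(int *)result = (m && q) ? *q : -1;
	return NULL;
}

/* Hypothetical helper: run both threads `trials` times and count the
 * runs in which thread1() read *q and saw something other than 1. */
int count_violations(int trials)
{
	int violations = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t t0, t1;
		int n = -1;

		/* reset the shared state before each run */
		atomic_store(&x, 0);
		atomic_store(&globalptr, (int *)NULL);

		if (pthread_create(&t0, NULL, thread0, NULL) != 0)
			abort();
		if (pthread_create(&t1, NULL, thread1, &n) != 0)
			abort();
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		if (n != -1 && n != 1)
			violations++;
		free(atomic_load(&globalptr));
	}
	return violations;
}
```

Calling count_violations(1000) from a small main() and building with
something like "cc -O2 -pthread" is enough to try it. Runs in which
thread1() sees m == 0 or q == NULL simply skip the read, as in the
original example.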

I don't know if this is what you meant by "in a weak memory model, the
escape can ‘time travel’". Regardless, it seems very clear that any
compiler which swaps L1 and L2 in f() has a genuine bug.

Alan Stern
