Re: [PATCH v2] tools: memory-model: Make plain accesses carry dependencies

From: Boqun Feng
Date: Tue Dec 06 2022 - 20:44:27 EST


On Mon, Dec 05, 2022 at 01:42:46PM +0000, Jonas Oberhauser wrote:
[...]
> > Besides, could you also explain a little bit why only "data;rfi" can
> > be "carry-dep" but "ctrl;rfi" and "addr;rfi" cannot? I think it's
> > because there are special cases when compilers can figure out a
> > condition being true or an address being constant therefore break
> > the dependency
>
> Oh, good question. A bit hard for me to write down the answer clearly
> (which some people will claim that I don't understand it well myself,
> but I beg to differ :) :( :) ).
>
> In a nutshell, it's because x ->data [Plain] ->rfi y ->... z
> fulfils the same role as storing something in a register and then
> using it in a subsequent computation; x ->ctrl y ->... z (and ->addr)
> don't. So it's not due to smart compilers, just the fact that the
> other two cases seem unrelated to the problem being solved, and
> including them might introduce some unsoundness (not that I have
> checked if they do).
>
> Mathematically I imagine the computation graph between register
> accesses/computations r_1,...,r_n and memory accesses m_1,...m_k that
> has an unlabeled edge m_i -> r_j when m_i loads some data that is an
> input to the access/computation r_j, similarly it has an unlabeled
> edge r_i -> r_j when the result of r_i is used as an input of r_j,
> and finally r_i ->data/ctrl/addr m_j when the value computed by r_j is
> the address/data or affects the ctrl of m_j. So for example, if 'int
> a' were to declare a register, then
> int a = READ_ONCE(&x);
> if (a == 5) { WRITE_ONCE(&y,5); }
> would have something like (*) 4 events (*= since I'm avoiding fully
> formalizing the graph model here I don't really define to which extent
> one splits up register reads, computations, etc., so I'll do it
> somewhat arbitrarily)
> m_1 = READ_ONCE(&x)
> r_1 = store the result of m_1 in a
> r_2 = compare the content of a to 5
> m_2 = WRITE_ONCE(&y, 5)
>
> with the edges m_1 -> r_1 -> r_2 ->ctrl m_2
>
> Then in the LKMM graph, we would have m_i ->ctrl m_j (or data or addr)
> iff there is a path m_i -> r_x1 -> ... -> r_xl ->ctrl m_j (or data or
> addr).
> So in the example above, m_1 ->ctrl m_2.
>
> Now what we would do is look at what happens if the compiler/a
> tool/me/whatever interprets 'int a' as a memory location. Instead of
> r_1 and r_2 from above, we would have something like
>
> m_1 = READ_ONCE(&x)
> r_3 = the result of m_1 (stored in a CPU register)
> m_3 = a := the r_3 result
> m_4 = load from a
> r_4 = the result of m_4 (stored in a CPU register)
> m_2 = WRITE_ONCE(&y, 5)
>
> with m_1 -> r_3 ->data -> m_3 ->rfi m_4 -> r_4 ->ctrl m_2
> and hence
> m_1 ->data m_3 ->rfi m_4 ->ctrl m_2
>
> What the patch does is make sure that even under this interpretation,
> we still have m_1 ->ctrl m_2
> Note that this ->data ->rfi applies in every case where something is
> considered a register is turned into a memory location, because those
> cases always introduce a store with a fixed address (that memory
> location) where the data is the current content of the register, which
> is then read (internally) at the next time that data is picked up. A
> store to register never becomes a ctrl or addr edge, hence they are
> not included in the patch.
>

Let me see if I can describe your approach in a more formal way (look at
me, pretending to know formal methods ;-))

Bascially, what you are saying is that for every valid litmus test with
dependencies (carried by local variables i.e. registers), there exists a
way to rewrite the litmus test by treating all (or any number of) local
variables as memory locations.

Lets say the set of the litmus tests to start with is L and the set of
the rewritten ones is L'is. Here is a rewrite rule I come up with:

1) for every local named <n> on processor <P>, add an extra memory
location (maybe named <n>_<P>) in the processor function, for
example

P0(int *x, int *y) {
int a ...;
}

became
P0(int *x, int *y, int *a_P0) {
}

2) for each <n> in 1), for each assignment of <n>, say:

int <n> = <expr>; // H

or

<n> = <expr>; // H

rewrite it to

int computer_register_<seq> = <expr>; // A
*<n>_<P> = computer_register_<seq>; // B

where <seq> is a self increasing counter that increases every
step or the rewrite.

This step introduces an extra edge A ->data B.

3) for each <n> in 1), for each the read of <n>, say:

Expr; // T

rewrite it to

int computer_register_<seq> = *<n>_<P>; // C
Expr[<n>/computer_register_<seq>]; // D

where <seq> is a self increasing counter that increases every
step or the rewrite.

"M[x/y]" means changing the expression by replace all appearance
of variable x into y, e.g. (n1 = n + 1)[n/y] is n1 = y + 1.

Note that for every rewrite 3), there must exists a
corresponding rewrite 2), that C reads the value stored by the
B of the rewrite 2). This is because the litmus tests in set L
are valid, all variables must be assigned a value before used.

This step introduces an extra edge B ->rfi C, and also for every
dependency between H -> T in the old litmus test, it preserves
between C -> D in the new litmus test.

4) for each <n> and <P> in 1), for each expression in the exist
clause, say:

Expr

rewrite it to

Expr[<P>:<n>/<n>_<P>]

For example, by these rules, the following litmus test:

P0(int *x, int *y) {
int r = READ_ONCE(*x);
WRITE_ONCE(*y, r);
}
exists(0:r = 1)

is rewritten into

P0(int *x, int *y, int *r_P0) {
int computer_register_0 = READ_ONCE(*x); // by 2)
*r_P0 = computer_register_0; // by 2)
int computer_register_1 = *r_P0; // by 3)
WRITE_ONCE(*y, computer_register_1); // by 3)
}
exists(r_P0 = 1) // by 4)


It's obvious that the rewritten litmus tests are valid, and for every
dependency

H ->dep T in litmus tests of set L

there must exists a

A ->data ->rfi ->dep D in L'

("dep" is not the same as in linux-kernel.cat).

And note since L' is a set of valid litmus tests, we can do another
whole rewrite to L' and get L'' which will contains
->data->rfi->data->rfi->dep, and this covers the (data;rfi)* ;-)


Now the hard part, since the rewrite is only one-direction, i.e L => L',
or more preciselly the transverse closure of the rewrite is
one-direction. We can only prove that if there exists a valid litmus
test with dependency ->dep, we can construct a number of litmus tests
with (->data ->rfi)*->dep.

But to accept the new rules in your patch, we need the opposite
direction, that is, if there exists a (->data ->rfi)* ->dep, we can
somehow find a litmus test that preserves anything else but reduce

(->data->rfi)*->dep
into

->dep.

(I'm not sure the following is sound, since some more work is needed)

The first observation is that we can ignore ->data [Marked] ->rfi,
because if we can reduce

(->data [Plain] ->rfi)* ->data [Marked] ->rfi -> (->data [Plain] ->rfi)* ->dep

to

->data [Marked] ->rfi ->dep

we get "hb", which already has the ordering we want.

Now we can focus on ->data [Plain] ->rfi ->dep, e.g.

x = READ_ONCE(..); // A
*<N> = <expr>; // B, contains x
r = Expr; // C, Expr contains *<N>
WRITE_ONCE(.., r); // D

We need to assume that the litmus tests are data-race free, since we
only care about valid litmus tests, then it's OK. The rewritten looks
like the following:

int computer_register_<seq>;
x = READ_ONCE(..); // A
computer_register = <expr>; // B, contains x
// ^^^ basically reverse rewrite of the 2) above

r = Expr[*<N>/computer_register_<seq>]; // C
// ^^^ basically reverse rewrite of the 3) above

*<N> = computer_register_<seq>; // R
// ^^^ need this to keep the global memory effect

WRITE_ONCE(.., r); // D

We need the data-race free assumption to replace a memory cell into a
local variable.

By this rewrite, We reduce
A ->data B ->rfi C ->dep D
into
A ->dep D.

A few details are missing here, for example we may have multip Cs and
need to rewrite all of them, and we need normalization work like
converting litmus tests into canonical forms (e.g change the += into
temporary variables and assignment). Also we need to prove that no more
edges (or edges we care) are added. But these look obvious to me, and
enough formally digging for me today ;-)

As a result, the rewrite operation on L is proved to be isomorphic! ;-)
In the sense that we think reducing to (->data [Marked] ->rfi)* ->dep
is good enough ;-)

Now for every litmus test with

(->data ->rfi)* ->dep

we can rewrite it into another litmus test preserving all other edges
except replacing the above with

(->data [Marked] ->rfi)* ->dep

and the rewrite only contains replacing non-data-race accesses with
local variables and in the sense of C standards and maybe other model
tools these litmus tests are equivalent.

> Please let me know if this is convincing and clear. If so we could
> link to your e-mail and my response to provide context, without having
> to put a huge explanation into the commit message. If that's not
> desirable I can also think about how to compress this into a shorter
> explanation.
>

My intution now is the rewrite explanation above, which is good enough
for me but not sure for other audience. Thank you for keep explaining
this to me ;-)


Regards,
Boqun

> Best wishes,
> Jonas
>
> PS:
>
> > sometimes I'm just slow to understand things ;-)
>
> Well, welcome in the club :D