Re: [PATCH v3 3/6] rculist: Add hlist_swap_before_rcu

From: Linus Torvalds
Date: Sun Apr 26 2020 - 13:40:40 EST


On Sun, Apr 26, 2020 at 7:14 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
> To support this add hlist_swap_before_rcu. An hlist primitive that
> will allow swaping the leading sections of two tasks. For exchanging
> the task pids it will just be swapping the hlist_heads of two single
> entry lists. But the functionality is more general.

So I have no problems with the series now - the code is much more
understandable. Partly because of the split-up, partly because of the
comments, and partly because you explained the special case and why it
was a valid thing to do...

However, I did start thinking about this case again.

I still don't think the "swap entry" macro is necessarily useful in
_general_ - any time it's an actual individual entry, that swap macro
doesn't really work.

So the only reason it works here is because you're actually swapping
the whole list.

But that, in turn, shouldn't be using that "first node" model at all,
it should use the hlist_head. That would have made it a lot more
obvious what is actually going on to me.

Now, the comment very much talks about the head case, but the code
still looks like it's swapping a non-head thing.

I guess the code technically _works_ with "swap two list ends", but
does that actually really make sense as an operation?

So I no longer hate how this patch looks, but I wonder if we should
just make the whole "this node is the *first* node" a bit more
explicit in both the caller and in the swapping code.

It could be as simple as replacing just the conceptual types and
names, so instead of some "pnode1" double-indirect node pointer, we'd
have

struct hlist_head *left_head = container_of(left->pprev,
struct hlist_head, first);
struct hlist_head *right_head = container_of(right->pprev,
struct hlist_head, first);

and then the code would do

rcu_assign_pointer(right_head->first, left);
rcu_assign_pointer(left_head->first, right);
WRITE_ONCE(left->pprev, &right_head->first);
WRITE_ONCE(right->pprev, &left_head->first);

which should generate the exact same code, but makes it clear that
what we're doing is switching the whole hlist when given the first
entries.

Doesn't that make what it actually does a lot more understandable? The
*pnode1/pnode2 games are somewhat opaque, but with that type and name
change and using "container_of()", the code now fairly naturally reads
as "oh, we're changing the first pointers in the list heads, and
making the nodes point back to them" .

Again - the current function _works_ with swapping two hlists in the
middle (not two entries - it swaps the whole list starting at that
entry!), so your current patch is in some ways "more generic". I'm
just suggesting that the generic case doesn't make much sense, and
that the "we know the first entries, swap the lists" actually is what
the real use is, and writing it as such makes the code easier to
understand.

But I'm not going to insist on this, so this is more an RFC. Maybe
people disagree, and/or have an actual use case for that "break two
hlists in the middle, swap the ends" that I find unlikely...

(NOTE: My "convert to hlist_head" code _works_ for that case too
because the code generation is the same! But it would be really really
confusing for that code to be used for anything but the first entry).

Linus