Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

From: Patrick Marlier
Date: Sat May 16 2015 - 13:43:06 EST




On 05/13/2015 04:58 AM, NeilBrown wrote:
On Tue, 12 May 2015 22:38:53 -0400 Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

On Tue, 12 May 2015 15:46:26 -0700
"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

From: Patrick Marlier <patrick.marlier@xxxxxxxxx>

Signed-off-by: Patrick Marlier <patrick.marlier@xxxxxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
---
drivers/md/bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 2bc56e2a3526..32901772e4ee 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -181,7 +181,7 @@ static struct md_rdev *next_active_rdev(struct md_rdev *rdev, struct mddev *mdde
rcu_read_lock();
if (rdev == NULL)
/* start at the beginning */
- rdev = list_entry_rcu(&mddev->disks, struct md_rdev, same_set);
+ rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, same_set);

Hmm, this changes the semantics.

The original code looks nasty. I first thought it was broken, but it
seems to work out of sheer luck (or a clever hack).

Definitely a clever hack - no question of "luck" here :-)

It might make sense to change it to use list_for_each_entry_from_rcu():

	if (rdev == NULL)
		rdev = list_entry_rcu(mddev->disks.next, struct md_rdev, same_set);
	else {
		rdev_dec_pending(rdev, mddev);
		rdev = list_next_entry_rcu(rdev->same_set.next, struct md_rdev, same_set);
	}
	list_for_each_entry_from_rcu(rdev, ....)

but there isn't a "list_next_entry_rcu"....


Also, it would have been polite to at least 'cc' the Maintainer of this code
in the original patch - no?

Sure, my bad. I hesitated to CC the maintainers: I was almost sure that the patch would be rejected, so I wanted to avoid noise.



Thanks,
NeilBrown


else {
/* release the previous rdev and start from there. */
rdev_dec_pending(rdev, mddev);


What comes after this is:

list_for_each_entry_continue_rcu(rdev, &mddev->disks, same_set) {
if (rdev->raid_disk >= 0 &&

Now the original code had:

rdev = list_entry_rcu(&mddev->disks, struct md_rdev, same_set);

Where &mddev->disks would return the address of the disks field of
mddev which is a list head. Then it would get the 'same_set' offset,
which is 0, and rdev is pointing to a makeshift md_rdev struct. But it
isn't used, as the list_for_each_entry_continue_rcu() has:

#define list_for_each_entry_continue_rcu(pos, head, member) \
	for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
	     &pos->member != (head); \
	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))

Thus the first use of pos is pos->member.next or:

mddev->disks.next

But now you converted it to rdev = mddev->disks.next, which means the
first use is:

pos = mddev->disks.next->next

I think you are skipping the first element here.
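
The skipped-element argument above can be checked outside the kernel. The following is only a user-space sketch, under stated assumptions: the RCU accessors are dropped (plain loads instead of rcu_dereference), `struct item`, `visited()` and `MAX_ITEMS` are hypothetical names invented for the illustration, and `__typeof__` is a GCC/Clang extension. It counts how many entries a continue-style loop visits when the cursor starts at the list head versus at the first entry.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)

/* Same shape as list_for_each_entry_continue_rcu(), minus the RCU accessors. */
#define list_for_each_entry_continue(pos, head, member) \
	for (pos = list_entry(pos->member.next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

/* Hypothetical stand-in for struct md_rdev; same_set sits at offset 0. */
struct item {
	struct list_head same_set;
	int id;
};

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

#define MAX_ITEMS 8

/*
 * Build a list of n items and return how many the loop visits.
 * start_at_first_entry == 0: cursor is a fake entry overlaid on the head
 *                            (the original code's trick).
 * start_at_first_entry != 0: cursor is the first real entry
 *                            (the patched form).
 */
static int visited(int n, int start_at_first_entry)
{
	struct list_head disks = { &disks, &disks };
	struct item items[MAX_ITEMS];
	struct item *pos;
	int i, count = 0;

	assert(n >= 0 && n <= MAX_ITEMS);
	for (i = 0; i < n; i++) {
		items[i].id = i;
		list_add_tail(&items[i].same_set, &disks);
	}

	if (start_at_first_entry)
		pos = list_entry(disks.next, struct item, same_set);
	else
		/* Only pos->same_set is ever touched, so this stays inside 'disks'. */
		pos = list_entry(&disks, struct item, same_set);

	/* The loop advances the cursor once before its first test. */
	list_for_each_entry_continue(pos, &disks, same_set)
		count++;

	return count;
}
```

With three entries on the list, visited(3, 0) walks all three, while visited(3, 1) walks only two because the loop steps past the first entry before examining anything.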


struct mddev {
	...
	struct list_head disks;
	...
};

struct list_head {
struct list_head *next, *prev;
};

The tricky thing is that "list_entry_rcu" before and after the patch is reading the same thing.

However, in your case, the change I proposed is probably wrong; I trust you on this. :) What is your proposal for fixing it together with the rculist patch?

PS: In the rculist patch I proposed, I avoid the store to, and the atomic reload from, the stack variable __ptr. (Yep, rcu_dereference_raw/ACCESS_ONCE is a bit confusing because it implicitly applies & to its parameter.)

Thanks.
--
Pat