Re: [PATCH] md/raid5: fix locking in handle_stripe_clean_event()

From: Neil Brown
Date: Fri Oct 30 2015 - 18:17:07 EST


On Sat, Oct 31 2015, Shaohua Li wrote:

> On Fri, Oct 30, 2015 at 05:02:47PM +0300, Roman Gushchin wrote:
>> > Isn't the 4.1 fix just:
>> >
>> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> > index e5befa356dbe..6e4350a78257 100644
>> > --- a/drivers/md/raid5.c
>> > +++ b/drivers/md/raid5.c
>> > @@ -3522,16 +3522,16 @@ returnbi:
>> >                   * no updated data, so remove it from hash list and the stripe
>> >                   * will be reinitialized
>> >                   */
>> > -                spin_lock_irq(&conf->device_lock);
>> >  unhash:
>> > +                spin_lock_irq(conf->hash_locks + sh->hash_lock_index);
>> >                  remove_hash(sh);
>> > +                spin_unlock_irq(conf->hash_locks + sh->hash_lock_index);
>> >                  if (head_sh->batch_head) {
>> >                          sh = list_first_entry(&sh->batch_list,
>> >                                                struct stripe_head, batch_list);
>> >                          if (sh != head_sh)
>> >                                  goto unhash;
>> >                  }
>> > -                spin_unlock_irq(&conf->device_lock);
>> >                  sh = head_sh;
>> >
>> >                  if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state))
>> >
>> > ??
>>
>> In my opinion, this patch looks correct, although it seems to me that there is another issue here.
>>
>> >                  if (head_sh->batch_head) {
>> >                          sh = list_first_entry(&sh->batch_list,
>> >                                                struct stripe_head, batch_list);
>> >                          if (sh != head_sh)
>> >                                  goto unhash;
>> >                  }
>>
>> With the patch above, this code will be executed without taking any locks. Is that correct?
>> In my opinion, we need to take at least sh->stripe_lock, which protects sh->batch_head.
>> Or am I missing something?
>>
>> If you want, we can handle this issue separately.
>
> The batch_list doesn't need the protection. Only remove_hash() needs it.

Yes, that's my understanding too. The key to understanding is that
comment you (helpfully!) put in clear_batch_ready():

	/*
	 * BATCH_READY is cleared, no new stripes can be added.
	 * batch_list can be accessed without lock
	 */
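
To make that rule concrete, here is a minimal userspace sketch of the
same shape. It is not kernel code: every toy_* name and NR_HASH_LOCKS
is made up for illustration, and a C11 atomic_flag stands in for the
real spinlocks. It assumes the batch was frozen before teardown (the
BATCH_READY case in the comment above), so the batch links are walked
with no lock held and only the hash removal takes the per-bucket lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_HASH_LOCKS 8	/* hypothetical value, not the kernel's */

/* Toy spinlock standing in for a kernel spinlock_t. */
struct toy_spinlock { atomic_flag held; };

static void toy_spin_lock(struct toy_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct toy_stripe {
	struct toy_stripe *hash_next;	/* bucket chain: guarded by the bucket lock */
	struct toy_stripe *batch_next;	/* circular batch list: frozen before teardown */
	unsigned int hash_lock_index;
	int hashed;
};

struct toy_conf {
	struct toy_spinlock hash_locks[NR_HASH_LOCKS];
	struct toy_stripe *buckets[NR_HASH_LOCKS];
};

static void toy_conf_init(struct toy_conf *conf)
{
	for (size_t i = 0; i < NR_HASH_LOCKS; i++) {
		atomic_flag_clear(&conf->hash_locks[i].held);
		conf->buckets[i] = NULL;
	}
}

static void toy_insert_hash(struct toy_conf *conf, struct toy_stripe *sh)
{
	struct toy_spinlock *l = &conf->hash_locks[sh->hash_lock_index];

	toy_spin_lock(l);
	sh->hash_next = conf->buckets[sh->hash_lock_index];
	conf->buckets[sh->hash_lock_index] = sh;
	sh->hashed = 1;
	toy_spin_unlock(l);
}

/* Caller must hold the lock for sh's bucket. */
static void toy_remove_hash(struct toy_conf *conf, struct toy_stripe *sh)
{
	struct toy_stripe **pp = &conf->buckets[sh->hash_lock_index];

	while (*pp && *pp != sh)
		pp = &(*pp)->hash_next;
	if (*pp) {
		*pp = sh->hash_next;
		sh->hash_next = NULL;
		sh->hashed = 0;
	}
}

/* Unhash every stripe in a frozen batch; returns how many were visited. */
static int toy_unhash_batch(struct toy_conf *conf, struct toy_stripe *head)
{
	struct toy_stripe *sh = head;
	int n = 0;

	do {
		struct toy_spinlock *l = &conf->hash_locks[sh->hash_lock_index];

		toy_spin_lock(l);		/* the hash chain is shared */
		toy_remove_hash(conf, sh);
		toy_spin_unlock(l);
		n++;
		sh = sh->batch_next;		/* no lock: the batch cannot change */
	} while (sh != head);
	return n;
}

/* Build a three-stripe batch spread over buckets 0, 3 and 6, then tear it down. */
int toy_demo(void)
{
	struct toy_conf conf;
	struct toy_stripe s[3];
	int n;

	toy_conf_init(&conf);
	for (int i = 0; i < 3; i++) {
		s[i].hash_lock_index = (unsigned int)(i * 3) % NR_HASH_LOCKS;
		s[i].batch_next = &s[(i + 1) % 3];
		toy_insert_hash(&conf, &s[i]);
	}
	n = toy_unhash_batch(&conf, &s[0]);
	for (int i = 0; i < 3; i++)
		assert(!s[i].hashed);
	return n;
}
```

The lock is taken and dropped per stripe because each stripe may live
in a different bucket; holding one lock across the whole walk would
not cover the others.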

I'll wrangle some patches...

Thanks,
NeilBrown
