Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements

From: Bart Van Assche
Date: Mon Jan 30 2017 - 19:31:21 EST


On Wed, 2017-01-18 at 10:48 +0100, Hannes Reinecke wrote:
> @@ -1488,26 +1487,13 @@ static unsigned long disk_events_poll_jiffies(struct gendisk *disk)
>  void disk_block_events(struct gendisk *disk)
>  {
>         struct disk_events *ev = disk->ev;
> -       unsigned long flags;
> -       bool cancel;
>  
>         if (!ev)
>                 return;
>  
> -       /*
> -        * Outer mutex ensures that the first blocker completes canceling
> -        * the event work before further blockers are allowed to finish.
> -        */
> -       mutex_lock(&ev->block_mutex);
> -
> -       spin_lock_irqsave(&ev->lock, flags);
> -       cancel = !ev->block++;
> -       spin_unlock_irqrestore(&ev->lock, flags);
> -
> -       if (cancel)
> +       if (atomic_inc_return(&ev->block) == 1)
>                 cancel_delayed_work_sync(&disk->ev->dwork);
>  
> -       mutex_unlock(&ev->block_mutex);
>  }

Hello Hannes,

I have run into a deadlock caused by the event checking code a few times,
so I agree with you that it would be a big step forward if such deadlocks
no longer occurred. However, this patch makes a change that is not
mentioned in the patch description, namely that disk_block_events() calls
are no longer serialized. Are you sure it is safe to drop the serialization
of disk_block_events() calls?
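
To make the concern concrete, here is a minimal userspace sketch (not kernel
code) that replays one possible interleaving of two callers of the patched
disk_block_events(). The model_* names and the work_running flag are
hypothetical and only stand in for ev->block and for the event work still
being in flight; the sketch assumes nothing beyond what the hunk above shows.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the patched code; not kernel code. */
struct model_events {
	atomic_int block;	/* stands in for ev->block */
	bool work_running;	/* true while the event work may still run */
};

/*
 * First half of the patched disk_block_events(): bump the counter and
 * report whether this caller is the one that has to cancel the work.
 * Mirrors "atomic_inc_return(&ev->block) == 1".
 */
static bool model_block_begin(struct model_events *ev)
{
	return atomic_fetch_add(&ev->block, 1) + 1 == 1;
}

/* Second half: models cancel_delayed_work_sync() having completed. */
static void model_cancel_finish(struct model_events *ev)
{
	ev->work_running = false;
}

/*
 * Replays this interleaving sequentially:
 *   A: increments the counter (0 -> 1) and starts canceling
 *   B: increments the counter (1 -> 2) and returns immediately
 *   A: cancel completes
 * Returns true if B returned while the work could still be running.
 */
static bool model_second_blocker_sees_running_work(void)
{
	struct model_events ev;
	bool a_cancels, b_cancels, seen;

	atomic_init(&ev.block, 0);
	ev.work_running = true;

	a_cancels = model_block_begin(&ev);	/* A is the first blocker */
	b_cancels = model_block_begin(&ev);	/* B skips the cancel */
	assert(atomic_load(&ev.block) == 2);

	/* B has already returned from the modeled function here. */
	seen = !b_cancels && ev.work_running;

	if (a_cancels)
		model_cancel_finish(&ev);	/* A finishes only now */

	return seen;
}
```

In this model model_second_blocker_sees_running_work() returns true, which
is the situation the removed block_mutex comment says the outer mutex was
there to prevent. Whether any caller actually depends on that guarantee is
the open question.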

Thanks,

Bart.