Re: [BUG][2.5.66bk9+] - tty hangings - patches, dmesg & sysctl+T info

From: Andrew Morton (akpm@digeo.com)
Date: Tue Apr 08 2003 - 23:56:51 EST


Roland Dreier <roland@topspin.com> wrote:
>
> Still, I like the idea of this patch, since it resolves the livelock.
> But I don't think the implementation is quite right. insert_sequence
> doesn't get incremented until delayed_work_timer_fn(). That means
> that a driver (tty_io.c, for example) could call
> schedule_delayed_work(), then call flush_scheduled_work() before
> delayed_work_timer_fn() has run for that work.

The driver needs to run cancel_delayed_work() before calling
flush_scheduled_work(). The tty patch is already doing that, and I think
that plugs the holes.

Here's a full changelog.

The workqueue code currently has a notion of a per-cpu queue being "busy".
flush_scheduled_work()'s responsibility is to wait for a queue to be not busy.

Problem is, flush_scheduled_work() can easily hang up.

- The workqueue is deemed "busy" when there are pending (timer-based)
  works. But if someone repeatedly schedules new work in the delayed
  callback, the queue will never fall idle, and flush_scheduled_work() will
  not complete.

- If someone reschedules work (not delayed work) in the work function, that
  too will cause the queue to never go idle, and flush_scheduled_work() will
  not terminate.

So what this patch does is:

- Create a new "cancel_delayed_work()" which will try to kill off any
  timer-based works.

- Change flush_scheduled_work() so that it is immune to people re-adding
  work in the work callout handler.

  We can do this by recognising that the caller does *not* want to wait
  until the workqueue is "empty". The caller merely wants to wait until all
  works which were pending at the time flush_scheduled_work() was called have
  completed.

  The patch uses a couple of sequence numbers for that.

So now, if someone wants to reliably remove delayed work they should do:

        /*
         * Make sure that my work-callback will no longer schedule new work
         */
        my_driver_is_shutting_down = 1;

        /*
         * Kill off any pending delayed work
         */
        cancel_delayed_work(&my_work);

        /*
         * OK, there will be no new works scheduled. But there may be one
         * currently queued or in progress. So wait for that to complete.
         */
        flush_scheduled_work();

And change the flush_workqueue() sleep to be uninterruptible. We worry that
delivery of a signal may cause the wait to return too early.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Apr 15 2003 - 22:00:17 EST