Re: {interruptible_,}sleep_on question

Manfred Spraul (manfreds@colorfullife.com)
Mon, 01 Nov 1999 23:46:29 +0100


"Leonard N. Zubkoff" wrote:
>
> Date: Mon, 1 Nov 1999 17:44:14 +0000 (GMT)
> From: Alan Cox <alan@lxorguk.ukuu.org.uk>
>
> I think the DAC960 is holding the io_request_lock and its own spinlocks
> appropriately. I may be wrong here.
>
> I certainly believe it is. The DAC960 driver drops io_request_lock before
> calling sleep_on.
It must drop the lock, and that's the problem. I'm quite sure that the
current code can lock up. (The window is tiny, but nevertheless it
should be closed.)

CPU1: in DAC960_ProcessRequest(), owns the io_request_lock.
CPU2: in DAC960_InterruptHandler()

CPU2: AcquireControllerLock: spins because the io_request_lock is
      held by CPU1.

CPU1: DAC960_AllocateCommand(): returns NULL, no free Command struct
CPU1: spin_unlock(&io_request_lock);
CPU1: <a local NMI occurs and enlarges the window between
      spin_unlock() and sleep_on()>
CPU2: DAC960_InterruptHandler() can continue,
      calls DAC960_ProcessCompletedCommand().
      <at the end of this function:>
      DAC960_DeallocateCommand();
      wake_up(&Controller->CommandWaitQueue);
      <the wait queue is empty, nothing to do>
CPU1: <the local NMI returns>
CPU1: sleep_on()
      <adds this thread to the wait queue>
      <sets current->state=TASK_UNINTERRUPTIBLE>
      <sleeps>

As you'll notice, the wakeup from CPU2 is lost, and the free command
structure is not used. [At least this is what I understand from
the DAC driver, please correct me if I'm wrong.]

I think you should replace
- spin_unlock(&io_request_lock);
- sleep_on(&CommandWaitQueue);
- spin_lock_irq(&io_request_lock);

with something like (wait being a wait queue entry for current, e.g.
from DECLARE_WAITQUEUE(wait, current)):
+ add_wait_queue(&CommandWaitQueue, &wait);
+ current->state = TASK_UNINTERRUPTIBLE;
+ spin_unlock(&io_request_lock);
+ schedule();
+ remove_wait_queue(&CommandWaitQueue, &wait);
+ spin_lock_irq(&io_request_lock);

For an example of this pattern, see __get_request_wait() in ll_rw_blk.c
or __wait_on_buffer() in fs/buffer.c.

--
	Manfred
