Re: 2.6.4-rc1-mm1: queue-congestion-dm-implementation patch

From: Miquel van Smoorenburg
Date: Tue Mar 02 2004 - 08:05:16 EST


According to Kevin Corry:
> Article: 452202 of lists.linux.kernel
> From: Kevin Corry <kevcorry@xxxxxxxxxx>
> Date: Mon, 1 Mar 2004 14:00:50 -0600
> Newsgroups: lists.linux.kernel
> Approved: news@xxxxxxxxxxxxxxx
>
> The queue-congestion-dm-implementation patch in 2.6.4-rc1-mm1 (included below)
> allows a DM device to query the lower-level device(s) during the
> queue-congestion APIs. However, it must wait on an internal semaphore
> (md->lock) before passing the call down to the lower-level devices. The
> problem is that the higher-level caller is holding a spinlock at the same
> time. Here is one such stack-trace I have been seeing:
>
> Debug: sleeping function called from invalid context at include/linux/
> rwsem.h:43
> in_atomic():1, irqs_disabled():0
> Call Trace:
> [<c012456b>] __might_sleep+0xab/0xd0
> [<c03a0771>] dm_any_congested+0x21/0x60

Is there any way to reproduce this? I turned on spinlock and
sleep-spinlock debugging and did lots of I/O but I don't see it.

> I'm not sure how to fix this (or how serious a problem it is)...I'm just
> throwing this out there for discussion.

Well bdi_write_congested() is racy, ofcourse, so in theory a call
to make_request() could be made which can block (and for dm, locks
the same semaphore) - so this can happen anyway. Only the chance of
it is much lower. I'm not sure how bad it is.

Changing down_read() in dm_any_congested to down_read_trylock() would
probably fix it for bdi_*_congested(). If you can tell me how to
reproduce it I can try a few things..

Mike.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/