Re: CFQ: async queue blocks the whole system

From: Tao Ma
Date: Fri Jun 10 2011 - 06:00:47 EST


On 06/10/2011 05:14 PM, Vivek Goyal wrote:
> On Fri, Jun 10, 2011 at 01:48:37PM +0800, Tao Ma wrote:
>
> [..]
>>>> btw, reverting the patch doesn't work. I can still get the livelock.
>
> What test exactly are you running? I am primarily interested in whether
> you still get the hung task timeout warning where a writer is waiting in
> get_request_wait() for more than 120 seconds or not.
>
> The livelock might be a different problem, for which Christoph provided
> a patch for XFS.
>
>>>
>>> Can you give the following patch a try and see if it helps? On my system
>>> this does allow CFQ to dispatch some writes once in a while.
>> Sorry, this patch doesn't work in my test.
>
> Can you give me a backtrace of, say, 15 seconds each with and without the
> patch? I think we must now be dispatching some writes; it is a separate
> matter that the writer still sleeps for more than 120 seconds because
> there are way too many readers.
>
> Maybe we need to look into how workload tree scheduling takes place and
> tweak that logic a bit.
OK, our test cases can be downloaded for free. ;)
svn co http://code.taobao.org/svn/dirbench/trunk/meta_test/press/set_vs_get
Modify run.sh to fit your needs. Normally you will hit the livelock
within 10 minutes. We have a 15000 RPM SAS disk.

btw, you have to mount the volume on /test since the test programs are
not that clever. :)

Regards,
Tao