Re: Is it a workqueue related issue in 2.6.37 (Was: Re: [libvirt]blkio cgroup [solved])

From: Tejun Heo
Date: Fri Feb 25 2011 - 08:18:59 EST


Hello,

On Fri, Feb 25, 2011 at 12:46:16PM +0100, Dominik Klein wrote:
> With 2.6.37 (also tried .1 and .2) it does not work but end up like I
> documented. With 2.6.38-rc1, it does work. With deadline scheduler, it
> also works in 2.6.37.

Okay, here's the problematic part.

<idle>-0 [013] 1640.975562: workqueue_queue_work: work struct=ffff88080f14f270 function=blk_throtl_work workqueue=ffff88102c8fc700 req_cpu=13 cpu=13
<idle>-0 [013] 1640.975564: workqueue_activate_work: work struct ffff88080f14f270
<...>-477 [013] 1640.975574: workqueue_execute_start: work struct ffff88080f14f270: function blk_throtl_work
<idle>-0 [013] 1641.087450: workqueue_queue_work: work struct=ffff88080f14f270 function=blk_throtl_work workqueue=ffff88102c8fc700 req_cpu=13 cpu=13

The workqueue is per-cpu, so we only need to follow cpu=13 cases.
@1640, blk_throtl_work() is queued, activated and starts executing but
never finishes. The same work item is never executed more than once
at the same time on the same CPU, so when it is queued again @1641, it
doesn't get activated until the previous execution is complete.
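
For a longer trace, the same check can be scripted. Below is a rough
sketch, not a polished tool: it pairs workqueue_execute_start with
workqueue_execute_end per (cpu, work struct) and reports whatever never
ended. The function name find_unfinished() and the TRACE sample are made
up for illustration, and the regex assumes the line layout seen in the
excerpt above (execute_end is assumed to use the same "work struct ADDR"
form as execute_start).

```python
import re

# Matches the workqueue tracepoint lines. The address follows either
# "work struct=" (queue_work) or "work struct " (the other events).
EVENT_RE = re.compile(
    r"\[(?P<cpu>\d+)\]\s+(?P<ts>\d+\.\d+):\s+"
    r"(?P<event>workqueue_\w+): work struct[= ](?P<work>[0-9a-f]+)"
)


def find_unfinished(lines):
    """Return {(cpu, work): start_ts} for executions with no matching end."""
    running = {}
    for line in lines:
        m = EVENT_RE.search(line)
        if not m:
            continue  # not a workqueue event line
        key = (int(m["cpu"]), m["work"])
        if m["event"] == "workqueue_execute_start":
            running[key] = float(m["ts"])
        elif m["event"] == "workqueue_execute_end":
            running.pop(key, None)
    return running


# The four trace lines quoted in this mail.
TRACE = [
    "<idle>-0 [013] 1640.975562: workqueue_queue_work: work struct=ffff88080f14f270 function=blk_throtl_work workqueue=ffff88102c8fc700 req_cpu=13 cpu=13",
    "<idle>-0 [013] 1640.975564: workqueue_activate_work: work struct ffff88080f14f270",
    "<...>-477 [013] 1640.975574: workqueue_execute_start: work struct ffff88080f14f270: function blk_throtl_work",
    "<idle>-0 [013] 1641.087450: workqueue_queue_work: work struct=ffff88080f14f270 function=blk_throtl_work workqueue=ffff88102c8fc700 req_cpu=13 cpu=13",
]

for (cpu, work), ts in find_unfinished(TRACE).items():
    print(f"cpu {cpu}: work {work} started @{ts:.6f}, no execute_end seen")
```

Fed the excerpt above, this flags the blk_throtl_work item on cpu 13 as
still running.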

The next thing to do would be finding out why blk_throtl_work() isn't
finishing. sysrq-t or /proc/PID/stack should show us where it's
stalled.
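
If grabbing the stack by hand is inconvenient, something like the
following could be used. It's only a sketch: read_kernel_stack() is a
hypothetical helper, and it assumes /proc/PID/stack is available (which
needs CONFIG_STACKTRACE) and that the caller is allowed to read it,
normally root. The worker PID would be the one from the execute_start
line in the trace, 477 in the excerpt above.

```python
from pathlib import Path


def read_kernel_stack(pid, proc_root="/proc"):
    """Return the kernel stack of `pid` as a list of lines, or None.

    None means the file could not be read, e.g. the task has exited,
    the kernel does not provide /proc/PID/stack, or permission is denied.
    """
    try:
        text = Path(proc_root, str(pid), "stack").read_text()
    except OSError:
        return None
    return text.splitlines()
```

Calling read_kernel_stack(477) a few times in a row and comparing the
results would show whether the worker is stuck at one spot.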

Thanks.

--
tejun