Re: deadlock in lru_add_drain ? (3.14rc5)

From: Dave Jones
Date: Mon Mar 10 2014 - 16:15:58 EST


On Mon, Mar 10, 2014 at 04:09:57PM -0400, Tejun Heo wrote:

> Hmmm... this is puzzling. At least according to the slightly
> truncated (pids < 13) sysrq-t output, there's no kworker running
> lru_add_drain_per_cpu() and nothing blocked on lru_add_drain_all::lock
> can introduce any complex dependency. Also, at least from glancing
> over, I don't see anything behind lru_add_drain_per_cpu() which can get
> involved in a complex dependency chain.
>
> Assuming that the handful of lost traces didn't reveal serious ah-has, it
> almost looks like workqueue either failed to initiate execution of a
> queued work item or flush_work() somehow got confused on a work item
> which already finished, both of which are quite unlikely given that we
> haven't had any similar report on any other work items.
>
> I think it'd be wise to extend sysrq-t output to include the states of
> workqueue if for nothing else to easily rule out doubts about basic wq
> functions. Dave, is this as much information as we're gonna get from the
> trinity instance? I assume trying to reproduce the case isn't likely
> to work?

I tried enabling the function tracer, and ended up locking up the box entirely,
so had to reboot. Rerunning it now on rc6, will let you know if it reproduces
(though it took like a day or so last time).

Dave
