Re: edac_core: crashes on shutdown

From: Florian Mickler
Date: Thu Dec 02 2010 - 14:55:04 EST


On Thu, 2 Dec 2010 19:51:23 +0100
Borislav Petkov <bp@xxxxxxxxx> wrote:

> On Thu, Dec 02, 2010 at 01:14:12PM -0500, Florian Mickler wrote:
> > Yes. That should work. Once we stopped the workqueue and removed it
> > from the global list, do we actually need to set it to OP_OFFLINE?
>
> I think yes, because we seem to protect ourselves in the actual
> edac_mc_workq_function() on exit, if we overlap the work items
> cancellation with the execution of the delayed work at the same time on
> a different cpu. Besides, it is a single assignment and it does cost us
> almost nothing.

True. I wonder whether flushing the workqueue also waits for a running
work function to finish?
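
To make sure we mean the same guard, here is a minimal userspace sketch
of how I read it. It is not the actual edac_mc.c code: struct fake_mci,
the FAKE_OP_* values and both functions are made-up stand-ins, and the
"queued" flag only models arming and cancelling the delayed work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real EDAC structures carry far more state. */
enum fake_op_state { FAKE_OP_RUNNING_POLL, FAKE_OP_OFFLINE };

struct fake_mci {
	enum fake_op_state op_state;
	int polls;	/* number of times the simulated check ran */
	bool queued;	/* true while the work item is armed */
};

/* Models the work function: bail out without re-arming once offline. */
void fake_workq_function(struct fake_mci *mci)
{
	assert(mci != NULL);
	mci->queued = false;		/* item is running, no longer pending */
	if (mci->op_state == FAKE_OP_OFFLINE)
		return;			/* teardown has started: do not re-arm */
	mci->polls++;			/* stands in for the periodic check */
	mci->queued = true;		/* stands in for re-queueing the work */
}

/* Models teardown: mark offline first, then drop any pending item. */
void fake_teardown(struct fake_mci *mci)
{
	assert(mci != NULL);
	mci->op_state = FAKE_OP_OFFLINE;
	mci->queued = false;		/* stands in for cancelling the work */
}
```

In this model, if the work function runs once more after teardown (the
overlap case), it sees the offline state, skips the poll and leaves the
item unarmed. The sketch is single-threaded, so it says nothing about
the locking the real code needs.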

>
> > Also 00740c585 did fix a hang in edac_mc.c... could this also happen
> > in the edac_device_del_device/edac_pci_del_device functions?
>
> Nope, because there we don't check ->op_state when we cancel the work
> items in the respective _teardown() functions - we simply cancel them
> unconditionally.
>

But shouldn't we check ->op_state for those as well? And why don't those
functions hang in cases similar to the one your original patch fixed?
