Re: [RFC Patch V1] ioatdma: Ignore IOAT devices under hotplug-capable PCI host bridge

From: Dan Williams
Date: Mon Jun 08 2015 - 12:23:10 EST


On Mon, Jun 8, 2015 at 8:48 AM, Vinod Koul <vinod.koul@xxxxxxxxx> wrote:
> On Mon, Jun 08, 2015 at 07:44:43PM +0800, Jiang Liu wrote:
>> On 2015/6/8 18:42, Vinod Koul wrote:
>> > On Tue, Jun 02, 2015 at 02:37:31PM +0800, Jiang Liu wrote:
>> >> Ccing Rafael, it's ACPI hotplug related.
>> >>
>> >> On 2015/6/2 14:36, Jiang Liu wrote:
>> >>> The dmaengine core assumes that async DMA devices will only be removed
>> >>> when they are not used anymore, or it assumes dma_async_device_unregister()
>> >>> will only be called by dma driver exit routines. But this assumption is
>> >>> not true for the IOAT driver, which calls dma_async_device_unregister()
>> >>> from ioat_remove(). So the current IOAT driver doesn't support device
>> >>> hot-removal, because hot-removing an in-use IOAT device may crash
>> >>> the system.
>> >>>
>> >>> To support CPU socket hot-removal, all PCI devices, including IOAT
>> >>> devices embedded in the socket, will be hot-removed. The ideal solution
>> >>> is to enhance the dmaengine core and IOAT driver to support hot-removal,
>> >>> but that's too hard.
>> >>>
>> >>> This patch implements a hack to disable IOAT devices under hotplug-capable
>> >>> CPU socket so it won't break socket hot-removal.
>> >>>
>> > So below looks okay, though I wonder how hard it would be to fix hot unplug?
>> Hi Vinod,
>> Thanks for the review. About three years ago I worked out a
>> patch set to enhance the dmaengine core and ioat device driver to
>> support hot-removal. But it was rejected due to concerns about the
>> performance penalty caused by usage tracking.
>> To support hot-removal, we need to track dma channel usage
>> and have a way to reclaim dma channels when hot-removing. This may cause
>> a noticeable performance penalty. Recently I have tried again but still
>> haven't found a way to support hot-removal. So eventually I suggest
>> disabling IOAT devices on hot-plug capable systems.
>
> Or, as a different mechanism, take the module reference on channel
> allocation and release it on channel release.
>
> That way we don't need to count and we ensure dmaengine module is removed
> only when users have stopped using the device...

This was one of the first "features" of dmaengine I deleted. There's
no clean / reliable way to support general purpose dma-offload and
time bounded hot-removal. Multiple clients may be using a channel in
varied contexts, so you need both to tell them to stop and to wait for
them to acknowledge. On platforms with socket hotplug I would expect
the cpu to almost always be faster than an ioatdma offload. So, fwiw,
I think hotplug capability is more useful to the platform than ioatdma
offload.
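
To make the "tell them to stop and wait for them to acknowledge" problem
concrete, here is a minimal userspace sketch. It is not the dmaengine API:
struct chan, chan_get(), chan_put() and chan_try_remove() are hypothetical
names made up for illustration, and the code is single-threaded on purpose.
It only shows that removal cannot complete until every client has dropped
its reference, so the removal time is bounded by the slowest client.

```c
/* Userspace sketch only: hypothetical names, not the dmaengine API. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct chan {
	int users;      /* clients currently holding the channel */
	bool removing;  /* set once hot-removal has been requested */
};

/* Client side: take a reference; refused once removal has started. */
static int chan_get(struct chan *c)
{
	if (c->removing)
		return -ENODEV;
	c->users++;
	return 0;
}

/* Client side: drop a reference, i.e. acknowledge the stop request. */
static void chan_put(struct chan *c)
{
	assert(c->users > 0);
	c->users--;
}

/*
 * Removal side: tell clients to stop, then report whether teardown is
 * safe. -EBUSY means at least one client has not released the channel
 * yet, so the caller has to wait and retry.
 */
static int chan_try_remove(struct chan *c)
{
	c->removing = true;
	return c->users ? -EBUSY : 0;
}
```

With two clients holding the channel, chan_try_remove() keeps returning
-EBUSY until both have called chan_put(). A concurrent version would also
need an atomic or locked counter on every get/put, which is the usage
tracking cost raised earlier in this thread.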