Re: [v1 2/2] device-dax: "Hotremove" persistent memory that is used like normal RAM

From: Pavel Tatashin
Date: Sat Apr 20 2019 - 18:04:32 EST


On Sat, Apr 20, 2019 at 5:02 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> On Sat, Apr 20, 2019 at 10:02 AM Pavel Tatashin
> <pasha.tatashin@xxxxxxxxxx> wrote:
> >
> > > > Thank you for looking at this. Are you saying, that if drv.remove()
> > > > returns a failure it is simply ignored, and unbind proceeds?
> > >
> > > Yeah, that's the problem. I've looked at making unbind able to fail,
> > > but that can lead to general bad behavior in device-drivers. I.e. why
> > > spend time unwinding allocated resources when the driver can simply
> > > fail unbind? About the best a driver can do is make unbind wait on
> > > some event, but any return results in device-unbind.
> >
> > Hm, just tested, and it is indeed so.
> >
> > I see the following options:
> >
> > 1. Move hot remove code to some other interface, that can fail. Not
> > sure what that would be, but outside of unbind/remove_id. Any
> > suggestion?
> > 2. Option two is don't attept to offline memory in unbind. Do
> > hot-remove memory in unbind if every section is already offlined.
> > Basically, do a walk through memblocks, and if every section is
> > offlined, also do the cleanup.
>
> I think something like option-2 could work just as long as the user is
> ok with failure and prepared to handle it. It's already the case that
> the request_region() in kmem permanently prevents the memory range
> from being reused by any other driver. So if the hot-unplug fails it
> could skip the corresponding release_region() and effectively it's the
> same as what we have now in terms of reuse protection. In your flow if
> the memory remove failed then the conversion attempt from devdax to
> raw mode would also fail and presumably you could fall back to doing a
> full reboot / rebuild of the application state?

With option two, where we will simply check that every memory_block is
offlined, we will have deterministic behavior:

1. If user did not offline every dax memory section beforehand via
echo offline > /sys/devices/system/memory/memoryN/state

echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
Will be the same as now, will simply return, and user won't be able to
use dax afterwords or hotremove it.

2. If user did offline ever dax memory section beforehand
echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
Will be guaranteed to succeed to hotremove the memory, as there is
nothing that can fail.

So, if user wants to hotremove dax memory, he/she must ensure that
every section is offlined before unbinding.

Pasha