Re: [PATCH] dm ioctl: Restore __GFP_HIGH in copy_params()

From: Michal Hocko
Date: Mon May 22 2017 - 08:09:54 EST


On Mon 22-05-17 08:00:11, Mikulas Patocka wrote:
>
>
> On Mon, 22 May 2017, Michal Hocko wrote:
>
> > On Fri 19-05-17 19:43:23, Mikulas Patocka wrote:
> > >
> > >
> > > On Fri, 19 May 2017, Michal Hocko wrote:
> > >
> > > > On Thu 18-05-17 19:50:46, Junaid Shahid wrote:
> > > > > (Adding back the correct linux-mm email address and also adding linux-kernel.)
> > > > >
> > > > > On Thursday, May 18, 2017 01:41:33 PM David Rientjes wrote:
> > > > [...]
> > > > > > Let's ask Mikulas, who changed this from PF_MEMALLOC to __GFP_HIGH,
> > > > > > assuming there was a reason to do it in the first place in two different
> > > > > > ways.
> > > >
> > > > Hmm, the old PF_MEMALLOC used to have the following comment
> > > > 	/*
> > > > 	 * Trying to avoid low memory issues when a device is
> > > > 	 * suspended.
> > > > 	 */
> > > >
> > > > I am not really sure what that means but __GFP_HIGH certainly has
> > > > different semantics than PF_MEMALLOC. The latter grants full access to
> > > > the memory reserves while the former grants only partial access. If
> > > > this is _really_ needed then it deserves a comment explaining why.
> > > > --
> > > > Michal Hocko
> > > > SUSE Labs
> > >
> > > Sometimes, I/O to a device mapper device is blocked until the userspace
> > > daemon dmeventd takes some action (for example, when a dm-mirror leg
> > > fails, dmeventd needs to mark the leg as failed in the lvm metadata and
> > > then reload the device).
> > >
> > > The dmeventd daemon mlocks itself in memory so that it doesn't generate
> > > any I/O. But it must be able to call ioctls. __GFP_HIGH is there so that
> > > the ioctls issued by dmeventd have a higher chance of succeeding if some
> > > I/O is blocked, waiting for dmeventd action. It reduces the possibility
> > > of a low-memory deadlock, though it doesn't eliminate it entirely.
> >
> > So what happens if the memory reserves are depleted? Do we deadlock?
>
> Yes, it will deadlock.

That would be more than unfortunate and calls for a different solution.
The thing is that __GFP_HIGH is not propagated to all allocations in
vmalloc proper. E.g. page table allocations are hardcoded to GFP_KERNEL.

> > Why is the OOM killer insufficient to allow further progress?
>
> I don't know if the OOM killer will or won't be triggered in this
> situation, it depends on the people who wrote the OOM killer.

I am not sure I understand. The OOM killer is invoked for _all_
allocations <= PAGE_ALLOC_COSTLY_ORDER that do not have __GFP_NORETRY,
as long as the OOM killer is not disabled (oom_killer_disable). That
only happens from the PM suspend path, which makes sure that no
userspace is active at the time. AFAIU this is a userspace triggered
path, so the latter shouldn't apply to it and GFP_KERNEL should
therefore be sufficient. Relying on a portion of the memory reserves to
prevent a deadlock seems fundamentally broken to me.
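
To make the condition above concrete, here is a rough userspace model of
it. This is only a sketch and not the page allocator code: the helper
name, the MODEL_* macros and the flag bit are made up for illustration,
and only the order limit of 3 mirrors PAGE_ALLOC_COSTLY_ORDER.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative userspace model only; not the kernel's page allocator. */
#define MODEL_COSTLY_ORDER 3u          /* mirrors PAGE_ALLOC_COSTLY_ORDER */
#define MODEL_GFP_NORETRY  (1u << 0)   /* hypothetical stand-in for __GFP_NORETRY */

/*
 * Hypothetical helper: may an allocation that cannot make progress
 * end up invoking the OOM killer, per the rule described above?
 */
static bool model_may_invoke_oom(unsigned int order,
				 unsigned int gfp_flags,
				 bool oom_killer_disabled)
{
	/* OOM killer disabled (modelled here as a plain flag). */
	if (oom_killer_disabled)
		return false;

	/* The caller asked not to retry, so no OOM killer. */
	if (gfp_flags & MODEL_GFP_NORETRY)
		return false;

	/* Only non-costly orders qualify. */
	return order <= MODEL_COSTLY_ORDER;
}
```

In this model an order-0 request without the no-retry bit, made while
the OOM killer is enabled, returns true, which is the case I would
expect a plain GFP_KERNEL allocation from a userspace triggered ioctl
to fall into.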

--
Michal Hocko
SUSE Labs