Re: [PATCH RFC] rcu/tree: Use GFP_MEMALLOC for alloc memory to free memory pattern

From: joel
Date: Wed Apr 01 2020 - 07:14:23 EST




On April 1, 2020 3:23:59 AM EDT, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>On Tue 31-03-20 12:01:17, Joel Fernandes wrote:
>> On Tue, Mar 31, 2020 at 05:34:50PM +0200, Michal Hocko wrote:
>> > On Tue 31-03-20 10:58:06, Joel Fernandes wrote:
>> > [...]
>> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>> > > > index 4be763355c9fb..965deefffdd58 100644
>> > > > --- a/kernel/rcu/tree.c
>> > > > +++ b/kernel/rcu/tree.c
>> > > > @@ -3149,7 +3149,7 @@ static inline struct rcu_head
>*attach_rcu_head_to_object(void *obj)
>> > > >
>> > > > if (!ptr)
>> > > > ptr = kmalloc(sizeof(unsigned long *) +
>> > > > - sizeof(struct rcu_head), GFP_ATOMIC | __GFP_NOWARN);
>> > > > + sizeof(struct rcu_head), GFP_MEMALLOC);
>> > >
>> > > Just to add, the main requirements here are:
>> > > 1. Allocation should be bounded in time.
>> > > 2. Allocation should try hard (possibly tapping into reserves)
>> > > 3. Sleeping is Ok but should not affect the time bound.
>> >
>> >
>> > __GFP_ATOMIC | __GFP_HIGH is the way to get an additional access to
>> > memory reserves regarless of the sleeping status.
>> >
>> > Using __GFP_MEMALLOC is quite dangerous because it can deplete
>_all_ the
>> > memory. What does prevent the above code path to do that?
>
>Neil has provided a nice explanation down the email thread. But let me
>clarify few things here.
>
>> Can you suggest what prevents other users of GFP_MEMALLOC from doing
>that
>> also?
>
>There is no explicit mechanism which is indeed unfortunate. The only
>user real user of the flag is Swap over NFS AFAIK. I have never dared
>to
>look into details on how the complete reserves depletion is prevented.
>Mel would be much better fit here.
>
>> That's the whole point of having a reserve, in normal usage no one
>will
>> use it, but some times you need to use it. Keep in mind this is not a
>common
>> case in this code here, this is triggered only if earlier allocation
>attempts
>> failed. Only *then* we try with GFP_MEMALLOC with promises to free
>additional
>> memory soon.
>
>You are right that this is the usecase for the flag. But this should be
>done with an extreme care because the core MM relies on those reserves
>so any other users should better make sure they do not consume a lot
>from reserves as well.
>

Understood and agreed.

>> > If a partial access to reserves is sufficient then why the existing
>> > modifiers (mentioned above are not sufficient?
>>
>> The point with using GFP_MEMALLOC is it is useful for situations
>where you
>> are about to free memory and needed some memory temporarily, to free
>that. It
>> depletes it a bit temporarily to free even more. Is that not the
>point of
>> PF_MEMALLOC?
>> * %__GFP_MEMALLOC allows access to all memory. This should only be
>used when
>> * the caller guarantees the allocation will allow more memory to be
>freed
>> * very shortly e.g. process exiting or swapping. Users either should
>> * be the MM or co-ordinating closely with the VM (e.g. swap over
>NFS).
>>
>> I was just recommending usage of this flag here because it fits the
>> requirement of allocating some memory to free some memory. I am also
>Ok with
>> GFP_ATOMIC with the GFP_NOWARN removed, if you are Ok with that.
>
>Maybe we need to refine this documentation to be more explicit about an
>extreme care to be taken when using the flag.
>
>diff --git a/include/linux/gfp.h b/include/linux/gfp.h
>index e5b817cb86e7..e436a7e28392 100644
>--- a/include/linux/gfp.h
>+++ b/include/linux/gfp.h
>@@ -110,6 +110,9 @@ struct vm_area_struct;
>* the caller guarantees the allocation will allow more memory to be
>freed
> * very shortly e.g. process exiting or swapping. Users either should
> * be the MM or co-ordinating closely with the VM (e.g. swap over NFS).
>+ * Users of this flag have to be extremely careful to not deplete the
>reserve
>+ * completely and implement a throttling mechanism which controls the
>consumption
>+ * based on the amount of freed memory.
> *
>* %__GFP_NOMEMALLOC is used to explicitly forbid access to emergency
>reserves.
> * This takes precedence over the %__GFP_MEMALLOC flag if both are set.

I am in support of this documentation patch. I would say "consumption of the reserve".

Thanks,

- Joel

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.