Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag

From: Thomas Gleixner
Date: Thu Aug 13 2020 - 09:22:06 EST


Uladzislau Rezki <urezki@xxxxxxxxx> writes:
> On Thu, Aug 13, 2020 at 09:50:27AM +0200, Michal Hocko wrote:
>> On Wed 12-08-20 02:13:25, Thomas Gleixner wrote:
>> [...]
>> > I can understand your rationale and what you are trying to solve. So, if
>> > we can actually have a distinct GFP variant:
>> >
>> > GFP_I_ABSOLUTELY_HAVE_TO_DO_THAT_AND_I_KNOW_IT_CAN_FAIL_EARLY
>>
>> Even if we cannot make the zone->lock raw I would prefer to not
>> introduce a new gfp flag. Well we can do an alias for easier grepping
>> #define GFP_RT_SAFE 0

Just using 0 is sneaky but yes, that's fine :)

Bikeshedding: GFP_RT_NOWAIT or such might be more obvious.

>> that would imply nowait semantic and would exclude waking up kswapd as
>> well. If we can make wake up safe under RT then the alias would reflect
>> that without any code changes.

It basically requires converting the wait queue to something else. Is
the waitqueue strictly single waiter?

>> The second, and the more important part, would be to bail out anytime
>> the page allocator is to take a lock which is not allowed in the current
>> RT context. Something like

>> + /*
>> + * Hard atomic contexts are not supported by the allocator for
>> + * anything but pcp requests
>> + */
>> + if (!preemtable())

If you make that preemptible() it might even compile, but that still won't
work because if CONFIG_PREEMPT_COUNT=n then preemptible() is always
false.

So that should be:

if (!preemptible() && gfp == GFP_RT_NOWAIT)

which is limiting the damage to those callers which hand in
GFP_RT_NOWAIT.

lockdep will yell at invocations with gfp != GFP_RT_NOWAIT when it hits
zone->lock in the wrong context. And we want to know about that so we
can look at the caller and figure out how to solve it.
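
To make the intended control flow concrete, here is a rough userspace
sketch of that check, not actual page allocator code. GFP_RT_NOWAIT and
preemptible() are the names used above; everything else
(alloc_page_sketch(), ctx_preemptible, pcp_free_pages, the result enum)
is hypothetical and only models "serve from the pcp list, otherwise bail
out before zone->lock".

```c
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Alias discussed in this thread: plain 0, i.e. no reclaim, no kswapd wakeup. */
#define GFP_RT_NOWAIT ((gfp_t)0)

/* Hypothetical stand-in for the kernel's preemptible(): a settable flag. */
static bool ctx_preemptible;
static bool preemptible(void) { return ctx_preemptible; }

/* Hypothetical model of a per-CPU (pcp) list: count of free order-0 pages. */
static int pcp_free_pages;

enum alloc_result {
	ALLOC_FROM_PCP,     /* served without taking zone->lock */
	ALLOC_FROM_ZONE,    /* would take zone->lock */
	ALLOC_FAILED_EARLY, /* bailed out instead of taking zone->lock */
};

static enum alloc_result alloc_page_sketch(gfp_t gfp)
{
	/* Fast path: the pcp list can satisfy an order-0 request. */
	if (pcp_free_pages > 0) {
		pcp_free_pages--;
		return ALLOC_FROM_PCP;
	}

	/*
	 * The pcp list is empty, so the next step needs zone->lock.
	 * Only callers which handed in GFP_RT_NOWAIT fail early here.
	 */
	if (!preemptible() && gfp == GFP_RT_NOWAIT)
		return ALLOC_FAILED_EARLY;

	/*
	 * Everyone else proceeds. In the kernel, lockdep would complain
	 * here if the context does not allow taking zone->lock.
	 */
	return ALLOC_FROM_ZONE;
}
```

With ctx_preemptible == false and an empty pcp list, a GFP_RT_NOWAIT
caller gets ALLOC_FAILED_EARLY while any other mask falls through to the
zone->lock path. On a CONFIG_PREEMPT_COUNT=n kernel the real
preemptible() is always false, so in this model GFP_RT_NOWAIT callers
would always fail once the pcp list is empty.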

>> > The page allocator allocations should also have a limit on the number of
>> > pages and eventually also page order (need to stare at the code or let
>> > Michal educate me that the order does not matter).
>>
>> In practice anything but order 0 is out of question because we need
>> zone->lock for that currently. Maybe we can introduce pcp lists for
>> higher orders in the future - I have a vague recollection Mel was
>> playing with that some time ago.

Ok.

>> > To make it consistent the same GFP_ variant should allow the slab
>> > allocator go to the point where the slab cache is exhausted.
>> >
>> > Having a distinct and clearly defined GFP_ variant is really key to
>> > chase down offenders and to make reviewers double check upfront why this
>> > is absolutely required.
>>
>> Having a high level and recognizable gfp mask is OK but I would really
>> like not to introduce a dedicated flag. The page allocator should be
>> able to recognize the context which cannot be handled.

The GFP_xxx == 0 is perfectly fine.

> Sorry for jumping in. We can rely on preemptable() for sure, if CONFIG_PREEMPT_RT
> is enabled, something like below:
>
> if (IS_ENABLED_RT && preemptebale())

Ha, you morphed preemtable() into preemptebale() which will not compile
either :)

Thanks,

tglx