Re: [PATCH] debugobject: don't wake up kswapd from fill_pool()

From: Tetsuo Handa
Date: Fri May 12 2023 - 19:15:23 EST


On 2023/05/13 3:07, Thomas Gleixner wrote:
> On Fri, May 12 2023 at 22:09, Tetsuo Handa wrote:
>> On 2023/05/12 21:54, Thomas Gleixner wrote:
>>> On Fri, May 12 2023 at 19:57, Tetsuo Handa wrote:
>>>> On 2023/05/12 12:44, Andrew Morton wrote:
>>>>> On Thu, 11 May 2023 22:47:32 +0900 Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>>> syzbot is reporting lockdep warning in fill_pool(), for GFP_ATOMIC is
>>>>>> (__GFP_HIGH | __GFP_KSWAPD_RECLAIM) which wakes up kswapd.
>>>>>> Since fill_pool() might be called with arbitrary locks held,
>>>>>> fill_pool() should not assume that holding pgdat->kswapd_wait is safe.
>>>
>>> https://lore.kernel.org/lkml/871qjldbes.ffs@tglx/
>>
>> .config says IS_ENABLED(CONFIG_PREEMPT_RT) == false, and lockdep reports a
>> base->lock => pgdat->kswapd_wait => p->pi_lock => rq->__lock => base->lock
>> dependency but says nothing about db->lock.
>>
>> How can your patch fix this problem?
>
> It's described in the changelog, no?

I can't find a proof that lookup_object() never returns NULL
when debug_object_activate() is called.

>
> The main change is to make the refill invocation conditional when the
> lookup fails. That's how that code has been from day one.

Making the refill conditional helps reduce the frequency of allocations.
I want proof that an allocation never happens, even in the worst-case scenario.

Are you saying that some debugobject function other than debug_object_activate()
guarantees that memory for that object has already been allocated before
debug_object_activate() is called for the first time for that object,
_and_ that this other function is called without locks held?
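
To make the concern concrete, here is a minimal userspace sketch. It is
not lib/debugobjects.c: POOL_MIN, TRACKED_MAX, the counters and run_demo()
are hypothetical names, and fill_pool() only counts allocations instead of
calling a real allocator. It assumes the refill is attempted only when the
lookup fails, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; not the kernel implementation. */
#define POOL_MIN    4   /* assumed minimum pool level */
#define TRACKED_MAX 16  /* assumed capacity of the tracking table */

static int pool_free;                   /* preallocated objects available */
static int refill_allocs;               /* allocations done by fill_pool() */
static const void *tracked[TRACKED_MAX];
static size_t ntracked;

/* Returns true if addr is already tracked. */
static bool lookup_object(const void *addr)
{
    for (size_t i = 0; i < ntracked; i++)
        if (tracked[i] == addr)
            return true;
    return false;
}

/* Stand-in for the refill: "allocates" until the pool reaches POOL_MIN. */
static void fill_pool(void)
{
    while (pool_free < POOL_MIN) {
        pool_free++;
        refill_allocs++;
    }
}

/* Refill only when the lookup fails, then take one object from the pool. */
static void debug_object_activate(const void *addr)
{
    if (lookup_object(addr))
        return;                 /* hit: no refill, no allocation */

    fill_pool();                /* miss: may allocate */
    if (ntracked < TRACKED_MAX && pool_free > 0) {
        pool_free--;
        tracked[ntracked++] = addr;
    }
}

/* First activation of an untracked object with an empty pool. */
static int run_demo(void)
{
    static int obj;

    pool_free = 0;
    refill_allocs = 0;
    ntracked = 0;

    debug_object_activate(&obj);    /* miss: refill runs */
    int after_miss = refill_allocs;

    debug_object_activate(&obj);    /* hit: refill skipped */
    assert(refill_allocs == after_miss);

    printf("allocations on first activation: %d\n", after_miss);
    return after_miss;
}
```

In this model the conditional refill removes allocations only on the hit
path. If debug_object_activate() is the first debugobject call for an
object and the pool is below the minimum, the allocation still happens in
whatever locking context the caller is in.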

>
> The patch which closed the race recently wrecked those refill
> opportunities and the fix for that introduced this problem.
>
> Thanks,
>
> tglx