Re: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and `mem_cgroup_shrink_node`

From: Paul Menzel
Date: Wed Nov 30 2016 - 07:31:50 EST


On 11/30/16 12:54, Paul E. McKenney wrote:
> On Wed, Nov 30, 2016 at 03:53:20AM -0800, Paul E. McKenney wrote:
>> On Wed, Nov 30, 2016 at 12:09:44PM +0100, Michal Hocko wrote:
>>> [CCing Paul]
>>>
>>> On Wed 30-11-16 11:28:34, Donald Buczek wrote:
>>> [...]
>>>> shrink_active_list gets and releases the spinlock and calls cond_resched().
>>>> This should give other tasks a chance to run. Just as an experiment, I'm
>>>> trying
>>>>
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1921,7 +1921,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
>>>>          spin_unlock_irq(&pgdat->lru_lock);
>>>>
>>>>          while (!list_empty(&l_hold)) {
>>>> -                cond_resched();
>>>> +                cond_resched_rcu_qs();
>>>>                  page = lru_to_page(&l_hold);
>>>>                  list_del(&page->lru);
>>>>
>>>> and didn't hit a rcu_sched warning for >21 hours uptime now. We'll see.
>>>
>>> This is really interesting! Is it possible that the RCU stall detector
>>> is somehow confused?
>>
>> No, it is not confused. Again, cond_resched() is not a quiescent
>> state unless it does a context switch. Therefore, if the task running
>> in that loop was the only runnable task on its CPU, cond_resched()
>> would -never- provide RCU with a quiescent state.
>>
>> In contrast, cond_resched_rcu_qs() unconditionally provides RCU
>> with a quiescent state (hence the _rcu_qs in its name), regardless
>> of whether or not a context switch happens.
>>
>> It is therefore expected behavior that this change might prevent
>> RCU CPU stall warnings.
>
> I should add... This assumes that CONFIG_PREEMPT=n. So what is
> CONFIG_PREEMPT?

It’s not selected.

```
# CONFIG_PREEMPT is not set
```
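
To make the distinction described in the quoted explanation concrete, here is a minimal userspace sketch in C. It is only a toy model under stated assumptions, not kernel code: `struct model_cpu`, `model_cond_resched()`, `model_cond_resched_rcu_qs()` and `model_scan_loop()` are hypothetical names invented for illustration, and it assumes a non-preemptible kernel (CONFIG_PREEMPT not set) where a quiescent state is only noted on a context switch or when reported explicitly.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU; none of these names are kernel APIs. */
struct model_cpu {
    int other_runnable_tasks;   /* runnable tasks besides the looping one */
    unsigned long qs_reported;  /* quiescent states the modelled RCU has seen */
    unsigned long ctx_switches; /* context switches that actually happened */
};

/*
 * Models cond_resched() with CONFIG_PREEMPT=n: it only switches context
 * when another task is runnable, and only that switch counts as a
 * quiescent state. Returns true if a context switch happened.
 */
static bool model_cond_resched(struct model_cpu *cpu)
{
    if (cpu->other_runnable_tasks == 0)
        return false;           /* nothing to switch to: no quiescent state */
    cpu->ctx_switches++;
    cpu->qs_reported++;         /* the context switch is the quiescent state */
    return true;
}

/*
 * Models cond_resched_rcu_qs(): if no context switch happened, a
 * quiescent state is reported explicitly, so one is noted on every call.
 */
static void model_cond_resched_rcu_qs(struct model_cpu *cpu)
{
    if (!model_cond_resched(cpu))
        cpu->qs_reported++;     /* explicit report without a context switch */
}

/* Runs a scan-style loop and returns how many quiescent states it produced. */
static unsigned long model_scan_loop(struct model_cpu *cpu,
                                     unsigned long iterations,
                                     bool use_rcu_qs)
{
    unsigned long start = cpu->qs_reported;

    for (unsigned long i = 0; i < iterations; i++) {
        if (use_rcu_qs)
            model_cond_resched_rcu_qs(cpu);
        else
            model_cond_resched(cpu);
    }
    return cpu->qs_reported - start;
}
```

In this model, a loop that is the only runnable task on its CPU produces no quiescent states at all with `model_cond_resched()`, however long it runs, while the `_rcu_qs` variant produces one per iteration. That matches the reasoning in the quoted explanation for why the experimental patch might make the stall warnings go away.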

>>>> Is preemption disabled for another reason?
>>>
>>> I do not think so. I will have to double check the code but this is a
>>> standard sleepable context. Just wondering what is the PREEMPT
>>> configuration here?


Kind regards,

Paul