Re: [PATCH v2] padata: validate cpumask without removed CPU during offline

From: Daniel Jordan
Date: Thu Aug 22 2019 - 18:53:26 EST




On 8/22/19 6:10 PM, Daniel Jordan wrote:
> On 8/21/19 11:50 PM, Herbert Xu wrote:
>> On Fri, Aug 09, 2019 at 05:06:03PM -0400, Daniel Jordan wrote:
>>> diff --git a/kernel/padata.c b/kernel/padata.c
>>> index d056276a96ce..01460ea1d160 100644
>>> --- a/kernel/padata.c
>>> +++ b/kernel/padata.c
>>> @@ -702,10 +702,7 @@ static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
>>>       struct parallel_data *pd = NULL;
>>>       if (cpumask_test_cpu(cpu, cpu_online_mask)) {
>>> -
>>> -        if (!padata_validate_cpumask(pinst, pinst->cpumask.pcpu) ||
>>> -            !padata_validate_cpumask(pinst, pinst->cpumask.cbcpu))
>>> -            __padata_stop(pinst);
>>> +        __padata_stop(pinst);
>>>           pd = padata_alloc_pd(pinst, pinst->cpumask.pcpu,
>>>                        pinst->cpumask.cbcpu);
>>> @@ -716,6 +713,9 @@ static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
>>>           cpumask_clear_cpu(cpu, pd->cpumask.cbcpu);
>>>           cpumask_clear_cpu(cpu, pd->cpumask.pcpu);
>>> +        if (padata_validate_cpumask(pinst, pd->cpumask.pcpu) &&
>>> +            padata_validate_cpumask(pinst, pd->cpumask.cbcpu))
>>> +            __padata_start(pinst);
>>>       }
>>
>> I looked back at the original code and in fact the original
>> assumption is to call this after cpu_online_mask has been modified.
>>
>> So I suspect we need to change the state at which this is called
>> by CPU hotplug.
>
> Yes, the state idea is good; it's cleaner to have the CPU out of the online mask ahead of time.
>
> I think we'll need two states. We want a CPU being offlined to already be removed from the online cpumask so and'ing the user-supplied and online masks reflects conditions after the hotplug operation is finished. For the same reason we want a CPU being onlined to already be in the online mask, and we can use the existing hotplug state for that, though we'd need a new padata-specific state for the offline case.

The new state would be something before CPUHP_BRINGUP_CPU, so that the CPU is already out of the online mask when the offline callback runs.
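
To illustrate why the ordering matters, here is a toy userspace model, not kernel code. It assumes that validation boils down to "the user-supplied mask still intersects the online mask"; the type mask_t and all function names below are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy cpumask: one bit per CPU, at most 64 CPUs. Not struct cpumask. */
typedef uint64_t mask_t;

/* Hypothetical helper: clear one CPU's bit from a mask. */
static mask_t clear_cpu(mask_t mask, unsigned int cpu)
{
	assert(cpu < 64);
	return mask & ~((mask_t)1 << cpu);
}

/*
 * Assumed stand-in for the validation step: the instance stays usable
 * only if the user-supplied mask and the online mask share a CPU.
 */
static bool mask_is_valid(mask_t user, mask_t online)
{
	return (user & online) != 0;
}

/*
 * Offline callback runs while the dying CPU is still in the online
 * mask: the check sees conditions from before the hotplug operation.
 */
static bool valid_before_clear(mask_t user, mask_t online, unsigned int cpu)
{
	(void)cpu; /* the dying CPU is still counted as online */
	return mask_is_valid(user, online);
}

/*
 * Offline callback runs after the dying CPU has left the online mask:
 * the check sees conditions as they will be once hotplug finishes.
 */
static bool valid_after_clear(mask_t user, mask_t online, unsigned int cpu)
{
	return mask_is_valid(user, clear_cpu(online, cpu));
}
```

With CPUs 0-3 online (0xF) and a user mask containing only CPU 2 (0x4), offlining CPU 2 still looks valid in the first variant but is correctly reported invalid in the second, which is the case the patch is after.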

>
>> IOW the commit that broke this is 30e92153b4e6.
>
> I don't think 30e92153b4e6 is the one, since the commit before that only allows __padata_remove_cpu to do its work if @cpu is in the online mask, so the call happens before cpu_online_mask has been modified. Same story for the very first padata commit, so it seems like that one should actually be the Fixes: target.
>
>> This would also allow us to get rid of the two cpumask_clear_cpu
>> calls on pd->cpumask which is just bogus as you should only ever
>> modify the pd->cpumask prior to the padata_replace call (because
>> the readers are not serialised with respect to this).
>
> Yeah, makes sense.
>
> Daniel