Re: [PATCH v7 4/6] rcu: Add RCU stall diagnosis information

From: Leizhen (ThunderTown)
Date: Thu Nov 17 2022 - 08:25:53 EST




On 2022/11/17 20:22, Frederic Weisbecker wrote:
> On Thu, Nov 17, 2022 at 09:57:18AM +0800, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/11/17 6:39, Frederic Weisbecker wrote:
>>> On Fri, Nov 11, 2022 at 09:07:07PM +0800, Zhen Lei wrote:
>>>> @@ -262,6 +279,8 @@ struct rcu_data {
>>>>  	short rcu_onl_gp_flags;		/* ->gp_flags at last online. */
>>>>  	unsigned long last_fqs_resched;	/* Time of last rcu_resched(). */
>>>>  	unsigned long last_sched_clock;	/* Jiffies of last rcu_sched_clock_irq(). */
>>>> +	struct rcu_snap_record snap_record; /* Snapshot of core stats at half of */
>>>> +					    /* the first RCU stall timeout */
>>>
>>> This should be under #ifdef CONFIG_RCU_CPU_STALL_CPUTIME
>>
>> This will not work for now because we also support boot option
>> rcupdate.rcu_cpu_stall_cputime.
>
> I'm confused. If CONFIG_RCU_CPU_STALL_CPUTIME=n then rcupdate.rcu_cpu_stall_cputime has
> no effect, right?

No, rcupdate.rcu_cpu_stall_cputime overrides CONFIG_RCU_CPU_STALL_CPUTIME. Because
CONFIG_RCU_CPU_STALL_CPUTIME defaults to n, in most cases we need
rcupdate.rcu_cpu_stall_cputime as the escape route, so the snap_record field
has to exist even when the config option is not set.

If CONFIG_RCU_CPU_STALL_CPUTIME=y were the default, your suggestion would be more
appropriate.
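
To make the override concrete, here is a minimal userspace sketch of the idea.
It is not the kernel code: STALL_CPUTIME_DEFAULT stands in for the Kconfig
default, and effective_stall_cputime() is a hypothetical helper that stands in
for the boot-parameter parsing. The point is only that the final value is
decided at runtime, so nothing that depends on it can be compiled out.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for the Kconfig default (assumption for this sketch).
 * Build with -DSTALL_CPUTIME_DEFAULT=1 to mimic a =y configuration.
 */
#ifndef STALL_CPUTIME_DEFAULT
#define STALL_CPUTIME_DEFAULT 0
#endif

/*
 * Hypothetical helper: return the effective setting for a boot-style
 * command line. If the parameter is absent, the build-time default
 * applies; if it is present, its value wins over the default.
 */
static int effective_stall_cputime(const char *cmdline)
{
	static const char key[] = "rcupdate.rcu_cpu_stall_cputime=";
	const char *p = cmdline ? strstr(cmdline, key) : NULL;

	if (!p)
		return STALL_CPUTIME_DEFAULT;

	/* Any non-zero value enables the feature. */
	return atoi(p + strlen(key)) != 0;
}
```

With the default left at 0, effective_stall_cputime("quiet") gives 0, while
effective_stall_cputime("quiet rcupdate.rcu_cpu_stall_cputime=1") gives 1.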

>
> Thanks.
>
>>
>>>
>>>> +static void print_cpu_stat_info(int cpu)
>>>> +{
>>>> +	struct rcu_snap_record rsr, *rsrp;
>>>> +	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
>>>> +	struct kernel_cpustat *kcsp = &kcpustat_cpu(cpu);
>>>> +
>>>> +	if (!rcu_cpu_stall_cputime)
>>>> +		return;
>>>> +
>>>> +	rsrp = &rdp->snap_record;
>>>> +	if (rsrp->gp_seq != rdp->gp_seq)
>>>> +		return;
>>>> +
>>>> +	rsr.cputime_irq = kcpustat_field(kcsp, CPUTIME_IRQ, cpu);
>>>> +	rsr.cputime_softirq = kcpustat_field(kcsp, CPUTIME_SOFTIRQ, cpu);
>>>> +	rsr.cputime_system = kcpustat_field(kcsp, CPUTIME_SYSTEM, cpu);
>>>> +
>>>> +	pr_err("\t hardirqs softirqs csw/system\n");
>>>> +	pr_err("\t number: %8ld %10d %12lld\n",
>>>> +		kstat_cpu_irqs_sum(cpu) - rsrp->nr_hardirqs,
>>>> +		kstat_cpu_softirqs_sum(cpu) - rsrp->nr_softirqs,
>>>> +		nr_context_switches_cpu(cpu) - rsrp->nr_csw);
>>>> +	pr_err("\tcputime: %8lld %10lld %12lld ==> %lld(ms)\n",
>>>> +		div_u64(rsr.cputime_irq - rsrp->cputime_irq, NSEC_PER_MSEC),
>>>> +		div_u64(rsr.cputime_softirq - rsrp->cputime_softirq, NSEC_PER_MSEC),
>>>> +		div_u64(rsr.cputime_system - rsrp->cputime_system, NSEC_PER_MSEC),
>>>> +		jiffies64_to_msecs(jiffies - rsrp->jiffies));
>>>
>>> jiffies_to_msecs() should be enough.
>>
>> OK, thanks.
>>
>>>
>>> Thanks.
>>>
>>> .
>>>
>>
>> --
>> Regards,
>> Zhen Lei
> .
>

--
Regards,
Zhen Lei