Re: [PATCH] sched/tracing: correct the task blocking state

From: Alex Shi
Date: Tue Jan 02 2024 - 22:15:19 EST


On Wed, Jan 3, 2024 at 12:16 AM Valentin Schneider <vschneid@xxxxxxxxxx> wrote:
>
> On 02/01/24 21:00, Alex Shi wrote:
> > On Tue, Jan 2, 2024 at 6:19 PM Valentin Schneider <vschneid@xxxxxxxxxx> wrote:
> >>
> >> On 02/01/24 15:33, alexs@xxxxxxxxxx wrote:
> >> > From: Alex Shi <alexs@xxxxxxxxxx>
> >> >
> >> > commit 80ed87c8a9ca ("sched/wait: Introduce TASK_NOLOAD and TASK_IDLE")
> >> > stopped the idle kthreads contribution to loadavg. Also task idle should
> >> > separated from blocked state too, otherwise we will get incorrect task
> >> > blocking state from event tracing sched:sched_stat_blocked.
> >> >
> >>
> >> Why is that incorrect? AFAICT we have mapped the (schedstat) 'blocked'
> >> meaning to TASK_UNINTERRUPTIBLE. TASK_IDLE tasks don't contribute to
> >> loadavg yes, but they are still in an UNINTERRUPTIBLE wait.
> >
> >
> > Hi Valentin,
> > Thanks a lot for the reply.
> >
> > I agree with you the current usage, but if so, we account for the idle task into
> > blocked state. And it's better to distinguish between idle and block.
> >
>
> Why is that an issue? If those tasks didn't have to be
> TASK_UNINTERRUPTIBLE (via TASK_IDLE), we'd make them TASK_INTERRUPTIBLE and
> they'd also inflate the 'sleeping' schedstat (rather than the 'blocked').
>
> What problem are you facing with those tasks being flagged as blocked during
> their wait?
>

Tencent Cloud runs some latency-sensitive services, and for them a
blocked state means the service is in trouble. With TASK_IDLE tasks
counted as blocked, that judgement no longer works.

Second, when a service misbehaves we want to check whether it is
hanging on I/O or on something else. But the top 3 D-state tasks on
our systems are often workqueue tasks, and even when a task is
reported as blocked we have no quick way to tell whether it is really
IDLE or BLOCKED. Two distinct states would help us a lot.
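
To make the distinction concrete, here is a rough userspace sketch (not
the patch itself). The flag values are as I recall them from
include/linux/sched.h and should be checked against your tree; the
helper names are hypothetical and exist only for illustration. It shows
that a check on the TASK_UNINTERRUPTIBLE bit alone, which is how the
'blocked' schedstat is described above, also matches TASK_IDLE, while
additionally looking at TASK_NOLOAD separates the two.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, recalled from include/linux/sched.h; verify locally. */
#define TASK_UNINTERRUPTIBLE	0x00000002
#define TASK_NOLOAD		0x00000400
#define TASK_IDLE		(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)

/* Sanity check on the assumed values: TASK_IDLE is both bits combined. */
static_assert(TASK_IDLE == 0x402, "TASK_IDLE must be UNINTERRUPTIBLE | NOLOAD");

/* Hypothetical helper: a check on the UNINTERRUPTIBLE bit only. */
static bool counted_as_blocked_today(unsigned int state)
{
	return (state & TASK_UNINTERRUPTIBLE) != 0;
}

/* Hypothetical helper: uninterruptible wait that does not count as load. */
static bool state_is_idle(unsigned int state)
{
	return (state & TASK_IDLE) == TASK_IDLE;
}

/* Hypothetical helper: uninterruptible wait, excluding TASK_IDLE. */
static bool state_is_blocked(unsigned int state)
{
	return (state & TASK_UNINTERRUPTIBLE) && !(state & TASK_NOLOAD);
}
```

With these helpers a TASK_IDLE task passes counted_as_blocked_today()
but not state_is_blocked(), which is the separation we are after.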

Thanks!