Re: [PATCH v3 6/6] freezer,sched: Rewrite core freezer logic

From: Peter Zijlstra
Date: Mon Sep 26 2022 - 10:01:26 EST


On Mon, Sep 26, 2022 at 12:55:21PM +0200, Christian Borntraeger wrote:
>
>
> Am 26.09.22 um 10:06 schrieb Christian Borntraeger:
> >
> >
> > Am 23.09.22 um 09:53 schrieb Christian Borntraeger:
> > > Am 23.09.22 um 09:21 schrieb Christian Borntraeger:
> > > > Peter,
> > > >
> > > > as a heads-up. This commit (bisected and verified) triggers a
> > > > regression in our KVM on s390x CI. The symptom is that a specific
> > > > testcase (start a guest with next kernel and a poky ramdisk,
> > > > then ssh via vsock into the guest and run the reboot command) now
> > > > takes much longer (300 instead of 20 seconds). From a first look
> > > > it seems that the sshd takes very long to end during shutdown
> > > > but I have not looked into that yet.
> > > > Any quick idea?
> > > >
> > > > Christian
> > >
> > > the sshd seems to hang in virtio-serial (not vsock).
> >
> > FWIW, sshd does not seem to hang, instead it seems to busy loop in
> > wait_port_writable calling into the scheduler over and over again.
>
> -#define TASK_FREEZABLE 0x00002000
> +#define TASK_FREEZABLE 0x00000000
>
> "Fixes" the issue. Just have to find out which of users is responsible.

Since it's not the wait_port_writable() one -- we already tested that by
virtue of 's/wait_event_freezable/wait_event/' there -- it must be on
the producing side of that port. But I'm having a wee bit of trouble
following that code.

Is there a task stuck in FROZEN state? -- then again, I thought you said
there was no actual suspend involved, so that should not be it either.

I'm curious though -- how far does it get into the scheduler? It should
call schedule() with __state == TASK_INTERRUPTIBLE|TASK_FREEZABLE, which
is quite sufficient to get it off the runqueue. Who then puts it back?
Or is it bailing early in the wait_event loop?
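
For reference, a minimal userspace sketch of the state arithmetic in
question. The two helpers, sleep_state() and leaves_runqueue(), are
made-up names for illustration and are not kernel functions. The dequeue
rule is a deliberately simplified model of what schedule() does with a
non-running __state, not the real code. It includes the pending-signal
case only because that is one generic way an interruptible sleeper stays
runnable; it is not a diagnosis of this bug.

```c
#include <stdbool.h>

/* Task state bits; TASK_FREEZABLE as in the hunk quoted above. */
#define TASK_RUNNING        0x00000000
#define TASK_INTERRUPTIBLE  0x00000001
#define TASK_FREEZABLE      0x00002000

/*
 * Hypothetical helper: the state a freezable interruptible wait sleeps
 * in, with the freezable bit passed in so that the "define it to 0"
 * experiment can be modelled as well.
 */
static unsigned int sleep_state(unsigned int freezable_bit)
{
	return TASK_INTERRUPTIBLE | freezable_bit;
}

/*
 * Hypothetical helper, simplified model: a task calling schedule()
 * leaves the runqueue when its state is not TASK_RUNNING, unless it
 * sleeps interruptibly and a signal is already pending.
 */
static bool leaves_runqueue(unsigned int state, bool signal_pending)
{
	if (state == TASK_RUNNING)
		return false;
	if ((state & TASK_INTERRUPTIBLE) && signal_pending)
		return false;	/* stays runnable, returns to the wait loop */
	return true;
}
```

With the original define the sleeping state is 0x2001, with the define
set to 0 it collapses to plain TASK_INTERRUPTIBLE. In this model both
values take the task off the runqueue when no signal is pending, so the
model alone does not explain the busy loop.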