Re: [PATCH v4 3/4] drm/vc4: Check for the binner bo before handling OOM interrupt

From: Paul Kocialkowski
Date: Thu Apr 04 2019 - 10:33:29 EST


Hey,

On Wednesday, 3 April 2019 at 11:58 -0700, Eric Anholt wrote:
> Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx> writes:
>
> > Since the OOM interrupt directly deals with the binner bo, it doesn't
> > make sense to try and handle it without a binner buffer registered.
> > The interrupt will kick again in due time, so we can safely ignore it
> > without a binner bo allocated.
> >
> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@xxxxxxxxxxx>
> > ---
> > drivers/gpu/drm/vc4/vc4_irq.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
> > index ffd0a4388752..723dc86b4511 100644
> > --- a/drivers/gpu/drm/vc4/vc4_irq.c
> > +++ b/drivers/gpu/drm/vc4/vc4_irq.c
> > @@ -64,6 +64,9 @@ vc4_overflow_mem_work(struct work_struct *work)
> > struct vc4_exec_info *exec;
> > unsigned long irqflags;
>
> Since OOM handling is tricky, could we add a comment to help the next
> person try to understand it:
>
> /* The OOM IRQ is level-triggered, so we'll see one at power-on before
> * any jobs are submitted. The OOM IRQ is masked when this work is
> * scheduled, so we can safely return if there's no binner memory
> * (because no client is currently using 3D). When a bin job is
> * later submitted, its tile memory allocation will end up bringing us
> * back to a non-OOM state so the OOM can be triggered again.
> */
>
> But, actually, I don't see how the OOM IRQ will ever get re-enabled.

Okay, so I investigated this to try and understand what's going on.
We definitely write the OUTOMEM bit to V3D_INTDIS just before
scheduling the workqueue, and we never re-enable the IRQ when leaving
the work function early because of !vc4->bin_bo.

It turns out that what saves us here is vc4_irq_postinstall being
called from runtime resume at "the right time". Obviously this is more
than fragile, so we should really be re-enabling the IRQ as soon as we
have the binner bo allocated.
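
To make the sequence above concrete, here is a toy user-space model in C.
It is only a sketch, not driver code: struct toy_v3d, the toy_* functions
and the bit position of INT_OUTOMEM are all made up for illustration, and
the int_enabled field merely stands in for the effect of writing
V3D_INTENA/V3D_INTDIS. It assumes the work function unmasks the IRQ once
it has handled the overflow, and that the fix is to unmask it as soon as
the binner bo exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: names echo the thread, but none of this is driver code. */
#define INT_OUTOMEM (1u << 2)	/* bit position assumed for illustration */

struct toy_v3d {
	uint32_t int_enabled;	/* stands in for V3D_INTENA/V3D_INTDIS state */
	bool oom_level;		/* level-triggered out-of-memory condition */
	bool bin_bo;		/* has a binner bo been allocated? */
	bool work_pending;	/* has the overflow work been scheduled? */
	unsigned int overflow_handled;
};

/* Models the hard IRQ path: mask OUTOMEM, then defer to the work item. */
static void toy_irq(struct toy_v3d *v)
{
	if (v->oom_level && (v->int_enabled & INT_OUTOMEM)) {
		v->int_enabled &= ~INT_OUTOMEM;	/* like writing V3D_INTDIS */
		v->work_pending = true;
	}
}

/* Models the work function with the early return added by this patch. */
static void toy_overflow_work(struct toy_v3d *v)
{
	/* The IRQ was masked before the work was scheduled. */
	assert(!(v->int_enabled & INT_OUTOMEM));
	v->work_pending = false;

	if (!v->bin_bo)
		return;		/* nothing unmasks the IRQ on this path */

	v->overflow_handled++;
	v->oom_level = false;		/* overflow memory was supplied */
	v->int_enabled |= INT_OUTOMEM;	/* like writing V3D_INTENA */
}

/* Hypothetical fix: unmask the IRQ as soon as the binner bo exists. */
static void toy_bin_bo_alloc(struct toy_v3d *v)
{
	v->bin_bo = true;
	v->int_enabled |= INT_OUTOMEM;
}
```

Without the unmask in toy_bin_bo_alloc(), the model stays stuck with
OUTOMEM masked after the first early return, which is the situation that
only vc4_irq_postinstall at runtime resume currently gets us out of.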

Since we now allocate the binner bo at the first non-dumb bo alloc, I
think we need to make sure that we did in fact get the IRQ and registered
the allocated bo with the workqueue before submitting the RCL. Or does the
hardware provide any mechanism to take that off our hands somehow?

What do you think?

Cheers,

Paul

--
Paul Kocialkowski, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com