Re: [PATCH] xfs: don't do inodgc work if task is exiting

From: Dave Chinner
Date: Fri May 12 2023 - 18:01:47 EST


On Fri, May 12, 2023 at 09:16:36AM -0600, Tycho Andersen wrote:
> On Fri, May 12, 2023 at 11:45:47AM +1000, Dave Chinner wrote:
> >
> > Yeah, this is papering over the observed symptom, not addressing the
> > root cause of the inodegc flush delay. What do you see when you run
> > sysrq-w and sysrq-l? Are there inodegc worker threads blocked
> > performing inodegc?
>
> I will try this next time we encounter this.
>
> > e.g. inodegc flushes could simply be delayed by an unlinked inode
> > being processed that has millions of extents that need to be freed.
> >
> > In reality, inode reclaim can block for long periods of time
> > on any filesystem, so the concept of "inode reclaim should
> > not block when PF_EXITING" is not a behaviour that we guarantee
> > anywhere or could guarantee across the board.
> >
> > Let's get to the bottom of why inodegc has apparently stalled before
> > trying to work out how to fix it...
>
> I'm happy to try, but I think it is also worth applying this patch.
> Like I said in the other thread, having to evac a box to get rid of an
> unkillable userspace process is annoying.

If inodegc is stuck, then it's only a matter of time before the
filesystem completely locks up and you have to cycle the machine
anyway. This patch merely kicks the can down the road a few minutes;
it doesn't change anything material.
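
For reference, the sysrq-w and sysrq-l output asked about earlier in
the thread can be captured along these lines. This is only a sketch:
sysrq_dump is a made-up helper name, it assumes root, and the trigger
path is a parameter purely so it can be tried against a scratch file
first.

```sh
#!/bin/sh
# Sketch only: sysrq_dump is a hypothetical helper, not an existing tool.
sysrq_dump() {
    # Default to the real trigger file; pass another path for a dry run.
    trigger="${1:-/proc/sysrq-trigger}"
    # w: dump tasks in uninterruptible (blocked) state
    # l: dump backtraces for all active CPUs
    for key in w l; do
        echo "$key" > "$trigger" || return 1
    done
}
```

Calling sysrq_dump with no argument as root and then reading dmesg
should show whether any inodegc workers are blocked and what the
active CPUs were doing at the time.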

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx