Re: shrink_active_list/try_to_release_page bug? (was Re: xfs trace in 4.4.2 / also in 4.3.3 WARNING fs/xfs/xfs_aops.c:1232 xfs_vm_releasepage)

From: Dave Chinner
Date: Tue May 31 2016 - 02:08:12 EST


On Tue, May 31, 2016 at 12:59:04PM +0900, Minchan Kim wrote:
> On Tue, May 31, 2016 at 12:55:09PM +1000, Dave Chinner wrote:
> > On Tue, May 31, 2016 at 10:07:24AM +0900, Minchan Kim wrote:
> > > On Tue, May 31, 2016 at 08:36:57AM +1000, Dave Chinner wrote:
> > > > But this is a dirty page, which means it may have delalloc or
> > > > unwritten state on it's buffers, both of which indicate that there
> > > > is dirty data in teh page that hasn't been written. XFS issues a
> > > > warning on this because neither shrink_active_list nor
> > > > try_to_release_page() check for whether the page is dirty or not.
> > > >
> > > > Hence it seems to me that shrink_active_list() is calling
> > > > try_to_release_page() inappropriately, and XFS is just the
> > > > messenger. If you turn laptop mode on, it is likely the problem will
> > > > go away as kswapd will run with .may_writepage = false, but that
> > > > will also cause other behavioural changes relating to writeback and
> > > > memory reclaim. It might be worth trying as a workaround for now.
> > > >
> > > > MM-folk - is this analysis correct? If so, why is
> > > > shrink_active_list() calling try_to_release_page() on dirty pages?
> > > > Is this just an oversight or is there some problem that this is
> > > > trying to work around? It seems trivial to fix to me (add a
> > > > !PageDirty check), but I don't know why the check is there in the
> > > > first place...
> > >
> > > It seems to be latter.
> > > Below commit seems to be related.
> > > [ecdfc9787fe527, Resurrect 'try_to_free_buffers()' VM hackery.]
> >
> > Okay, that's been there a long, long time (2007), and it covers a
> > case where the filesystem cleans pages without the VM knowing about
> > it (i.e. it marks bufferheads clean without clearing the PageDirty
> > state).
> >
> > That does not explain the code in shrink_active_list().
>
> Yeb, My point was the patch removed the PageDirty check in
> try_to_free_buffers.

*nod*

[...]

> And I found a culprit.
> e182d61263b7d5, [PATCH] buffer_head takedown for bighighmem machines

Heh. You have the combined historic tree sitting around for code
archeology, just like I do :)

> It introduced pagevec_strip wich calls try_to_release_page without
> PageDirty check in refill_inactive_zone which is shrink_active_list
> now.

<sigh>

It was merged 2 days before XFS was merged. Merging XFS made the
code Andrew wrote incorrect:

> Quote from
> "
> In refill_inactive(): if the number of buffer_heads is excessive then
> strip buffers from pages as they move onto the inactive list. This
> change is useful for all filesystems. [....]

Except for those that carry state necessary for writeback to be done
correctly on the dirty page bufferheads. At the time, nobody doing
work the mm/writeback code cared about delayed allocation. So we've
carried this behaviour for 14 years without realising that it's
probably the source of all the unexplainable warnings we've got from
XFS over all that time.

I'm half tempted at this point to mostly ignore this mm/ behavour
because we are moving down the path of removing buffer heads from
XFS. That will require us to do different things in ->releasepage
and so just skipping dirty pages in the XFS code is the best thing
to do....

Cheers,

Dave.

--
Dave Chinner
david@xxxxxxxxxxxxx