Re: xfs: list corruption in xfs_setup_inode()

From: Christoph Hellwig
Date: Wed Nov 01 2017 - 11:01:44 EST


On Wed, Nov 01, 2017 at 04:07:01PM +1100, Dave Chinner wrote:
> > We are trying to get kdump working, but even if kdump works
> > we still can't turn on panic_on_warn since this is a production
> > machine.
>
> Hmmm. Ok, maybe you could leave a trace of the xfs_iget* trace
> points running and check the log tail for unusual events around the
> time of the next crash. e.g. xfs_iget_reclaim_fail events. That
> might point us to a potential interaction we can look at more
> closely. I'd also suggest slab poisoning, as that will
> catch other lifecycle problems that could be causing list
> corruptions such as use-after-free.

KASAN has also been really useful for these kinds of issues.
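
For the tracing suggestion quoted above, here is a minimal sketch of
switching on every xfs_iget* trace point in one go. It assumes tracefs
is reachable at /sys/kernel/debug/tracing (it may be mounted elsewhere
on your system), that the xfs events live under events/xfs there, and
that it runs as root. The function name and its parameters are made up
for illustration, not an existing tool.

```python
import glob
import os


def enable_xfs_iget_events(tracefs_root="/sys/kernel/debug/tracing",
                           pattern="xfs_iget*"):
    """Write "1" to the enable file of each xfs event matching pattern.

    tracefs_root is an assumption; adjust it to wherever tracefs is
    mounted. Returns the names of the events that were enabled.
    """
    event_dir = os.path.join(tracefs_root, "events", "xfs")
    enabled = []
    # Each event has its own directory containing an "enable" file.
    for path in sorted(glob.glob(os.path.join(event_dir, pattern, "enable"))):
        with open(path, "w") as f:
            f.write("1")
        enabled.append(os.path.basename(os.path.dirname(path)))
    return enabled
```

Calling enable_xfs_iget_events() with no arguments uses the assumed
default mount point and returns the list of events it turned on, so
you can check that xfs_iget_reclaim_fail is among them before leaving
the trace running.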