Re: [PATCH 09/13] writeback: requeue_io() on redirtied inode

From: Fengguang Wu
Date: Wed Jan 16 2008 - 23:23:30 EST


On Wed, Jan 16, 2008 at 07:13:07PM +1100, David Chinner wrote:
> On Tue, Jan 15, 2008 at 08:36:46PM +0800, Fengguang Wu wrote:
> > Redirtied inodes could be seen in really fast writes.
> > They should really be synced as soon as possible.
> >
> > redirty_tail() could delay the inode for up to 30s.
> > Kill the delay by using requeue_io() instead.
>
> That's actually bad for anything that does delayed allocation
> or updates state on data I/o completion.
>
> e.g. XFS when writing past EOF doing delalloc dirties the inode
> during writeout (allocation) and then updates the file size on data
> I/o completion hence dirtying the inode again.
>
> With this change, writing the last pages out would result
> in hitting this code and causing the inode to be flushed very
> soon after the data write. Then, after the inode write is issued,
> we get data I/o completion which dirties the inode again,
> resulting in needing to write the inode again to clean it.
> i.e. it introduces a potential new and useless inode write
> I/O.
>
> Also, the immediate inode write may be useless for XFS because the
> inode may be pinned in memory due to async transactions
> still in flight (e.g. from delalloc) so we've got two
> situations where flushing the inode immediately is suboptimal.
>
> Hence I don't think this is an optimisation that should be made
> in the generic writeback code.

Thanks for the explanation.
I can confirm that many requeue_io() happened for the same XFS inode:
[ 158.794562] requeue_io 328: inode 5243009 size 34647 at 03:03(hda3)
[ 158.794827] mm/page-writeback.c 668 wb_kupdate: pdflush(183) 14209 global 486 10 0 wc _M tw 1013 sk 0
[ 158.795293] requeue_io 328: inode 5243009 size 34647 at 03:03(hda3)
[ 158.795313] mm/page-writeback.c 668 wb_kupdate: pdflush(183) 14198 global 486 10 0 wc _M tw 1024 sk 0
...
[ 170.713900] requeue_io 328: inode 5243009 size 34647 at 03:03(hda3)
[ 170.713925] mm/page-writeback.c 668 wb_kupdate: pdflush(183) 14198 global 1875 0 0 wc _M tw 1024 sk 0
[ 170.813584] mm/page-writeback.c 668 wb_kupdate: pdflush(183) 14198 global 2855 0 0 wc __ tw 1024 sk 0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/