Re: [PATCH] writeback: Fix broken sync writeback

From: Linus Torvalds
Date: Tue Feb 16 2010 - 19:03:25 EST

On Tue, 16 Feb 2010, Linus Torvalds wrote:
>
> For example, it might be that the logic in writeback_inodes_wb() moves an
> inode back (the "redirty_tail()" cases) in bad ways when it shouldn't.

Or another example: perhaps we screw up the inode list ordering when we
move the inodes between b_dirty <-> b_io <-> b_more_io?

And if we put inodes on the b_more_io list, do we perhaps then end up
doing too much waiting in inode_wait_for_writeback()?
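
To make the ordering worry concrete, here is a toy model in C. It is only a
sketch and not the kernel code: the toy_* names, the array-backed lists and
the two refill orders are all made up for illustration, and nothing here
claims which order the real code uses. It just shows that refilling b_io
from b_dirty and b_more_io in the "wrong" order leaves a newer inode ahead
of an older one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX 16

/* Hypothetical stand-in for an inode: only the fields the toy needs. */
struct toy_inode {
    int id;
    unsigned long dirtied_when;   /* smaller value = dirtied earlier */
};

/* Array-backed FIFO standing in for one of b_dirty / b_io / b_more_io. */
struct toy_list {
    struct toy_inode items[TOY_MAX];
    size_t len;
};

static void toy_push_tail(struct toy_list *l, struct toy_inode in)
{
    assert(l->len < TOY_MAX);
    l->items[l->len++] = in;
}

/* Remove and return the head (the entry queued first). */
static struct toy_inode toy_pop_head(struct toy_list *l)
{
    assert(l->len > 0);
    struct toy_inode in = l->items[0];
    for (size_t i = 1; i < l->len; i++)
        l->items[i - 1] = l->items[i];
    l->len--;
    return in;
}

/* Append everything from src to the tail of dst, leaving src empty. */
static void toy_splice_tail(struct toy_list *dst, struct toy_list *src)
{
    while (src->len > 0)
        toy_push_tail(dst, toy_pop_head(src));
}

/* True if entries appear oldest-dirtied first. */
static bool toy_is_time_ordered(const struct toy_list *l)
{
    for (size_t i = 1; i < l->len; i++)
        if (l->items[i - 1].dirtied_when > l->items[i].dirtied_when)
            return false;
    return true;
}

/* Run a two-round scenario and report whether b_io ends up time-ordered. */
static bool toy_refill_is_ordered(bool more_io_first)
{
    struct toy_list b_dirty = {0}, b_io = {0}, b_more_io = {0};

    /* Round 1: three inodes queued for IO, oldest first. */
    toy_push_tail(&b_io, (struct toy_inode){ .id = 1, .dirtied_when = 10 });
    toy_push_tail(&b_io, (struct toy_inode){ .id = 2, .dirtied_when = 20 });
    toy_push_tail(&b_io, (struct toy_inode){ .id = 3, .dirtied_when = 30 });

    /* Inode 1 is only partly written, so it is parked for another pass. */
    toy_push_tail(&b_more_io, toy_pop_head(&b_io));
    /* Inodes 2 and 3 are fully written and leave the lists. */
    (void)toy_pop_head(&b_io);
    (void)toy_pop_head(&b_io);

    /* Meanwhile a newer inode was dirtied. */
    toy_push_tail(&b_dirty, (struct toy_inode){ .id = 4, .dirtied_when = 40 });

    /* Round 2: refill b_io from both sources, in one of two orders. */
    if (more_io_first) {
        toy_splice_tail(&b_io, &b_more_io);
        toy_splice_tail(&b_io, &b_dirty);
    } else {
        toy_splice_tail(&b_io, &b_dirty);
        toy_splice_tail(&b_io, &b_more_io);
    }
    return toy_is_time_ordered(&b_io);
}
```

Calling toy_refill_is_ordered(true) keeps the old, half-written inode at the
head of b_io, while toy_refill_is_ordered(false) puts the newer inode in
front of it. Any IO pattern that depends on oldest-first ordering would
quietly degrade in the second case, for sync and async writeback alike.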

See what I'm saying? Your patch - by just submitting the maximal-sized
buffers - may well end up hiding the real problem. But if there is a real
problem in that whole list manipulation or waiting, then that problem
still exists for async writeback.

Wouldn't it be better to fix the real problem, so that async writeback
also gets the correct IO patterns?

NOTE! It's entirely possible that we do end up wanting to really submit
the maximal dirty IO for synchronous dirty writeback, in order to get
better IO patterns. So maybe your patch ends up being the right one in the
end. But I really _really_ want to understand this.

But right now, that patch seems like voodoo programming to me, and I
personally suspect that the real problem is in the horribly complex
b_io/b_more_io interaction. Or one of the _other_ horribly complex details
in the write-back logic (even just writing back a single inode is
complicated, see all the logic about "range_start/range_end" in the lower
level write_cache_pages() function).
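
As a rough illustration of that kind of range bookkeeping, here is a small
C sketch that turns an inclusive byte range into page indices. It is a toy
under stated assumptions, not the write_cache_pages() implementation: the
toy_* names are hypothetical, the page size is assumed to be 4 KiB, and
"0 to LLONG_MAX means the whole file" is taken as a convention for the
example only.

```c
#include <limits.h>
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12   /* assumption: 4096-byte pages */

/* Hypothetical request: inclusive byte offsets into the file. */
struct toy_range {
    long long range_start;
    long long range_end;
};

/* Result: first and last page index to visit, plus a whole-file flag. */
struct toy_pages {
    unsigned long long first;
    unsigned long long last;
    bool whole;
};

static struct toy_pages toy_byte_range_to_pages(struct toy_range r)
{
    struct toy_pages p;

    /* Shifting by the page shift drops the offset within the page. */
    p.first = (unsigned long long)r.range_start >> TOY_PAGE_SHIFT;
    p.last  = (unsigned long long)r.range_end >> TOY_PAGE_SHIFT;

    /* Toy convention: [0, LLONG_MAX] means "the whole file". */
    p.whole = (r.range_start == 0 && r.range_end == LLONG_MAX);
    return p;
}
```

With these assumptions, bytes 5000 through 12287 map to pages 1 and 2. Even
this trivial step has off-by-one traps (inclusive vs exclusive end, partial
first and last pages), and that is before any cyclic or resumed writeback
is layered on top.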

Our whole writeback code is very very complicated. I don't like it. But
that's also why I want to feel like I understand the patch when I apply
it.

Linus