Re: Linux 2.6.29

From: Linus Torvalds
Date: Wed Mar 25 2009 - 20:04:50 EST

On Wed, 25 Mar 2009, Linus Torvalds wrote:
>
> Yes, yes, it may need to allocate backing store (a page that was dirtied
> by mmap), and I'm sure that's the reason for it all,

Hmm. Thinking about that, I'm not so sure. Shouldn't that backing store
allocation happen when the page is actually dirtied on ext3?

I _suspect_ that goes back to the fact that ext3 is older than the
"aops->set_page_dirty()" callback, and nobody taught ext3 to do the bmaps
at dirty time, so now it does them at writeout time.

Anyway, there we are. Old filesystems do the wrong thing (block allocation
while doing writeout, because they don't do it when dirtying), and newer
filesystems do the wrong thing (block allocation during writeout, because
they want to do delayed allocation, so as to do the inode dirtying after
doing writeback).
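
As a rough illustration of the difference between the two policies, here is
a toy model in Python. It is not kernel code: the class, the bump allocator
and every name in it are made up for this sketch, and the only thing it
tries to show is where the allocation work lands. With allocation at dirty
time, writeout never has to allocate; with allocation deferred, every newly
dirtied page costs an allocation inside writeout.

```python
class ToyFile:
    """Toy model of a file's dirty pages. Hypothetical, not kernel code."""

    def __init__(self, allocate_at_dirty_time):
        self.allocate_at_dirty_time = allocate_at_dirty_time
        self.block_of = {}   # page index -> block number (the backing store)
        self.dirty = set()   # page indexes waiting for writeout
        self.next_free = 0   # trivial bump allocator standing in for bmap

    def _allocate(self, page):
        # Stand-in for the filesystem's block mapping. In a real filesystem
        # this is the step that may take locks or wait on a journal.
        if page in self.block_of:
            return False
        self.block_of[page] = self.next_free
        self.next_free += 1
        return True

    def set_page_dirty(self, page):
        # Policy switch: allocate now, or leave it for writeout.
        if self.allocate_at_dirty_time:
            self._allocate(page)
        self.dirty.add(page)

    def writeout(self):
        """Write all dirty pages; return how many allocations happened here."""
        allocations = 0
        for page in sorted(self.dirty):
            if self._allocate(page):
                allocations += 1
        self.dirty.clear()
        return allocations


def allocations_during_writeout(allocate_at_dirty_time, pages):
    """Dirty the given pages, then count allocations done inside writeout."""
    f = ToyFile(allocate_at_dirty_time)
    for p in pages:
        f.set_page_dirty(p)
    return f.writeout()
```

Calling allocations_during_writeout(True, [3, 1, 2]) reports no allocations
in the writeout path, while allocations_during_writeout(False, [3, 1, 2])
reports one per page. Whether the allocation step blocks is outside what
this model captures.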

And in either case, the VM is screwed, and can't ask for writeout, because
it will be randomly throttled by the filesystem. So we do lots of async
bdflush threads, which then cause IO ordering problems because now the
writeout is all in random order.
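
The ordering problem can be sketched with another toy model in Python. The
assumptions are loud ones: the dirty blocks are contiguous, each writer
thread gets one contiguous chunk, and the threads submit in strict
alternation. Real threads interleave unpredictably, so this only shows the
shape of the problem, and all function names are invented for the sketch.

```python
from itertools import zip_longest


def discontinuities(order):
    """Count submissions that do not directly follow the previous block."""
    order = list(order)
    return sum(1 for a, b in zip(order, order[1:]) if b != a + 1)


def single_writer(blocks):
    """One writer submitting everything in ascending block order."""
    return sorted(blocks)


def interleaved_writers(blocks, n_threads):
    """Several writers, each with a contiguous chunk, taking strict turns.

    Strict alternation is an assumption of this sketch; it stands in for
    whatever interleaving a real scheduler would produce.
    """
    blocks = sorted(blocks)
    chunk = -(-len(blocks) // n_threads)  # ceiling division
    queues = [blocks[i:i + chunk] for i in range(0, len(blocks), chunk)]
    order = []
    for turn in zip_longest(*queues):
        # zip_longest pads short queues with None; skip the padding.
        order.extend(b for b in turn if b is not None)
    return order
```

For eight contiguous blocks, single_writer(range(8)) produces a fully
sequential stream, while interleaved_writers(range(8), 2) submits
0, 4, 1, 5, 2, 6, 3, 7, where every submission jumps away from the
previous one.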

Linus