Re: [git patches] xfs and block fixes for virtually indexed arches

From: tytso
Date: Thu Dec 17 2009 - 11:31:09 EST


On Thu, Dec 17, 2009 at 08:16:12AM -0800, Linus Torvalds wrote:
>
> I hate them.
>
> I don't see what the point of allowing kernel virtual addresses in bio's
> is. It's wrong. The fact that XFS does that sh*t is an XFS issue. Handle
> it there.
>
> Fix XFS. Or convince me with some really good arguments, and make sure
> that Jens signs off on the cr*p too.

I have a somewhat similar issue that comes up for ext4; at the moment
we occasionally need to clone a buffer head buffer, either because the
first four bytes match the magic JBD "escape sequence", and we need to
modify the block and escape it before writing it to the journal, or
because we need to make a copy of an allocation bitmap block so we can
write a fixed copy to disk while we modify the "real" block during a
commit. At the moment we allocate a full page, even if that means
allocating a 16k PPC page when the file system block size is 4k, or
allocating a 4k x86 page when the file system block size is 1k.
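
To make the escape case and the wasted space concrete, here is a rough
userspace sketch. It is not the actual jbd/ext4 code: the helper names
(needs_escape, clone_escaped, wasted_bytes) are made up for
illustration, malloc() stands in for whatever allocator the kernel
would use, and zeroing the first four bytes of the copy is my
understanding of how the journal escapes a block. The magic value is
the JBD magic number as I know it, stored big-endian on disk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* JBD magic number (assumed here to be 0xC03B3998, big-endian on disk). */
#define JBD_MAGIC 0xC03B3998U

/* Return 1 if the first four bytes of the block match the JBD magic. */
static int needs_escape(const unsigned char *block)
{
    uint32_t first = ((uint32_t)block[0] << 24) |
                     ((uint32_t)block[1] << 16) |
                     ((uint32_t)block[2] << 8)  |
                      (uint32_t)block[3];
    return first == JBD_MAGIC;
}

/*
 * Hypothetical helper: clone the block and escape the copy by zeroing
 * its first four bytes, leaving the original block untouched.
 * Only blocksize bytes are allocated, not a whole page.
 */
static unsigned char *clone_escaped(const unsigned char *block,
                                    size_t blocksize)
{
    unsigned char *copy;

    assert(blocksize >= 4);
    copy = malloc(blocksize);
    if (!copy)
        return NULL;
    memcpy(copy, block, blocksize);
    memset(copy, 0, 4);
    return copy;
}

/* Bytes left unused when one full page backs a single fs block. */
static size_t wasted_bytes(size_t page_size, size_t block_size)
{
    return page_size > block_size ? page_size - block_size : 0;
}
```

With these definitions, wasted_bytes(16384, 4096) is 12288 and
wasted_bytes(4096, 1024) is 3072, i.e. three quarters of each page
goes unused in the two cases mentioned above.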

That's because apparently the iSCSI and DMA blocks assume that they
have Real Pages (tm) passed to block I/O requests, and apparently XFS
ran into problems when sending vmalloc'ed pages. I don't know if this
is a problem if we pass the bio layer addresses coming from the SLAB
allocator, but oral tradition seems to indicate this is problematic,
although no one has given me the full chapter and verse explanation
about why this is so.

Now that I see Linus's complaint, I'm wondering if the issue is really
about kernel virtual addresses (i.e., coming from vmalloc), and not a
requirement for Real Pages (i.e., coming from the SLAB allocator as
opposed to get_free_page). And can this be documented someplace? I
tried looking at the bio documentation, and couldn't find anything
definitive on the subject.

Thanks,

- Ted