Re: [git patches] xfs and block fixes for virtually indexed arches

From: FUJITA Tomonori
Date: Mon Dec 21 2009 - 03:53:55 EST


On Fri, 18 Dec 2009 09:17:32 -0500
tytso@xxxxxxx wrote:

> On Fri, Dec 18, 2009 at 09:21:30AM +0900, FUJITA Tomonori wrote:
> >
> > iSCSI initiator driver should work with kmalloc'ed memory.
> >
> > The reason why iSCSI didn't work with kmalloc'ed memory is that it
> > uses sendpage (which needs refcountable pages). We added a workaround
> > to avoid sendpage with kmalloc'ed memory (it would be great if we
> > could remove the workaround, though).
>
> Well, the patch that I plan to push, once we have general agreement
> that it is a block device driver BUG not to accept kmalloc'ed/slab
> allocated memory, is one where ext4 will use kmalloc'ed/slab
> allocated memory on occasion: when it needs to make a shadow copy of
> buffers for journalling purposes AND when the fs block size is
> smaller than the page size (i.e., no more allocating a 16k page when
> the fs block size is 4k). So this won't happen all the time; even in
> the case of a 16k Itanium system with 4k blocks, the bulk of the data
> won't be sent via kmalloc'ed memory --- just some critical metadata
> blocks and some data blocks that need to be escaped when being
> written into the journal.

Actually, ext3 (jbd) used to send kmalloc'ed buffers to the block
layer for frozen data. xfs also used kmalloc'ed buffers. Neither does
now (so, as you said above, jbd wastes some memory when the block size
is not equal to the page size, I think).


> I do think we need to document that block device drivers are
> _expected_ to be able to handle kmalloc'ed memory,

Agreed.

Note that network block drivers (iSCSI, drbd, something else?) don't
care about page ref-counting themselves. They want to use sendpage,
and the network layer (sendpage) can't handle non-ref-counted pages.
The best solution for fs and network block drivers might be modifying
sendpage to handle such pages.
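
To make the workaround concrete, here is a rough userspace model of
the decision a network block driver has to make today. It is only a
sketch, not the actual iSCSI code: struct model_page, enum send_path
and choose_send_path() are hypothetical names invented for
illustration, and the "slab" flag merely stands in for whatever test
the driver uses to recognize kmalloc'ed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel page descriptor. */
struct model_page {
    bool slab; /* true if the buffer came from kmalloc/slab */
};

/* The two ways a driver could hand data to the network layer. */
enum send_path {
    SEND_ZERO_COPY, /* models sendpage: the stack takes a page reference */
    SEND_COPY       /* models a copying send: data is copied out first */
};

/*
 * Model of the workaround: slab-backed memory is not refcountable
 * per page, so it must not go down the zero-copy path.
 */
static enum send_path choose_send_path(const struct model_page *page)
{
    assert(page != NULL);
    if (page->slab)
        return SEND_COPY;
    return SEND_ZERO_COPY;
}
```

In this model a page with slab set takes the copying path and any
other page takes the zero-copy path. If sendpage itself handled such
pages, the branch would go away and every caller could use one path.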