Re: [00/41] Large Blocksize Support V7 (adds memmap support)

From: Nick Piggin
Date: Tue Sep 18 2007 - 13:30:05 EST


On Tuesday 18 September 2007 08:05, Christoph Lameter wrote:
> On Sun, 16 Sep 2007, Nick Piggin wrote:
> > > > fsblock doesn't need any of those hacks, of course.
> > >
> > > Nor does mine for the low orders that we are considering. For order >
> > > MAX_ORDER this is unavoidable since the page allocator cannot manage
> > > such large pages. It can be used for lower order if there are issues
> > > (that I have not seen yet).
> >
> > Or we can just avoid all doubt (and doesn't have arbitrary limitations
> > according to what you think might be reasonable or how well the
> > system actually behaves).
>
> We can avoid all doubt in this patchset as well by adding support for
> fallback to a vmalloced compound page.

How would you do a vmapped fallback in your patchset? How would
you keep track of pages 2..N if they don't exist in the radix tree?
What if they don't even exist in the kernel's linear mapping? It seems
you would also require more special casing in the fault path and special
casing in the block layer to do this.

It's not a trivial problem you can just brush away by handwaving. Let's
see... you could add another field in struct page to store the vmap
virtual address, and set a new flag to indicate indicate that constituent
page N can be found via vmalloc_to_page(page->vaddr + N*PAGE_SIZE).
Then add more special casing to the block layer and fault path etc. to handle
these new non-contiguous compound pages. I guess you must have thought
about it much harder than the 2 minutes I just did then, so you must have a
much nicer solution...

But even so, you're still trying very hard to avoid touching the filesystems
or buffer layer while advocating instead to squeeze the complexity out into
the vm and block layer. I don't agree that is the right thing to do. Sure it
is _easier_, because we know the VM.

I don't argue that fsblock large block support is trivial. But you are first
asserting that it is too complicated and then trying to address one of the
issues it solves by introducing complexity elsewhere.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/