Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

From: Andy Isaacson
Date: Wed May 04 2005 - 20:58:27 EST


On Wed, May 04, 2005 at 09:27:21PM -0400, Rik van Riel wrote:
> On Wed, 4 May 2005, William Jordan wrote:
> > On 5/3/05, Andy Isaacson <adi@xxxxxxxxxxxxx> wrote:
> > > Rather than replacing the fully-registered pages with pages of zeros,
> > > you could simply unmap them.
> >
> > I don't like this option. It is nearly free to map all of the pages to
> > the zero-page. You never have to allocate a page if the user never
> > writes to it.
>
> Unmapping should work fine, as long as the VMA flags are
> set appropriately. The page fault handler can take care
> of the rest...

I think there may be a difference in terminology here. What I
originally proposed (and what I think Bill was reacting to) is the
equivalent of sys_munmap() on the range of registered pages. That has
the downsides that he mentioned: an address that was valid in the parent
will now result in SIGSEGV or SIGBUS in the child, and the userland APIs
(such as MPI2) explicitly allow registering stack addresses (for
example).

What I think you're proposing, Rik, is that the VMA gets destroyed (or split,
if only part of it had been registered) and replaced with an anonymous
one. That's a very low-overhead way of going about it, I think. Then
as you say, the page fault handler will automatically give a zero page
to the process when it faults on those addresses.

Did I understand your suggestion correctly? I think I agree with
Bill that having the child take a fatal fault on pages which happened to
have been registered by the parent would be a bad thing.

This would, if I understand correctly, be visible in /proc/$$/maps.
Which is OK, if a little bit surprising; but the alternatives are worse.

-andy