Re: [PATCH] Alternate futex non-page-pinning and COW fix

From: Jamie Lokier
Date: Sun Sep 07 2003 - 08:22:21 EST


Rusty Russell wrote:
> In message <20030905205418.GA6019@xxxxxxxxxxxxxxxxxx> you write:
> > Rusty Russell wrote:
> > > Now, if mremap doesn't move the memory, futexes aren't broken, even
> > > without your patch, right? If it does move, you've got a futex
> > > sitting in invalid memory, no surprise if it doesn't work.
> >
> > If the mremap doesn't move the memory it's fine. No surprise :)
> >
> > If it's moved, then the program isn't broken - it knows it just did an
> > mremap, and it sends the wakeup to the new address.
> >
> > This makes sense if async futexes are used on an in-memory private
> > database. But such programs can just use MAP_ANON|MAP_SHARED if they
> > want mremap to work.
>
> The only real case (ignoring the "one thread FUTEX_WAIT while the
> other mremaps underneath" which is gonna break anyway), is FUTEX_FD,

By "async futex" I mean FUTEX_FD; sorry if that wasn't clear.
By "sync futex" I mean FUTEX_WAIT.

> I don't see a problem with having to manually move your futex fds in
> this case when the memory underneath them has been remapped. In
> fact, it'd be surprising if you didn't have to.

I don't see a problem, as long as it is documented. Anybody grokking
the old futex code would expect futexes to move with mappings.

It's a mild surprise whatever behaviour, because:

- if it's a MAP_SHARED, then the program doesn't have to move futexes,
and this is a good thing (think locks in a remap_file_pages
database mapping).

- the old (page pinning) version moves FUTEX_FD futexes "automatically",
in the sense that they're attached to the page which moves.

> Yes. Invalidate is nice because it catches a programmer mistake. But
> why not solve the problem by just holding an mm reference, too?

That would work. An mm isn't that huge once everything's been
unmapped by exit. Alternatively, mm-private futexes can be woken when
the mm is destroyed.

I just implemented the latter, but come to think of it a reference to
a dead mm is light enough not to bother with a list of "futexes
attached to mm to destroy on exit".

So I'll throw that away and provide a patch which just takes a reference.
(Also, takes inode references).

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/