Re: msync() behaviour broken for MS_ASYNC, revert patch?

From: Linus Torvalds
Date: Fri Feb 10 2006 - 16:08:58 EST




On Sat, 11 Feb 2006, Nick Piggin wrote:
>
> The way I see it, it stems from simply a different expectation of
> MS_ASYNC semantics, rather than exactly what the app is doing.
>
> If there are no data integrity requirements, then the writing should
> be left up to the VM. If there are, then there will be a MS_SYNC,
> which *will* move those hundred megs to the IO layer so there is no
> reason for MS_ASYNC *not* to get it started earlier (and it will
> be more efficient if it does).

Yes, largely.
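
For illustration, the pattern Nick describes might look roughly like the
sketch below: dirty a MAP_SHARED mapping, hint with MS_ASYNC, and later
force the data out with MS_SYNC. The helper name write_through_mapping()
and its arguments are made up for this example. Whether the MS_ASYNC call
actually starts any I/O is exactly the point under discussion, so the code
assumes nothing about that.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): copy `msg` into a shared
 * mapping of `path`, hint at writeback with MS_ASYNC, then wait for it
 * with MS_SYNC. Returns 0 on success, -1 on any failure.
 */
static int write_through_mapping(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);

    size_t len = strlen(msg);
    if (len == 0)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* The file must be at least as long as the region we map. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, msg, len);              /* dirty the mapped pages */

    int rc = 0;

    /* Non-blocking hint; may or may not start I/O on a given kernel. */
    if (msync(map, len, MS_ASYNC) != 0)
        rc = -1;

    /* ... the application would keep working here ... */

    /* Integrity point: returns only after the data reached the IO layer. */
    if (msync(map, len, MS_SYNC) != 0)
        rc = -1;

    munmap(map, len);
    close(fd);
    return rc;
}
```

A caller would use it as write_through_mapping("/tmp/example.bin", "data")
and then read the file back through read() to check the contents.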

> The semantics your app wants, in my interpretation, are provided
> by MS_INVALIDATE. Which kind of says "bring mmap data into coherence
> with system cache", which would presumably transfer dirty bits if
> modified (though as an implementation detail, we are never actually
> incoherent as far as the data goes, only dirty bits).

The historical meaning of MS_INVALIDATE, as far as I can tell, is that it
really _forgets_ the old mmap'ed contents in a non-coherent system.

Quoting from a UNIX man-page (as found by Google):

...

    If flags is MS_INVALIDATE, the function synchronizes the
    contents of the memory region to match the current file
    contents.

    o  All writes to the mapped portion of the file made
       prior to the call are visible by subsequent read
       references to the mapped memory region.

    o  All write references prior to the call, by any process,
       to memory regions mapped to the same portion of the
       file using MAP_SHARED, are visible by read references
       to the region.

...

Now, it's confusing, but I read that as meaning that the mmap'ed region is
literally thrown away, and that anybody who has done a "write()" call will
have their recently written data show up. That's also what the naming
("invalidate") suggests.

In a non-coherent system (and remember, that's what old UNIX was, when
MS_INVALIDATE came to be), you -cannot- reasonably synchronize your caches
any other way than by throwing away your own cached copy.
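
As a rough sketch of that reading: map a file, change it behind the
mapping's back with pwrite(), then call msync() with MS_INVALIDATE and look
at the mapping again. The helper name write_visible_after_invalidate() is
invented for this example, and MS_SYNC is passed alongside MS_INVALIDATE
only because POSIX wants one of MS_SYNC/MS_ASYNC to be given. On a system
with a coherent page cache (Linux included) the new data should be visible
with or without the msync() call; on the old non-coherent systems the call
is what would throw the stale copy away.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): returns 1 if data written
 * with pwrite() is visible through an existing read-only shared mapping
 * after msync(MS_INVALIDATE), 0 if it is not, and -1 on error.
 * The file at `path` is created and removed again.
 */
static int write_visible_after_invalidate(const char *path)
{
    const char before[] = "old!";
    const char after[]  = "new!";
    const size_t len = sizeof before - 1;
    char *map = MAP_FAILED;
    int result = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, before, len) != (ssize_t)len)
        goto out_close;

    map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    /* Touch the mapping so a non-coherent system would hold a cached copy. */
    volatile char first = map[0];
    (void)first;

    /* Modify the file through the write() path, not through the mapping. */
    if (pwrite(fd, after, len, 0) != (ssize_t)len)
        goto out_unmap;

    /* "Throw away" whatever stale copy the mapping might be holding. */
    if (msync(map, len, MS_SYNC | MS_INVALIDATE) != 0)
        goto out_unmap;

    result = (memcmp(map, after, len) == 0) ? 1 : 0;

out_unmap:
    munmap(map, len);
out_close:
    close(fd);
    unlink(path);
    return result;
}
```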

(Think non-coherent CPU caches in the old non-coherent NUMA machines that
happily nobody makes any more - same exact deal. The cache ops are either
"writeback" or "throw away" or a combination of the two.)

So I don't think MS_INVALIDATE has ever really meant what you say it
means: it certainly hasn't meant it in Linux, and it cannot really have
meant it in old UNIX either, because the kind of two-way coherency op that
you imply simply wasn't _possible_ in original UNIX.

Now, the "msync(0)" case _could_ very sanely mean "just synchronize with
the page cache".

Linus