Re: [PATCH] [RFC] fadvise: Add _VOLATILE, _ISVOLATILE, and _NONVOLATILE flags

From: Rik van Riel
Date: Tue Nov 22 2011 - 05:46:04 EST


On 11/22/2011 04:37 AM, Rik van Riel wrote:
On 11/21/2011 10:33 PM, John Stultz wrote:
This patch provides new fadvise flags that can be used to mark
file pages as volatile, which will allow them to be discarded if the
kernel wants to reclaim memory.

This is useful for userspace to allocate things like caches, and lets
the kernel destructively (but safely) reclaim them when there's memory
pressure.

Right now, we can simply throw away pages if they are clean (backed
by a current on-disk copy). That only happens for anonymous/tmpfs/shmfs
pages when they're swapped out. This patch lets userspace select
dirty pages which can be simply thrown away instead of writing them
to disk first. See mm/shmem.c for this bit of code. It's
different from FADV_DONTNEED since the pages are not immediately
discarded; they are only discarded under pressure.
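
For contrast with the proposed flags, here is a minimal Python sketch of
the existing POSIX_FADV_DONTNEED hint applied to clean pages. The helper
name drop_clean_pages and the temporary file are illustrative
assumptions, and the proposed _VOLATILE flags are not used because they
exist only in this RFC patch. The hint is advisory, so the sketch does
not show whether any page was actually dropped; it only shows that the
data survives either way, because a current on-disk copy exists.

```python
import os
import tempfile


def drop_clean_pages(payload: bytes) -> bytes:
    """Write payload, make it clean, hint DONTNEED, then read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        # fsync writes the dirty pages back, so the cached copy is clean.
        os.fsync(fd)
        # Advisory only: offset 0 with length 0 covers the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        # Any page the kernel dropped is re-read from the on-disk copy.
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))
    finally:
        os.close(fd)
        os.unlink(path)
```

With a volatile range the situation would differ: a discarded dirty page
has no on-disk copy to fall back on, which is why the flags have to be
requested explicitly by userspace.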

I've got a few questions:

1) How do you tell userspace some of its data got
discarded?

2) How do you prevent the situation where every
volatile object gets a few pages discarded, making
them all unusable?
(better to throw away an entire object at once)

3) Isn't it too slow for something like Firefox to
create a new tmpfs object for every single throw-away
cache object?

Oh, and a fourth issue with the _VOLATILE approach, which
I forgot to write down before:

4) Virtualization. Marking an object (and its pages)
_VOLATILE inside a guest will not be visible on the
host side, which means a virtual system may continue
to suffer the performance penalty anyway.

On the other hand, the approach I outlined will simply result
in a virtual machine being asked to reduce its memory, and
possibly later on passing that notification on to the programs
running inside. In other words, the "please shrink your caches"
notification naturally recurses into cgroups and virtual machines.
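
A toy model of that recursion, as a minimal Python sketch. The Node
class, its fields, and the page counts are all hypothetical and do not
correspond to any kernel interface; the point is only that a shrink
request which a container cannot satisfy by itself gets passed down to
whatever runs inside it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical container: a host, a cgroup, a VM, or a program."""
    name: str
    cache_pages: int = 0  # discardable cache held directly by this node
    children: List["Node"] = field(default_factory=list)

    def shrink(self, pages: int) -> int:
        """Try to free `pages`; return how many were actually freed."""
        freed = min(self.cache_pages, pages)
        self.cache_pages -= freed
        # Pass whatever is still needed on to the nested containers.
        for child in self.children:
            if freed >= pages:
                break
            freed += child.shrink(pages - freed)
        return freed


# A program inside a guest inside a host, with made-up cache sizes.
firefox = Node("firefox", cache_pages=300)
guest = Node("guest-vm", cache_pages=50, children=[firefox])
host = Node("host", children=[guest])

freed = host.shrink(200)
```

Here the host holds no cache of its own, so the request reaches the
guest, which gives up its 50 pages and asks the program inside it for
the remaining 150.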