Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded

From: Andrew Morton
Date: Wed Mar 11 2009 - 20:12:09 EST


On Wed, 11 Mar 2009 15:30:35 +0000
David Howells <dhowells@xxxxxxxxxx> wrote:

> From: Enrik Berkhan <Enrik.Berkhan@xxxxxx>
>
> The pages attached to a ramfs inode's pagecache by truncation from nothing - as
> done by SYSV SHM for example - may get discarded under memory pressure.
>
> The problem is that the pages are not marked dirty. Anything that creates data
> in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> pages holding that data.
>
> For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> won't be called by page-writing faults on writable mmaps, and it isn't called
> by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> to allocate a contiguous run.
>
> The solution is to mark the pages dirty at the point of allocation by
> the truncation code.
>
> Signed-off-by: Enrik Berkhan <Enrik.Berkhan@xxxxxx>
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> ---
>
> fs/ramfs/file-nommu.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
>
> diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
> index b9b567a..90d72be 100644
> --- a/fs/ramfs/file-nommu.c
> +++ b/fs/ramfs/file-nommu.c
> @@ -114,6 +114,9 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
>  		if (!pagevec_add(&lru_pvec, page))
>  			__pagevec_lru_add_file(&lru_pvec);
>
> +		/* prevent the page from being discarded on memory pressure */
> +		SetPageDirty(page);
> +
>  		unlock_page(page);
>  	}

Was there a specific reason for using the low-level SetPageDirty()?

On the write() path, ramfs pages will be dirtied by
simple_commit_write()'s set_page_dirty(), which calls
__set_page_dirty_no_writeback().

It just so happens that __set_page_dirty_no_writeback() is equivalent
to a simple SetPageDirty() - it bypasses all the extra things which we
do for normal permanent-storage-backed pages.

But I'd have thought that it would be cleaner and more maintainable (albeit
a bit slower) to go through the a_ops?
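
To make the comparison concrete, here is a rough userspace model of the two
paths. It is a sketch only, not kernel code: every model_* name and the
cut-down struct definitions are hypothetical stand-ins, and the dispatch
logic is simplified from what set_page_dirty() really does. It is meant to
show only that, for a mapping whose a_ops point at a no-writeback handler,
both paths leave the page in the same state.

```c
/* Userspace model only. These types are simplified stand-ins, NOT the
 * kernel's real definitions; all model_* names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

struct page;

struct address_space_operations {
	int (*set_page_dirty)(struct page *page);
};

struct address_space {
	const struct address_space_operations *a_ops;
};

struct page {
	bool dirty;			/* stands in for the PG_dirty flag */
	struct address_space *mapping;	/* owning pagecache mapping, or NULL */
};

/* Low-level path: set the flag directly, as the patch does. */
static void model_SetPageDirty(struct page *page)
{
	page->dirty = true;
}

/* Modelled on __set_page_dirty_no_writeback(): only the flag is touched,
 * with no accounting or radix-tree tagging. */
static int model_set_page_dirty_no_writeback(struct page *page)
{
	if (!page->dirty)
		model_SetPageDirty(page);
	return 0;
}

/* Modelled on set_page_dirty(): dispatch through the mapping's a_ops
 * when there is a handler, otherwise just set the flag. */
static int model_set_page_dirty(struct page *page)
{
	struct address_space *mapping = page->mapping;

	if (mapping && mapping->a_ops && mapping->a_ops->set_page_dirty)
		return mapping->a_ops->set_page_dirty(page);

	if (!page->dirty) {
		page->dirty = true;
		return 1;	/* page was newly dirtied */
	}
	return 0;
}

/* A ramfs-like a_ops table for the model. */
static const struct address_space_operations model_ramfs_aops = {
	.set_page_dirty = model_set_page_dirty_no_writeback,
};
```

In the patch itself, going through the a_ops would presumably just mean
calling set_page_dirty(page) where SetPageDirty(page) is used now. The
end state is the same today, but the a_ops call would keep tracking the
handler if ramfs ever switched to a different one.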
