Re: [PATCH 01/19] mm: Make default vm_ops provide ->page_mkwritehandler

From: Jan Kara
Date: Wed Mar 28 2012 - 05:49:07 EST


On Wed 28-03-12 09:45:53, Dave Chinner wrote:
> On Wed, Mar 28, 2012 at 12:08:19AM +0200, Jan Kara wrote:
> > On Tue 27-03-12 14:38:15, Andrew Morton wrote:
> > > On Tue, 27 Mar 2012 09:55:27 +0200
> > > Jan Kara <jack@xxxxxxx> wrote:
> > > > On Fri 23-03-12 15:45:02, Andrew Morton wrote:
> > > > > On Mon, 5 Mar 2012 17:00:59 +0100
> > > > > Jan Kara <jack@xxxxxxx> wrote:
> > > > >
> > > > > > --- a/mm/filemap.c
> > > > > > +++ b/mm/filemap.c
> > > > > > @@ -1759,8 +1759,28 @@ page_not_uptodate:
> > > > > > }
> > > > > > EXPORT_SYMBOL(filemap_fault);
> > > > > >
> > > > > > +int filemap_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > > > > +{
> > > > > > + struct page *page = vmf->page;
> > > > > > + struct inode *inode = vma->vm_file->f_path.dentry->d_inode;
> > > > > > + int ret = VM_FAULT_LOCKED;
> > > > > > +
> > > > > > + file_update_time(vma->vm_file);
> > > > > > + lock_page(page);
> > > > > > + if ((page->mapping != inode->i_mapping) ||
> > > > > > + (page_offset(page) > i_size_read(inode))) {
> > > > >
> > > > > Would benefit from a comment explaining how the page can come to be
> > > > > outside i_size, and why we fail in that case.
> > >
> > > This?
> > >
> > > > > I don't think i_mutex is held here, so this test is rather meaningless
> > > > > and racy anyway?
> > > > i_size test is racy if that's what you mean by "this test". Just I did
> > > > the test this way because it's like this in other places and I figured
> > > > truncate_pagecache() can take relatively long time so the test has some
> > > > effect. But if you think it's not worth it, I can remove it.
> > >
> > > It bugs me when we copy-n-paste code without remembering why we had it
> > > there in the first place :( iirc, mmapped pages outside i_size can and
> > > do happen in some race situations, and are benign.
> > Yeah. Certainly there can be pages beyond i_size because we first set
> > file size and then go and remove pages beyond new i_size one by one when we
> > do truncate. We must be careful not to create any new pages beyond i_size
> > but that's what filemap_fault() takes care of. So I think i_size check in
> > ->page_mkwrite() isn't strictly needed.
>
> Actually, I think it is. In __do_fault(), we drop the page lock
> between the .fault call and the .page_mkwrite() call, so the size
> checks in .fault for the given page being faulted are no longer
> valid as truncate serialises only on the page lock. Hence we have to
> repeat the truncate race checks again in .page_mkwrite() after we
> relock the page.
I was thinking about this scenario as well but I think doesn't cause any
problems. But maybe there's some noise in what the condition actually is.
So I currently have:
if (page->mapping != inode->i_mapping)
bail..
do stuff.
And I think that is enough. Because only filemap_fault() creates new page,
->page_mkwrite() only uses the page reference from vmf->page. So the only
thing that can happen after __do_fault() drops page lock is that
truncate_pagecache() will go and truncate the page (thus page->mapping will
be NULL) and the condition I currently have should be enough to catch that.

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/