Re: [PATCH] mm, fs: avoid page allocation beyond i_size on read

From: Steven Whitehouse
Date: Wed Aug 21 2013 - 12:44:15 EST


Hi,

On Wed, 2013-08-21 at 19:08 +0300, Kirill A. Shutemov wrote:
> Steven Whitehouse wrote:
> > Hi,
> >
> > On Wed, 2013-08-21 at 18:37 +0300, Kirill A. Shutemov wrote:
> > > I've noticed that we allocated unneeded page for cache on read beyond
> > > i_size. Simple test case (I checked it on ramfs):
> > >
> > > $ touch testfile
> > > $ cat testfile
> > >
> > > It triggers 'no_cached_page' code path in do_generic_file_read().
> > >
> > > Looks like it's regression since commit a32ea1e. Let's fix it.
> > >
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > > Acked-by: NeilBrown <neilb@xxxxxxx>
> > > ---
> > > mm/filemap.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/mm/filemap.c b/mm/filemap.c
> > > index 1905f0e..b1a4d35 100644
> > > --- a/mm/filemap.c
> > > +++ b/mm/filemap.c
> > > @@ -1163,6 +1163,10 @@ static void do_generic_file_read(struct file *filp, loff_t *ppos,
> > > loff_t isize;
> > > unsigned long nr, ret;
> > >
> > > + isize = i_size_read(inode);
> > > + if (!isize || index > (isize - 1) >> PAGE_CACHE_SHIFT)
> > > + goto out;
> > > +
> > > cond_resched();
> > > find_page:
> > > page = find_get_page(mapping, index);
> >
> > Please don't do that... there is no reason to think that i_size will be
> > correct at that moment. Why not just get readpage(s) to return the
> > correct return code in that case?
>
> I work on THP for page cache. Allocation and clearing a huge page for
> nothing is pretty expensive.
>
Hmm, thats true, but I wonder how often it is likely to happen...

> I don't think the change is harmful. The worst case scenario is race with
> write or truncate, but it's valid to return EOF in this case.
>
> What scenario do you have in mind?
>

1. File open on node A
2. Someone updates it on node B by extending the file
3. Someone reads the file on node A beyond end of original file size,
but within end of new file size as updated by node B. Without the patch
this works, with it, it will fail. The reason being the i_size would not
be up to date until after readpage(s) has been called.

I think this is likely to be an issue for any distributed fs using
do_generic_file_read(), although it would certainly affect GFS2, since
the locking is done at page cache level,

Steve.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/