Re: invalidate_inode_pages in 2.5.32/3

From: Daniel Phillips (phillips@arcor.de)
Date: Mon Sep 09 2002 - 16:44:08 EST


> I'm very unkeen about using the inaccurate invalidate_inode_pages
> for anything which matters, really. And the consistency of pagecache
> data matters.
>
> NFS should be using something stronger. And that's basically
> vmtruncate() without the i_size manipulation.

Yes, that looks good. Semantics are basically "and don't come back
until every damm page is gone" which is enforced by the requirement
that we hold the mapping->page_lock though one entire scan of the
truncated region. (Yes, I remember sweating this one out a year
or two ago so it doesn't eat 100% CPU on regular occasions.)

So, specifically, we want:

void invalidate_inode_pages(struct inode *inode)
{
        truncate_inode_pages(mapping, 0);
}

Is it any harder than that?

By the way, now that we're all happy with the radix tree, we might
as well just go traverse that instead of all the mapping->*_pages.
(Not that I'm seriously suggesting rocking the boat that way just
now, but it might yield some interesting de-crufting possibilities.)

> Hold i_sem,
> vmtruncate_list() for assured pagetable takedown, proper page
> locking to take the pages out of pagecache, etc.
>
> Sure, we could replace the page_count() heuristic with a
> page->pte.direct heuristic. Which would work just as well. Or
> better. Or worse. Who knows?
>
> Guys, can we sort out the NFS locking so that it is possible to
> take the correct locks to get the 100% behaviour?

Trond, will the above work?

Now, what is this invalidate_inode_pages2 seepage about? Called from
one place. Sheesh.

-- 
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Sep 15 2002 - 22:00:18 EST