Re: [patch 2.6.13-rc4] fix get_user_pages bug

From: Hugh Dickins
Date: Tue Aug 02 2005 - 12:23:35 EST


On Tue, 2 Aug 2005, Linus Torvalds wrote:
> On Tue, 2 Aug 2005, Hugh Dickins wrote:
> >
> > But have I just realized a non-s390 problem with your pte_dirty
> > technique? The ptep_set_wrprotect in fork's copy_one_pte.
> >
> > That's specifically write-protecting the pte to force COW, but leaving
> > the dirty bit: so now get_user_pages will skip COW-ing it (in all write
> > cases, not just the peculiar ptrace force one).
>
> Damn, you're right. We could obviously move the dirty bit from the page
> tables to the "struct page" in fork() (that may have other advantages:
> we're scanning the dang thing anyway, after all) to avoid that special
> case, but yes, that's nasty.

It might not be so bad. It's going to access the struct page anyway.
And clearing dirty from parent and child at fork time could save two
set_page_dirtys at exit time. But I'm not sure that we could batch the
the dirty bit clearing into one TLB flush like we do the write protection.

> In fact, that brings up another race altogether: a thread that does a
> fork() at the same time as get_user_pages() will have the exact same
> issues. Even with the old code. Simply because we test the permissions on
> the page long before we actually do the real access (ie it may be dirty
> and writable when we get it, but by the time the write happens, it might
> have become COW-shared).
>
> Now, that's probably not worth worrying about, but it's kind of
> interesting.

Not worth worrying about in this context: it's one aspect of the
InfiniBand (RDMA) issue I was referring to, to be addressed another time.

Or is it even possible? We do require the caller of get_user_pages to
down_read(&mm->mmap_sem), and fork parent has down_write(&mm->mmap_sem).

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/