Re: [PATCH] ppc64: Fix possible race with set_pte on a present PTE

From: Benjamin Herrenschmidt
Date: Tue May 25 2004 - 23:18:41 EST


On Wed, 2004-05-26 at 14:08, Linus Torvalds wrote:

> You're right. We do use it on the do_wp_page() path, and there we actually
> use a whole new page in the "break_cow()" case. That case is in fact
> fundamentally different from the other ones.
>
> So we should probably break up the "ptep_establish()" into its two pieces,
> since the callers don't actually want to do the same thing. One really
> wants to do a "clear old one, set a totally new one", and the two other
> places want to actually update just the dirty and accessed bits.

The first one could still be called "pte_establish" ... I mean, it makes
little sense to continue calling "pte_establish" something that just
does set one of those 2 bits... And the flush done by pte_establish in
this case is useless on ppc... so I'd rather kill pte_establish
completely instead and define it's arch (or generic) impl. of
ptep_set_dirty_accessed() responsibility to do the TLB flush if
necessary, no ? (well, a call to it on ppc isn't expensive if we didn't
add anything to the batch anyway...)

> In fact, the only non-generic user of "ptep_establish()" (s390) didn't
> want to use the generic version exactly because of this very conceptual
> bug. It uses "ptep_clear_flush()" for the replacement case, which actually
> makes sense.
>
> So does it work if you do this appended patch first? This is a real
> cleanup, and I think it will allow us to get rid of the s390-specific code
> in ptep_establish(). Along with hopefully fixing your problem too.
>
> After this, we should be able to have a BUG() in "set_pte()" if the entry
> wasn't clear before (assuming the arch doesn't use set_pte() for the dirty
> updates etc).

Ok, I'll give it a spin.

> Linus
>
> ---
> ===== mm/memory.c 1.177 vs edited =====
> --- 1.177/mm/memory.c Tue May 25 12:37:09 2004
> +++ edited/mm/memory.c Tue May 25 21:04:49 2004
> @@ -1004,7 +1004,10 @@
> flush_cache_page(vma, address);
> entry = maybe_mkwrite(pte_mkdirty(mk_pte(new_page, vma->vm_page_prot)),
> vma);
> - ptep_establish(vma, address, page_table, entry, 1);
> +
> + /* Get rid of the old entry, replace with new */
> + ptep_clear_flush(vma, address, page_table);
> + set_pte(page_table, entry);
> update_mmu_cache(vma, address, entry);
> }
>
--
Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/