Re: [patch 1/6] mmu_notifier: Core code

From: Peter Zijlstra
Date: Tue Feb 05 2008 - 13:18:44 EST



On Tue, 2008-02-05 at 18:05 +0000, Andy Whitcroft wrote:

> > + if (unlikely(!hlist_empty(&mm->mmu_notifier.head))) {
> > + rcu_read_lock();
> > + hlist_for_each_entry_safe_rcu(mn, n, t,
> > + &mm->mmu_notifier.head, hlist) {
> > + if (mn->ops->release)
> > + mn->ops->release(mn, mm);
>
> Does this ->release actually release the 'nm' and its associated hlist?
> I see in this thread that this ordering is deemed "use after free" which
> implies so.
>
> If it does that seems wrong. This is an RCU hlist, therefore the list
> integrity must be maintained through the next grace period in case there
> are parallell readers using the element, in particular its forward
> pointer for traversal.

That is not quite so, list elements must be preserved, not the list
order.

>
> > + hlist_del(&mn->hlist);
>
> For this to be updating the list, you must have some form of "write-side"
> exclusion as these primatives are not "parallel write safe". It would
> be helpful for this routine to state what that write side exclusion is.

Yeah, has been noticed, read on in the thread :-)

> I am not sure it makes sense to add a _safe_rcu variant. As I understand
> things an _safe variant is used where we are going to remove the current
> list element in the middle of a list walk. However the key feature of an
> RCU data structure is that it will always be in a "safe" state until any
> parallel readers have completed. For an hlist this means that the removed
> entry and its forward link must remain valid for as long as there may be
> a parallel reader traversing this list, ie. until the next grace period.
> If this link is valid for the parallel reader, then it must be valid for
> us, and if so it feels that hlist_for_each_entry_rcu should be sufficient
> to cope in the face of entries being unlinked as we traverse the list.

It does make sense, hlist_del_rcu() maintains the fwd reference, but it
does unlink it from the list proper. As long as there is a write side
exclusion around the actual removal as you noted.

rcu_read_lock();
hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) {

if (foo) {
spin_lock(write_lock);
hlist_del_rcu(tpos);
spin_unlock(write_unlock);
}
}
rcu_read_unlock();

is a safe construct in that the list itself stays a proper list, and
even items that might be caught in the to-be-deleted entries will have a
fwd way out.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/