Re: [PATCH 06/12] x86/mm: Enable and use the arch_pgd_init_late() method

From: Oleg Nesterov
Date: Sun Jun 14 2015 - 16:55:33 EST


On 06/14, Ingo Molnar wrote:
>
> So since we have a spin_lock() there already,

Yeeeees, I thought about task_lock() or pgd_lock too.

> Also, since this is x86 specific code we could rely on the fact that
> spinlock-acquire is a full memory barrier?

We do not really need the full barrier if we rely on spinlock_t;
we can rely on acquire+release semantics.

Let's forget about exec_mmap(). If we add, say,

	// or unlock_wait() + barriers
	task_lock(current->group_leader);
	task_unlock(current->group_leader);

at the start of arch_pgd_init_late() we will fix the problems with
fork() even if pgd_none() below can leak into the critical section.

We rely on the fact that find_lock_task_mm() does lock/unlock too
and always starts with the group leader.

If sync_global_pgds() takes this lock first, we must see the change
in *PGD after task_unlock(). Actually right after task_lock().

Otherwise, sync_global_pgds() should see the result of list addition
if it takes this (the same) ->group_leader->alloc_lock after us.
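
To spell out the two cases above, here is a rough userspace model of
the argument (C11 atomics plus a pthread mutex, not kernel code). All
names in it are made up for illustration: leader_lock stands in for
task_lock() on the group leader, on_list for the list addition and
pgd_changed for the change in *PGD. It assumes the sync side changes
the PGD before it takes the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, none of these are kernel symbols. */
static pthread_mutex_t leader_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int on_list;	/* fork side: new pgd added to the list */
static atomic_int pgd_changed;	/* sync side: *PGD updated */
static int fork_saw_pgd, sync_saw_list;

/* Models arch_pgd_init_late(): publish, empty lock/unlock, then check. */
static void *fork_side(void *unused)
{
	(void)unused;
	atomic_store_explicit(&on_list, 1, memory_order_relaxed);
	pthread_mutex_lock(&leader_lock);
	pthread_mutex_unlock(&leader_lock);
	/* This load may move up into the critical section, not above the lock. */
	fork_saw_pgd = atomic_load_explicit(&pgd_changed, memory_order_relaxed);
	return NULL;
}

/* Models sync_global_pgds(): change the PGD, then check the list under the lock. */
static void *sync_side(void *unused)
{
	(void)unused;
	atomic_store_explicit(&pgd_changed, 1, memory_order_relaxed);
	pthread_mutex_lock(&leader_lock);
	sync_saw_list = atomic_load_explicit(&on_list, memory_order_relaxed);
	pthread_mutex_unlock(&leader_lock);
	return NULL;
}

/* Run both sides once; returns nonzero if at least one side saw the other. */
static int run_once(void)
{
	pthread_t f, s;
	int rc;

	atomic_store(&on_list, 0);
	atomic_store(&pgd_changed, 0);
	fork_saw_pgd = sync_saw_list = 0;

	rc = pthread_create(&f, NULL, fork_side, NULL);
	assert(rc == 0);
	rc = pthread_create(&s, NULL, sync_side, NULL);
	assert(rc == 0);
	pthread_join(f, NULL);
	pthread_join(s, NULL);

	return fork_saw_pgd || sync_saw_list;
}
```

Whichever side takes the lock second is ordered after the other side's
unlock, so run_once() should never return 0. Only acquire+release
ordering from the lock is used; there is no full barrier anywhere.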

But this is not nice, and exec_mmap() calls arch_pgd_init_late() under
task_lock().


So, unless you are going to remove pgd_lock altogether, perhaps we can
rely on it the same way:

	mb();
	spin_unlock_wait(&pgd_lock);
	rmb();


This avoids the barriers (and comments) on the other side, but I can't
say I really like this...
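
For illustration only, a userspace sketch of that variant with C11
atomics. The toy lock and every name below are hypothetical, and the
acquire fence is only a rough stand-in for rmb(). The model also
assumes that taking the lock acts as a full barrier, as in the x86
remark quoted above; that is an assumption of the sketch, not a claim
about what the kernel guarantees.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock, a hypothetical stand-in for pgd_lock. */
static atomic_bool toy_pgd_lock;
static atomic_int on_list;	/* new pgd added to the list */
static atomic_int pgd_changed;	/* *PGD updated */

static void toy_lock(void)
{
	while (atomic_exchange_explicit(&toy_pgd_lock, true, memory_order_acquire))
		;
	/* Assumption: lock acquire acts as a full barrier, as on x86. */
	atomic_thread_fence(memory_order_seq_cst);
}

static void toy_unlock(void)
{
	atomic_store_explicit(&toy_pgd_lock, false, memory_order_release);
}

/* Models spin_unlock_wait(): spin until the lock is free, never take it. */
static void toy_unlock_wait(void)
{
	while (atomic_load_explicit(&toy_pgd_lock, memory_order_relaxed))
		;
}

/* Fork side: returns what it observed in pgd_changed. */
static int publish_then_check(void)
{
	atomic_store_explicit(&on_list, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* mb() */
	toy_unlock_wait();
	atomic_thread_fence(memory_order_acquire);	/* roughly rmb() */
	return atomic_load_explicit(&pgd_changed, memory_order_relaxed);
}

/* Sync side: returns what it observed in on_list; no extra barriers here. */
static int update_then_walk(void)
{
	int seen;

	atomic_store_explicit(&pgd_changed, 1, memory_order_relaxed);
	toy_lock();
	seen = atomic_load_explicit(&on_list, memory_order_relaxed);
	toy_unlock();
	return seen;
}

/* Clear both flags between runs. */
static void reset_model(void)
{
	atomic_store(&on_list, 0);
	atomic_store(&pgd_changed, 0);
}
```

publish_then_check() and update_then_walk() are meant to run on
different threads. Called back to back in either order, the second
caller should always see the first one's store.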


So I won't argue with 2 mb's on both sides.

Oleg.
