Re: [PATCH v2] introduce atomic_pointer to fix a race condition in cancelable mcs spinlocks

From: Mikulas Patocka
Date: Mon Jun 02 2014 - 17:13:23 EST




On Mon, 2 Jun 2014, Linus Torvalds wrote:

> On Mon, Jun 2, 2014 at 1:46 PM, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> >
> > And what else do you want to do?
> >
> > Peter Zijlstra said "I've been using xchg() and cmpxchg() without such
> > consideration for quite a while." - so it basically implies that the
> > kernel is full of such races, mcs_spinlock is just the most visible one
> > that crashes the kernel first.
>
> .. so your whole argument is bogus, because it doesn't actually fix
> anything else.
>
> Now, something that *would* fix something else is (for example) to
> just make "ACCESS_ONCE()" a rvalue so that you cannot use it for
> assignments, and then trying to sort out what happens then. It's
> possible that the "atomic_pointer_t" would be a part of the solution
> to that "what happens then", but THERE IS NO WAY IN HELL we're adding
> it for just one architecture and one use that doesn't warrant even
> _existing_ on that architecture.

The patch adds atomic_pointer_t for all architectures - it is in the
common code and it is backed by atomic_long_t (that already exists for all
architectures). There is no new arch-specific code at all.

When we have atomic_pointer_t, we can find the instances of xchg() and
cmpxchg() and convert them to atomic_pointer_t (or to other atomic*_t
types).

When we convert them all, we can drop xchg() and cmpxchg() at all (at
least from architecture-neutral code).

The problem with xchg() and cmpxchg() is that they are very easy to
misuse. Peter Zijlstra didn't know that they are not atomic w.r.t. normal
stores, a lot of other people don't know it too - and if we allow these
functions to be used, this race condition will reappear in the future
again and again.

That's why I'm proposing atomic_pointer_t - it guarantees that this race
condition can't be made.

> See what I'm saying?
>
> You're not fixing the problem, you're fixing one unimportant detail
> that isn't worth fixing that way.
>
> Linus

Regarding reworking ACCESS_ONCE() for reads and writes - the problem is -
how do you make people use it? ACCESS_ONCE() is already missing at a lot
of places (it doesn't cause any visible bug on the condition that the
compiler doesn't split the load or store to multiple accesses), I can
assume that people will omit ATOMIC_ONCE_STORE() too even if we make it.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/