Re: sched: softlockups in multi_cpu_stop

From: Linus Torvalds
Date: Fri Mar 06 2015 - 14:33:03 EST


On Fri, Mar 6, 2015 at 11:20 AM, Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
>
> I obviously agree with all those points, however fyi most of the testing
> on rwsems I do includes scaling address space ops stressing the
> mmap_sem, which is a real world concern. So while it does include
> microbenchmarks, it is not guided by them.

So I agree that mmap_sem is problematic.

We probably still end up holding it over many actual IO operations,
for example. The whole "FAULT_RETRY" thing should have helped a lot,
in that hopefully at least a fair amount of the time we now end up
waiting for the IO without holding the semaphore, but I bet many other
cases remain.

And I also suspect that we could try to be even more aggressive, and
allow some entirely unlocked cases. For example, long long ago we used
to have a completely SMP-unsafe model where we would do things
optimistically - doing IO without holding any locks, and then before
we "committed" to it, we'd re-try. And I wonder if we might want to
re-introduce that for the cases where we hit in caches and could use
RCU.

IOW, I wonder if we could special-case the common non-IO
fault-handling path something along the lines of:

 - look up the vma in the vma lookup cache
 - look up the page in the page cache
 - get the page table spinlock
 - re-check the vma now (it ends up being stable if it can't be torn
   down due to the page table spinlock)

because I suspect that page faults are the biggest users of that
mmap_sem, and we could probably handle a fairly large common case
(making it simpler by special-casing it and punting on any even
_slightly_ complicated situation) without even getting the semaphore
at all, since we have to serialize on the actual page table *anyway*.
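
The steps listed above can be sketched in userspace C. This is only a model
under stated assumptions, not a kernel patch: all the toy_* names are
hypothetical, a mutex stands in for the page table spinlock, the
generation counter used for the re-check is an assumption of this sketch
(the mail does not say how the re-check would be done), and the caller
keeps the vma memory alive where the kernel would rely on RCU:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_NPAGES     16
#define TOY_PAGE_SHIFT 12

/* All names here are hypothetical stand-ins, not kernel types. */
struct toy_vma {
    unsigned long start, end;           /* covers [start, end) */
};

struct toy_mm {
    pthread_mutex_t pt_lock;            /* plays the page table spinlock */
    atomic_ulong map_gen;               /* assumption: bumped on every unmap */
    struct toy_vma *_Atomic cached_vma; /* plays the vma lookup cache */
    void *page_cache[TOY_NPAGES];       /* plays the page cache (static here) */
    void *pte[TOY_NPAGES];              /* plays the page table */
};

enum toy_result { TOY_DONE, TOY_PUNT };

void toy_mm_init(struct toy_mm *mm)
{
    int rc = pthread_mutex_init(&mm->pt_lock, NULL);
    assert(rc == 0);
    (void)rc;                           /* keep NDEBUG builds warning-free */
    atomic_init(&mm->map_gen, 0);
    atomic_init(&mm->cached_vma, NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        mm->page_cache[i] = NULL;
        mm->pte[i] = NULL;
    }
}

/* Teardown has to take the page table lock, which is what makes the
 * re-check in the fast path meaningful. */
void toy_unmap(struct toy_mm *mm)
{
    pthread_mutex_lock(&mm->pt_lock);
    atomic_fetch_add(&mm->map_gen, 1);
    atomic_store(&mm->cached_vma, NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++)
        mm->pte[i] = NULL;
    pthread_mutex_unlock(&mm->pt_lock);
}

/* Fast path: never takes a semaphore; punts on anything complicated. */
enum toy_result toy_fault_fast(struct toy_mm *mm, unsigned long addr)
{
    /* Step 1: look up the vma in the lookup cache, without any lock. */
    unsigned long gen = atomic_load(&mm->map_gen);
    struct toy_vma *vma = atomic_load(&mm->cached_vma);
    if (vma == NULL || addr < vma->start || addr >= vma->end)
        return TOY_PUNT;

    /* Step 2: look up the page in the page cache; a miss would need IO. */
    unsigned long idx = addr >> TOY_PAGE_SHIFT;
    if (idx >= TOY_NPAGES || mm->page_cache[idx] == NULL)
        return TOY_PUNT;
    void *page = mm->page_cache[idx];

    /* Step 3: get the page table lock. */
    pthread_mutex_lock(&mm->pt_lock);

    /* Step 4: re-check the vma; it cannot be torn down while we hold
     * the lock, so a successful check stays valid until we unlock. */
    enum toy_result res = TOY_PUNT;
    if (atomic_load(&mm->map_gen) == gen &&
        atomic_load(&mm->cached_vma) == vma) {
        mm->pte[idx] = page;
        res = TOY_DONE;
    }
    pthread_mutex_unlock(&mm->pt_lock);
    return res;
}
```

A caller that gets TOY_PUNT would fall back to the ordinary path that takes
the semaphore; only the cached, non-IO case is handled here.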

Basically, to me, the whole "if a lock is so contended that we need to
play locking games, then we should look at why we *use* the lock,
rather than at the lock itself" is a religion.

Linus