Re: [BUG] random kernel crashes after THP rework on s390 (maybe also on PowerPC and ARM)

From: Linus Torvalds
Date: Tue Feb 23 2016 - 12:46:16 EST


On Tue, Feb 23, 2016 at 2:32 AM, Kirill A. Shutemov
<kirill@xxxxxxxxxxxxx> wrote:
>
> I still worry about pmd_present(). It looks wrong to me. I wonder if
> the patch below makes a difference.

Let's hope that's it, but in the meantime I do want to start the
discussion about what to do if it isn't. We're at rc5, and 4.5 is just
a few weeks away, and so far this issue hasn't gone anywhere.

So the *good* scenario is that your pmd_present() patch fixes it, and
we can all take a relieved breath.

But if not, what then? It looks like we have two options:

(a) do a (hopefully minimal) revert.

I say "hopefully minimal", but I suspect the revert is going to
have to undo pretty much all of the core THP changes. I'd hate to see
that, because I really liked the cleanups.

(b) mark THP as "depends on !S390" in the 4.5 release.

The (b) option is obviously much simpler, but it's a regression. I
really don't like it, even if it generally shouldn't be the kind of
regression that is actually user-noticeable (apart from performance).
I also hate the fact that while the problem only seems to happen on
s390, we don't even understand it, so maybe it's a more generic issue
that for some reason just ends up being *much* more noticeable on one
odd architecture that happens to be a bit different.

I'm inclined to think of (b) as just a "give us more time to figure it
out" thing, but I'm also worried that it will then make people not
pursue this issue.

How big is a revert patch that makes THP work on s390 again? Can we do
a revert that keeps the infrastructure intact and makes it easy to
revisit the THP cleanups later? Or is the revert inevitably going to
be all the core patches in that series?

Linus