Re: [bisect] Merge tag 'mmc-v4.6' of git://git.linaro.org/people/ulf.hansson/mmc (was [GIT PULL] MMC for v.4.6)

From: Linus Torvalds
Date: Sun Apr 03 2016 - 07:54:15 EST


On Sat, Apr 2, 2016 at 9:56 PM, Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:
>
> Note how mmc1 => mmcblk0 and mmc0 => mmcblk1.
>
> This produces a failure to boot as the wrong partition is mounted as
> root (/dev/mmcblk0p2 is now on the wrong mmc).

It *looks* very much like somebody is doing asynchronous probing of
the bus, meaning that the devices get probed in random order.

And that "random order" is admittedly probably usually fairly static
on any particular hardware platform, but then something happens to
change timing, and...

This is why you should never probe the actual *bus* asynchronously,
just do the end-point setup async. For example, you'd enumerate ports
(and assign devices to the ports) synchronously, but then after device
assignment the actual device probing can be async.
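
A minimal sketch of that split, as a toy user-space model in Python rather
than kernel code. The names `slow_probe` and `enumerate_then_probe` and the
`mmcblkN` naming scheme are assumptions made for illustration; this is not
how the mmc core is actually written.

```python
import concurrent.futures
import random
import time


def slow_probe(port, name):
    # Hypothetical stand-in for slow end-point setup; the delay is random
    # to mimic timing differences between boots.
    time.sleep(random.uniform(0, 0.01))
    return (port, name)


def enumerate_then_probe(ports):
    # Step 1 (synchronous): walk the ports in a fixed order and assign
    # device names, so naming never depends on timing.
    names = {port: f"mmcblk{index}" for index, port in enumerate(ports)}

    # Step 2 (asynchronous): the slow per-device setup may finish in any
    # order, but the names were already decided above.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(slow_probe, port, names[port]) for port in ports]
        finished = [f.result() for f in concurrent.futures.as_completed(futures)]

    return names, finished
```

Calling `enumerate_then_probe(["mmc0", "mmc1"])` repeatedly should give the
same `names` mapping every time, even though the order of `finished` can
vary from run to run.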

> The bisect tried all the mmc tree patches which were all good.
> I double-checked by cloning the mmc tree and building both mmc-v4.6
> and v4.5-rc6, and both tested good.
>
> I interpret that to mean some change in mmc + some new behavior elsewhere
> for v4.6 is causing this. Any ideas?

Hmm. If it really is just timing, it could have been around forever,
and just hidden by the fact that normally mmc0 gets probed before
mmc1, but then some other probing thing slowed down or the exact
details of the async workqueue scheduling changed, and now mmc1 just
*happens* to get probed first.
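
To illustrate the kind of timing dependence being guessed at here, the toy
Python model below hands out the next free `mmcblkN` name to whichever probe
finishes first. Everything in it (the function `name_on_completion`, the
per-host delays, the naming counter) is hypothetical; it only shows how a
latent ordering bug can stay hidden until some delay changes, and makes no
claim about what the mmc code really does.

```python
import itertools
import threading
import time


def name_on_completion(delays):
    """delays maps a host name to the seconds its probe takes (made-up values).

    Returns a dict mapping each host to the block name it ended up with.
    """
    counter = itertools.count()
    lock = threading.Lock()
    names = {}

    def probe(host, delay):
        time.sleep(delay)  # simulated probing work
        with lock:
            # The name is picked at completion time, so it depends on
            # which probe happens to finish first.
            names[host] = f"mmcblk{next(counter)}"

    threads = [
        threading.Thread(target=probe, args=(host, delay))
        for host, delay in delays.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return names
```

With `{"mmc0": 0.0, "mmc1": 0.3}` the mapping comes out in the expected
order; swapping the two delays swaps the block names, without any change to
the naming code itself.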

The thing that changed scheduling order could easily have come from
some non-mmc change.

NOTE! I have nothing to back this up except that (a) we've had
problems like this before and (b) it does look from your dmesg that
mmcX is simply probed in the "wrong" order. I didn't look at exactly
what mmc does or who does the probing.

Maybe Ulf can explain what it is that is _supposed_ to keep the mmc
probe order stable. Ulf?

Linus