Re: T400 suspend/resume regression -- bisected to a mystery merge commit

From: Martin Schwidefsky
Date: Fri Oct 02 2009 - 04:02:39 EST


On Thu, 1 Oct 2009 18:21:50 -0700 (PDT)
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

>
>
> On Thu, 1 Oct 2009, Theodore Tso wrote:
> >
> > commit 8c3ee48dabee782d470cc4c7048ea64bb8b7d1cb
> > Author: Theodore Ts'o <tytso@xxxxxxx>
> > Date: Thu Oct 1 20:39:03 2009 -0400
> >
> > Revert "timekeeping: Update clocksource with stop_machine"
> >
> > This reverts commit 75c5158f70c065b9704b924503d96e8297838f79.
>
> Hmm. Looks good. But you didn't cc most of the people actually involved
> with that commit (Martin who is the author, and John who acked it).
>
> I think the revert is the right thing to do, especially as that
> 'clocksource_mutex' looks totally bogus. Either the thing is protected by
> 'stop_machine' or it's not. In neither case does it seem to make any sense
> to replace a spinlock with a mutex.
>
> And resuming anything with a big mutex is crazy anyway.
>
> That said, I do wonder if this is already fixed. See commit
> 89133f93508137231251543d1732da638e6022e1:
>
> clocksource: Resume clocksource without taking the clocksource mutex
>
> which already undid the part that probably mattered for you. That said, I
> still do think that that mutex is dubious, so maybe we should undo it all.

The whole clocksource rework started with the wish to get rid of the
change_clocksource call in update_wall_time. That call is unnecessary
in 99.9% of all cases, as the clocksource does not change all the time.
To get rid of it, the new clocksource is activated with stop_machine.
Now if you use stop_machine you cannot hold any spinlock, since
stop_machine may sleep, which made it necessary to convert the
clocksource spinlock to a mutex. And we still need something to protect
against concurrent clocksource changes, e.g. a clocksource_register
racing with a clocksource_change_rating triggered by the watchdog.
The only other solution would be to split the clocksource change in two:
part 1) detect that a clocksource change is needed, part 2) let
stop_machine do the clocksource selection. Right now the clocksource
to use is identified prior to the stop_machine call.
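
To make the split variant a bit more concrete, here is a rough userspace
sketch of the idea. It is not kernel code and not the actual patch: all
demo_* names are made up for illustration, and a pthread mutex stands in
both for the short lock in part 1 and for the exclusivity that
stop_machine would provide in part 2.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_SOURCES 8

/* Hypothetical, simplified stand-in for a clocksource (not the kernel struct). */
struct demo_clocksource {
	const char *name;
	int rating;
};

static struct demo_clocksource *demo_sources[DEMO_MAX_SOURCES];
static size_t demo_count;
static struct demo_clocksource *demo_current;
static bool demo_reselect_needed;

/* Assumption: one lock models all mutual exclusion in this sketch. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Part 1: register a source and only note that a reselection is needed. */
int demo_register(struct demo_clocksource *cs)
{
	int ret = -1;

	assert(cs != NULL);
	pthread_mutex_lock(&demo_lock);
	if (demo_count < DEMO_MAX_SOURCES) {
		demo_sources[demo_count++] = cs;
		demo_reselect_needed = true;
		ret = 0;
	}
	pthread_mutex_unlock(&demo_lock);
	return ret;
}

/* Part 1 again: a rating change (e.g. from a watchdog) also just sets the flag. */
void demo_change_rating(struct demo_clocksource *cs, int rating)
{
	pthread_mutex_lock(&demo_lock);
	cs->rating = rating;
	demo_reselect_needed = true;
	pthread_mutex_unlock(&demo_lock);
}

/*
 * Part 2: the selection itself. In the kernel this step would be the
 * callback run under stop_machine; here the mutex models that exclusivity.
 * The highest-rated source is picked at this point, not beforehand.
 */
struct demo_clocksource *demo_select_if_needed(void)
{
	struct demo_clocksource *best = NULL;
	size_t i;

	pthread_mutex_lock(&demo_lock);
	if (demo_reselect_needed) {
		for (i = 0; i < demo_count; i++) {
			if (!best || demo_sources[i]->rating > best->rating)
				best = demo_sources[i];
		}
		demo_current = best;
		demo_reselect_needed = false;
	}
	best = demo_current;
	pthread_mutex_unlock(&demo_lock);
	return best;
}
```

The point of the sketch is only the ordering: part 1 never picks a
clocksource, so a register and a rating change cannot disagree about
which one wins. Whether this is preferable to the mutex in the real code
is a separate question.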

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.
