Re: [PATCH 4.14 05/69] x86/power: Make restore_processor_context() sane

From: Sasha Levin
Date: Tue Apr 16 2019 - 09:22:08 EST


On Tue, Apr 16, 2019 at 01:56:44PM +0200, Pavel Machek wrote:
> On Mon 2019-04-15 13:18:56, Andy Lutomirski wrote:
> > On Mon, Apr 15, 2019 at 1:04 PM Pavel Machek <pavel@xxxxxxx> wrote:
> > >
> > > On Mon 2019-04-15 20:58:23, Greg Kroah-Hartman wrote:
> > > > [ Upstream commit 7ee18d677989e99635027cee04c878950e0752b9 ]
> > > >
> > > > My previous attempt to fix a couple of bugs in __restore_processor_context():
> > > >
> > > > 5b06bbcfc2c6 ("x86/power: Fix some ordering bugs in __restore_processor_context()")
> > > >
> > > > ... introduced yet another bug, breaking suspend-resume.
> > > >
> > > > Rather than trying to come up with a minimal fix, let's try to clean it up
> > > > for real. This patch fixes quite a few things:
> > >
> > > 5b06bbcfc2c6 fixed a theoretical bug; rather than porting it to stable
> > > and then fixing it up, it would be better not to port it to stable in
> > > the first place, or simply to revert it there.
> >
> > Are you sure about that? The bug was reported by real users who had
> > their systems really crash:
> >
> > https://lore.kernel.org/lkml/?q=0fede9f9-88b0-a6e7-1027-dfb2019b8ef2%40linux.intel.com
> >
> > https://lore.kernel.org/lkml/CA+55aFwsMuHUBQz5kDNwRf17JnasXMWjvmLq5qXGH-694yeq1w@xxxxxxxxxxxxxx/
> >
> > And we had a report that the bug got backported:
> >
> > https://lore.kernel.org/stable/20190407160005.djiw4reapwvbxmgo@debian/
> >
> > And if we're going to backport some of the fix, we should definitely
> > backport the whole set to avoid having the -stable kernels be in a
> > state that was never in any released kernel.
>
> I agree it should be all or nothing. And I may have been slightly
> confused.
>
> Anyway: 5b06bbcfc2c6 is a fix for ca37e57bbe0cf, and that one is for a
> "mostly harmless warning" (quoting the changelog). ca37e57bbe0cf is not
> present in 4.4.178; I believe the best solution is not to add that one
> in the first place, so we don't have to fix it up.

5b06bbcfc2c6 describes in its commit message two bugs that it fixes:
one, as you pointed out, is an issue introduced by ca37e57bbe0cf. The
other one, which to my understanding is unrelated to ca37e57bbe0cf,
fixes "resume when the userspace process that initiated suspend had
funny segments", and the commit message provides a straightforward test
case for that issue.


--
Thanks,
Sasha