Re: 2.6.35.5: hibernation broken... AGAIN

From: Hugh Dickins
Date: Wed Nov 17 2010 - 18:47:13 EST


On Wed, 17 Nov 2010, Ondrej Zary wrote:
> On Wednesday 17 November 2010 22:12:01 Rafael J. Wysocki wrote:
> > On Wednesday, November 17, 2010, Andrew Morton wrote:
> > > On Wed, 17 Nov 2010 21:53:52 +0100
> > > "Rafael J. Wysocki" <rjw@xxxxxxx> wrote:
> > > > On Wednesday, November 17, 2010, Ondrej Zary wrote:
> > > > > Hello,
> > > > > the nasty memory-corrupting hibernation bug
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=15753 is back since
> > > > > 2.6.35.5. 2.6.35.4 works fine, 2.6.35.5 crashes after two days.

That's distressing, for both of us and for everyone else: I'm sorry.

> > > > >
> > > > > It seems to be caused by b77c254d8d66e5e9aa81239fedba9f3d568097d9.
> >
> > > commit b77c254d8d66e5e9aa81239fedba9f3d568097d9
> > > Author: Hugh Dickins <hughd@xxxxxxxxxx>
> > > Date: Thu Sep 9 16:38:09 2010 -0700
> > >
> > > swap: prevent reuse during hibernation

Embarrassing: I suspect that I've been confused, not for the first
time, by the fork-like nature of hibernation and its images.
I wonder if this patch below fixes it, Ondrej?

(And is it kernel swsusp or user swsusp that you're using? May not
matter at all, but will help us to think more clearly about it,
if the corruption remains after this patch.)
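
For anyone less familiar with what "fork-like" means here, a rough
userspace sketch, purely illustrative and not kernel code: setjmp()
stands in for create_image(), which returns once after the image has
been created (in_suspend set) and once more after that image has been
restored (in_suspend clear). Every toy_* name and both mask values are
hypothetical stand-ins, and the narrowing of the mask before the
snapshot is an assumption drawn from the "force GFP_NOIO" commit title.
The conditional mirrors the patch below: the saved mask is put back
only on error or on the post-restore return.

```c
#include <assert.h>
#include <setjmp.h>

/* All toy_* names are hypothetical stand-ins, not kernel symbols. */
#define TOY_MASK_FULL 0x3u	/* I/O and FS allocations allowed */
#define TOY_MASK_NOIO 0x0u	/* neither allowed */

static jmp_buf toy_image;	/* plays the role of the saved image */
static int toy_in_suspend;	/* 1 while snapshotting, 0 after restore */
static unsigned int toy_gfp_mask = TOY_MASK_FULL;
static unsigned int toy_seen[2];	/* mask observed on each pass */

/* Returns how often control passed the point after "create_image". */
static int toy_hibernation_snapshot(void)
{
	volatile int passes = 0;	/* volatile: must survive longjmp() */
	const unsigned int saved_mask = toy_gfp_mask;
	const int error = 0;		/* this sketch never fails */

	toy_gfp_mask = TOY_MASK_NOIO;	/* assumed: narrowed before snapshot */
	toy_in_suspend = 1;

	if (setjmp(toy_image) != 0)	/* toy create_image() */
		toy_in_suspend = 0;	/* second return: image was restored */

	/* Patched logic: restore the mask only on error or after restore. */
	if (error || !toy_in_suspend)
		toy_gfp_mask = saved_mask;

	assert(passes < 2);
	toy_seen[passes] = toy_gfp_mask;
	passes++;

	if (toy_in_suspend)
		longjmp(toy_image, 1);	/* pretend: write image, resume later */

	return passes;
}
```

Calling toy_hibernation_snapshot() once passes the same point twice:
toy_seen[0] still holds the narrowed mask (the image has yet to be
written out), and toy_seen[1] holds the restored one.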

Rafael, do you agree that this patch was actually required even for
your original commit 452aa6999e6703ffbddd7f6ea124d3968915f3e3
("mm/pm: force GFP_NOIO during suspend/hibernation and resume")?

Or am I still just as confused? Or if not, are there more forking
places which require a similar patch?

Not signing it off yet,
Hugh

---

kernel/power/hibernate.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

--- 2.6.37-rc2/kernel/power/hibernate.c 2010-11-01 13:01:31.000000000 -0700
+++ linux/kernel/power/hibernate.c 2010-11-17 15:23:36.000000000 -0800
@@ -348,16 +348,19 @@ int hibernation_snapshot(int platform_mo
 		goto Recover_platform;

 	error = create_image(platform_mode);
-	/* Control returns here after successful restore */
+	/* Control returns here after successful restore,
+	 * and also after creating the image in memory (or failing to do so).
+	 */

  Resume_devices:
 	/* We may need to release the preallocated image pages here. */
-	if (error || !in_suspend)
+	if (error || !in_suspend) {
 		swsusp_free();
+		set_gfp_allowed_mask(saved_mask);
+	}

 	dpm_resume_end(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
-	set_gfp_allowed_mask(saved_mask);
 	resume_console();
  Close:
 	platform_end(platform_mode);