Re: hibernate random memory corruption, workaround i915.modeset=0

From: Stanislaw Gruszka
Date: Mon Mar 19 2012 - 10:54:12 EST


On Mon, Feb 27, 2012 at 01:42:43PM +0100, Stanislaw Gruszka wrote:
> I'm able to reproduce random memory corruption after hibernate.
> The corruption is not reproducible when I disable mode setting, which
> seems to implicate the i915 driver or generic DRM kernel code.
>
> I'm able to reproduce the bug on Fedora 11 with the 2.6.30 kernel (the
> first Fedora with KMS support) and on the latest 3.3-rc kernels. So this
> issue has been there from the very beginning, hence it is not bisectable.
>
> I'm attaching a script to reproduce it (with an accompanying memory
> checker program). The script is basically a sequence of hibernate -
> reset - check memory. The kernel should be compiled with
> CONFIG_DEBUG_SLAB to detect poison/redzone overwrites.
>
> I already tried to debug this using CONFIG_DEBUG_PAGEALLOC and the new
> kernel option debug_guardpage_minorder, but without any success.
> It seems the corruption happens behind the CPU MMU, i.e. it is a DMA
> unit programming bug. I'm not able to turn on the IOMMU on that hardware.
>
> This happens on a T500 laptop; lspci output is attached.
>
> I'm also attaching dmesg output with poison/redzone overwrites from
> the 3.3-rc4 and 2.6.30 kernels.
>
> Some more information can be found at:
> https://bugzilla.redhat.com/show_bug.cgi?id=746169
> https://bugzilla.redhat.com/show_bug.cgi?id=701857
>
> i.e. there is an invalid DMA address warning that could be a good hint:
> https://bugzilla.redhat.com/show_bug.cgi?id=746169#c7
>
> I would appreciate any help with solving this issue. I think many
> people are hitting this, but since the corruption happens at random,
> not many people notice it, or when they do notice, they do not realize
> that this could be an i915/DRM issue.
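
For readers without the attachments, the memory checker idea from the
quoted message can be sketched roughly as follows. This is not the
attached program, only a minimal Python illustration. It assumes the
checker fills a buffer with a known byte, stays resident across the
hibernate cycle, and then scans for bytes that changed. The names fill
and find_corruption and the 0x5A pattern are hypothetical.

```python
# Minimal sketch of a pattern-based memory checker (illustrative only,
# not the program attached to the original report).

PATTERN = 0x5A  # hypothetical fill byte; any fixed value works


def fill(size, pattern=PATTERN):
    """Allocate a buffer of `size` bytes, all set to `pattern`."""
    return bytearray([pattern]) * size


def find_corruption(buf, pattern=PATTERN, chunk=4096):
    """Return a list of (offset, value) for bytes that no longer match."""
    expected = bytes([pattern]) * chunk
    bad = []
    for start in range(0, len(buf), chunk):
        block = buf[start:start + chunk]
        # Fast path: compare a whole chunk before scanning byte by byte.
        if block == expected[:len(block)]:
            continue
        for i, value in enumerate(block):
            if value != pattern:
                bad.append((start + i, value))
    return bad
```

In a reproduction run one would call fill() before hibernating and
find_corruption() after resume; an empty list means no overwrite was
seen in that buffer. Such a check only covers the pages this process
owns, which is why the kernel-side CONFIG_DEBUG_SLAB checks are still
needed.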

So, after googling a bit, I found out that we are writing pixels into
memory, and the issue has been known since at least 2010:
http://codemonkey.org.uk/2012/03/12/i915-hibernate-memory-corruption/
https://bugzilla.novell.com/show_bug.cgi?id=697699
https://bugzilla.kernel.org/show_bug.cgi?id=13811
https://bugzilla.kernel.org/show_bug.cgi?id=37142

Keith, is there a chance that this bug can be fixed by the i915 team?
If not, can we disable hibernate on i915 with modeset=1 and add a
module option that enables it for those who want to take the risk?

Thanks
Stanislaw
