Suspend regression on 3.7-rc with nouveau

From: Herton Ronaldo Krzesinski
Date: Tue Nov 13 2012 - 12:58:13 EST


Hi,

Since 3.7-rc1 I have been unable to suspend my "Optimus"-based laptop:
the suspend aborts. Here is the relevant log snippet, with nouveau
debugging enabled up to trace level:

[ 65.941744] nouveau [ DRM] suspending fbcon...
[ 65.941746] nouveau [ DRM] suspending display...
[ 65.942108] nouveau [ DRM] unpinning framebuffer(s)...
[ 65.942176] nouveau [ DRM] evicting buffers...
[ 65.942749] ACPI handle has no context!
[ 65.942941] nouveau [ DRM] suspending client object trees...
[ 65.942942] nouveau D[ 1281] suspend running
[ 65.942944] nouveau T[ 1281] 0xffffffff:0xffffffff suspend children
[ 65.942946] nouveau T[ 1281] 0xffffffff:0xffffffff suspend running
[ 65.942947] nouveau T[ 1281] use(-1) == 1
[ 65.942949] nouveau T[ 1281] 0xffffffff:0xffffffff suspend completed
[ 65.942949] nouveau D[ 1281] suspend completed with 0
[ 65.942950] nouveau D[ DRM] suspend running
[ 65.942952] nouveau T[ DRM] 0xffffffff:0xffffffff suspend children
[ 65.942953] nouveau T[ DRM] 0xffffffff:0xdddddddd suspend children
[ 65.942954] nouveau T[ DRM] 0xdddddddd:0xcccc0000 suspend children
[ 65.942956] nouveau T[ DRM] 0xcccc0000:0x000490b5 suspend children
[ 65.942957] nouveau T[ DRM] 0xcccc0000:0x000490b5 suspend running
[ 65.942959] nouveau T[ PCE0][0000:01:00.0][0x000090b5][ffff880132932920] use(-1) == 0
[ 65.942961] nouveau T[ PCE0][0000:01:00.0][0x000090b5][ffff880132932920] suspending...
[ 65.942962] nouveau T[ PCE0][0000:01:00.0] use(-1) == 1
[ 65.942964] nouveau T[ PCE0][0000:01:00.0][0x0300c01b][ffff8801373fbc00] use(-1) == 0
[ 65.942965] nouveau T[ PCE0][0000:01:00.0][0x0300c01b][ffff8801373fbc00] suspending...
[ 65.942993] nouveau E[ PFIFO][0000:01:00.0] write fault at 0x0000000000 [PAGE_NOT_PRESENT] from PCOPY0/PCOPY0 on channel 0x003fe1b000
[ 65.949760] ACPI handle has no context!
[ 66.120887] e1000e 0000:00:19.0: wake-up capability enabled by ACPI
[ 66.237094] i915 0000:00:02.0: power state changed by ACPI to D3hot
[ 67.938439] nouveau E[ PFIFO][0000:01:00.0] channel 0 kick timeout
[ 67.938441] nouveau E[ PFIFO][0000:01:00.0][0xc000906f][ffff880132ad2480] failed to detach PCE0 context, -16
[ 67.938442] nouveau E[ PCE0][0000:01:00.0][0x0300c01b][ffff8801373fbc00] failed suspend, -16
[ 67.938443] nouveau W[ PCE0][0000:01:00.0][0x000090b5][ffff880132932920] parent failed suspend, -16
[ 67.938448] nouveau T[ PCE0][0000:01:00.0] use(+1) == 2
[ 67.938450] nouveau E[ DRM] 0xcccc0000:0x000490b5 suspend failed with -16
[ 67.938451] nouveau E[ DRM] 0xdddddddd:0xcccc0000 suspend failed with -16
[ 67.938452] nouveau E[ DRM] 0xffffffff:0xdddddddd suspend failed with -16
[ 67.938453] nouveau E[ DRM] 0xffffffff:0xffffffff suspend failed with -16
[ 67.938455] nouveau D[ DRM] suspend completed with -16
[ 67.938455] nouveau D[ 1281] init running
[ 67.938456] nouveau T[ 1281] 0xffffffff:0xffffffff init running
[ 67.938457] nouveau T[ 1281] use(+1) == 2
[ 67.938458] nouveau T[ 1281] 0xffffffff:0xffffffff init children
[ 67.938459] nouveau T[ 1281] 0xffffffff:0xffffffff init completed
[ 67.938460] nouveau D[ 1281] init completed with 0
[ 67.938461] nouveau [ DRM] resuming display...
[ 68.241664] pci_legacy_suspend(): nouveau_drm_suspend+0x0/0x270 [nouveau] returns -16
[ 68.241683] dpm_run_callback(): pci_pm_suspend+0x0/0x140 returns -16
[ 68.241685] PM: Device 0000:01:00.0 failed to suspend async: error -16
[ 68.241810] PM: Some devices failed to suspend
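
For reading the log above: the -16 that propagates up the suspend path is
-EBUSY on Linux. The following is only a minimal Python sketch for picking
the interesting lines out of a saved dmesg; it is not part of any kernel or
nouveau tooling, and the function names and the regular expression are
assumptions made up for illustration. It decodes a trailing negative errno
and finds the first nouveau line logged at error level (the "E[" marker),
which in this log is the PFIFO write fault that precedes the kick timeout.

```python
import errno
import re

# Hypothetical pattern: a trailing negative errno such as
# "failed suspend, -16", "suspend failed with -16" or "returns -16".
ERR_RE = re.compile(r"(?:,|with|returns|error)\s+-(\d+)\s*$")


def decode_errno(line):
    """Return the symbolic errno name for a line ending in -N, else None."""
    m = ERR_RE.search(line)
    if not m:
        return None
    return errno.errorcode.get(int(m.group(1)))


def first_nouveau_error(lines):
    """Return the first nouveau line logged at error level, else None."""
    for line in lines:
        if "nouveau E[" in line:
            return line
    return None


# A few lines copied from the log in this report.
SAMPLE = [
    "[ 65.942949] nouveau D[ 1281] suspend completed with 0",
    "[ 65.942993] nouveau E[ PFIFO][0000:01:00.0] write fault at 0x0000000000 "
    "[PAGE_NOT_PRESENT] from PCOPY0/PCOPY0 on channel 0x003fe1b000",
    "[ 67.938439] nouveau E[ PFIFO][0000:01:00.0] channel 0 kick timeout",
    "[ 67.938442] nouveau E[ PCE0][0000:01:00.0][0x0300c01b][ffff8801373fbc00] "
    "failed suspend, -16",
    "[ 68.241685] PM: Device 0000:01:00.0 failed to suspend async: error -16",
]

if __name__ == "__main__":
    print(first_nouveau_error(SAMPLE))
    for line in SAMPLE:
        name = decode_errno(line)
        if name:
            print(name, "<-", line)
```

To run it against a full log, read the lines from the decompressed dmesg
file instead of using SAMPLE.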

I attached the almost full dmesg (gzipped). I did some investigation and
tried to bisect, but the code was shuffled around a lot, so the bisection
isn't very useful; the problem is well isolated to the suspend path anyway.
I started to have problems with commit
e193b1d42c390bf1bff7fa02a5a1202b98e75601, but with that commit as HEAD the
machine actually freezes when trying to suspend, probably in
nvc0_fence_suspend (not checked). The code then changed a lot again with
commit ebb945a94bba2ce8dff7b0942ff2b3f2a52a0a69, after which the failure
seems to come from nvc0_fifo_context_detach. With the latest 3.7-rc5 I
can't suspend, as the log shows, but at least the machine doesn't freeze.

--
[]'s
Herton

Attachment: dmesg-52.txt.gz
Description: Binary data