hung bootup with "drm/radeon/kms: move radeon KMS on/off switch out of staging."

From: Ingo Molnar
Date: Thu Feb 04 2010 - 02:19:06 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

> * Dave Airlie <airlied@xxxxxxxxx> wrote:
>
> > On Wed, Feb 3, 2010 at 1:46 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
> > >
> > > * Dave Airlie <airlied@xxxxxxxxx> wrote:
> > >
> > >> On Tue, Feb 2, 2010 at 6:17 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
> > >> >
> > >> > * Dave Airlie <airlied@xxxxxxxx> wrote:
> > >> >
> > >> >> > Hi Linus,
> > >> >> >
> > >> >> > Please pull the 'drm-linus' branch from
> > >> >> > ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-linus
> > >> >> >
> > >> >>
> > >> >> I've also added an oops fix I seem to lose off my radar to this tree.
> > >> >>
> > >> >> commit 17aafccab4352b422aa01fa6ebf82daff693a5b3
> > >> >> Author: Michel Dänzer <daenzer@xxxxxxxxxx>
> > >> >> Date:   Fri Jan 22 09:20:00 2010 +0100
> > >> >>
> > >> >>     drm/radeon/kms: Fix oops after radeon_cs_parser_init() failure.
> > >> >
> > >>
> > >> Weird, this suggests something else is wrong on that machine. Can you get me
> > >> the whole dmesg? I'm guessing some iommu or swiotlb issue.
> > >
> > > This box has no known hardware or software problems, just this week it booted
> > > in excess of 1000 kernels so i'd exclude that angle for now.
> > >
> > > I have bisected the crash back to the DRM tree and the crash went away with
> > > the Kconfig revert i applied - and it got fixed by Jerome's patch. I posted
> > > my config and i posted the relevant boot log as well. Find below the full
> > > bootlog as well with vanilla -git (ab65832) and the config. (i don't think it
> > > matters)
> > >
> > >> I've asked Jerome to fix the oops, but really anyone with an old .config
> > >> won't get hit by this, and we've booted this on quite a lot of machines at
> > >> this point.
> > >
> > > I don't see the commit in yesterday's linux-next. It has very fresh
> > > timestamps:
> > >
> > >  commit f71d0187987e691516cd10c2702f002c0e2f0edc
> > >  Author:     Dave Airlie <airlied@xxxxxxxxxx>
> > >  AuthorDate: Mon Feb 1 11:35:47 2010 +1000
> > >  Commit:     Dave Airlie <airlied@xxxxxxxxxx>
> > >  CommitDate: Mon Feb 1 11:35:47 2010 +1000
> > >
> > > What kind of widespread testing could this commit have gotten in the less
> > > than 24 hours before it hit mainline?
> > >
> >
> > It's shipping in a major distro by default, and it's planned to be shipped in an
> > even more major distro. It's been boot tested on 1000s of machines by 1000s
> > of ppl.
>
> Well but that's not the precise tree you sent to Linus, is it?

btw., i just found another bug activated via this same commit, a boot hang
after DRM init:

[ 9.858352] [drm] Connector 1:
[ 9.861417] [drm] DVI-I
[ 9.864031] [drm] HPD1
[ 9.866562] [drm] DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
[ 9.872579] [drm] Encoders:
[ 9.875540] [drm] CRT2: INTERNAL_DAC2
[ 9.879541] [drm] DFP1: INTERNAL_TMDS1
[ 9.883646] [drm] Connector 2:
[ 9.886695] [drm] S-video
[ 9.889483] [drm] Encoders:
[ 9.892463] [drm] TV1: INTERNAL_DAC2
[ 9.896392] i2c i2c-0: master_xfer[0] W, addr=0x50, len=1
[ 9.901796] i2c i2c-0: master_xfer[1] R, addr=0x50, len=128
[ 9.909246] i2c i2c-0: NAK from device addr 0x50 msg #0
[ 9.914564] i2c i2c-1: master_xfer[0] W, addr=0x50, len=1
[ 9.919957] i2c i2c-1: master_xfer[1] R, addr=0x50, len=128
[ 9.927413] i2c i2c-1: NAK from device addr 0x50 msg #0

(i power cycled the box after 45 minutes of waiting.)

The hang goes away if i revert commit f71d0187987e6 via the patch below, the
boot sequence becomes:

[ 9.068911] calling drm_core_init+0x0/0x137 @ 1
[ 9.073617] [drm] Initialized drm 1.1.0 20060810
[ 9.078232] initcall drm_core_init+0x0/0x137 returned 0 after 4586 usecs
[ 9.120162] [drm] radeon defaulting to userspace modesetting.
[ 9.154295] [drm] Initialized radeon 1.31.0 20080528 for 0000:01:00.0 on minor 0

Config and bootlog attached.

Ingo

------------->