Re: hung bootup with "drm/radeon/kms: move radeon KMS on/off switchout of staging."

From: Ingo Molnar
Date: Thu Feb 04 2010 - 14:33:04 EST



* Alex Deucher <alexdeucher@xxxxxxxxx> wrote:

> On Thu, Feb 4, 2010 at 2:06 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> > * Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
> >
> >> On Thu, Feb 4, 2010 at 1:12 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
> >> >
> >> > * Matthew Garrett <mjg59@xxxxxxxxxxxxx> wrote:
> >> >
> >> >> On Thu, Feb 04, 2010 at 06:54:45PM +0100, Ingo Molnar wrote:
> >> >>
> >> >> > But you could claim that it's not a regression because 1) technically the
> >> >> > code got introduced in drivers/staging/, and staging drivers are not on
> >> >> > the regression list 2) the Kconfig value is default-off so it can only
> >> >> > harm those who got lured by a new Kconfig value popping up in -rc7 in a
> >> >> > well working driver they already have enabled.
> >> >> >
> >> >> > So the moving of driver functionality from drivers/staging/ to drivers/
> >> >> > is a grey area it appears. Wouldnt it have been better to do this in the
> >> >> > next merge window, as all other drivers do? It's not new hardware
> >> >> > enablement either, it's feature enablement for an existing driver.
> >> >>
> >> >> The reason the option was in staging (as has been mentioned before) was
> >> >> because the ABI wasn't felt to be stable enough. Upstream is now willing to
> >> >> commit to that stability, so now seems as good a time to move it as any.
> >> >> There's no code change and there's no default configuration change, so I
> >> >> really can't see any way that it can be classed as a regression.
> >> >
> >> > But that argument in essence renders the regression policy meaningless for
> >> > such code: just about any new driver feature under the sun could be shaped as
> >> > a Kconfig option, introduced via a drivers/staging Kconfig entry, and then
> >> > activated via a twoliner commit in a later -rc.
> >> >
> >> > IMHO the point of tracking regressions is to reduce the bugginess of the
> >> > kernel and thus to help users, not to give ground for legalistic arguments.
> >> >
> >> > There _are_ common-sense exceptions from the regression rules, such as the
> >> > introduction of a new piece of hardware that was previously unsupported
> >> > (hence there's no expectation of stability) - but the tweaking of an
> >> > existing, widely used driver (even if the new opion is default-off) hardly
> >> > seems to qualify for that.
> >> >
> >>
> >> This is a completely new driver. ?It's only part of the existing drm for
> >> compatibility reasons. ?It requires an entirely different graphics stack
> >> above it and works very differently from the old drm stack.
> >
> > Will the user know? IMHO what matters in the end is user expectation.
> >
> > Lets walk through what a current kernel tester of the drm/radeon driver sees
> > when he types 'make oldconfig' after installing the (to-be-released) .33-rc7
> > kernel. Firstly, the user with a brand-new distro already has this enabled:
> >
> > ?CONFIG_DRM_RADEON=y
> >
> > and knows the driver, and it performs adequately. Then in -rc7 he gets a new
> > option:
> >
> > ?ATI Radeon (DRM_RADEON) [Y/n/?] y
> > ? ?Enable modesetting on radeon by default (DRM_RADEON_KMS) [N/y/?] (NEW)
> >
> > The user might easily go: "Hey this is a driver i already have, and there's a
> > new sub-option for this well-working driver. Sure, enable it, these kernel
> > folks know what they are doing and i rarely see any crashes past -rc2
> > kernels."
> >
> > Does this new option tell him what you just told me, that:
> >
> > ?> This is a completely new driver. ?It's only part of the existing drm for
> > ?> compatibility reasons. ?It requires an entirely different graphics stack
> > ?> above it and works very differently from the old drm stack.
> >
> > ?
> >
> > it doesnt. Even if he types '?', it tells:
> >
> > ?CONFIG_DRM_RADEON_KMS:
> >
> > ?Choose this option if you want kernel modesetting enabled by default,
> > ?and you have a new enough userspace to support this. Running old
> > ?userspaces with this enabled will cause pain.
> >
> > The user will likely go "cool I have a fresh distro with recent Xorg, lets
> > try it".
> >
>
> And if it crashes, he'll report a bug and we'll fix it.

Nobody has reacted to my related boot hang bugreport yet - and it's detailed
and fully reproducible (so i can test any proposed fixes as well in short
order). I.e. my limited testing has triggered two separate bugs in the same
driver - and this will show up in -rc7.

It might be all OK and no-one else will see trouble. Or past patterns might
repeat themselves and i might simply be an early bird for trouble to come.

My (oft repeated) point is that adding new sub-features to existing drivers
is not what we do in late -rc's: there's simply not enough time to shake out
bugs/regressions in them.

We introduce new functionality to existing drivers in the merge window - in
the two weeks following a stable kernel's release.

In late -rc's we only try to fix regressions. Sometimes we make exceptions
for pragmatic reasons, but then we are straightforward about those reasons
and try to warn users about our zeal to help them with cool, new,
not-to-be-missed GPU functionality ;-)

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/