Re: stable? quality assurance?

From: Martin Steigerwald
Date: Sat Sep 04 2010 - 15:11:53 EST


On Saturday 04 September 2010, Ted Ts'o wrote:
> On Sat, Sep 04, 2010 at 06:38:59PM +0200, Martin Steigerwald wrote:
> > During bisecting [Bug 16376] random - possibly Radeon DRM KMS related
> > freezes, which goes very slowly due to having lots of unbootable
> > kernels
>
> > with an ext4 / readahead related backtrace during boot, I had an idea:
> So I'm not sure what you're referring to here. If there's an ext4
> bug, why haven't you reported it to the linux-ext4 list? I've done a
> Google search for "Steigerwald ext4 readahead" and I can't find any
> bug report related to kernel oops that are ext4/readahead-related.
>
> No one else has reported such a bug to me, and I run a complete set of
> regression tests before I push ext4 changes to Linus. So I'm not sure
> what you're seeing. But complaining about it in passing on an e-mail
> without sending a formal bug report to the linux-ext4 mailing list is
> not likely to solve your problem...

Stop! I think we are misunderstanding.

It's a bug I stumbled across during the bisecting process. Neither 2.6.33
nor 2.6.34 is affected, but some kernels in between are. As such, I didn't
think it was worth reporting the bug.

I took a photo of part of the backtrace, though, so if you want I can open
a bug report about it nonetheless. But I really think it was fixed during
the 2.6.33 to 2.6.34 development cycle.

For now I have just skipped the affected kernels in the bisection process,
in the hope that none of them is the last good or first bad one regarding
the freeze bug. Since so far all the kernels that showed this issue belong
to a USB merge, I don't think the freeze bug is in there.

It's from:

# skip: [124d255382ddd37ffa920e9f5183efa54bbfe4f2] USB: pl2303: remove unnecessary reset of usb_device in urbs

to

# skip: [c68bb0d738897ed39b90c7ccb22e01c938117051] USB: cxacru: document how to interact with the flash memory

I did not test booting every single one of those >100 revisions, but got
fed up with this after the fifth non-booting kernel or so. I didn't get why
git bisect insisted on taking me back to this range of commits - even in
between two skips! - instead of just readjusting the binary search so that
the range is reached later in the process. Because then it might not have
been reached again at all. In the end I skipped every commit in this USB
merge manually. The ext4 readahead thing must have been introduced before
that merge and fixed somewhere after it. But I didn't find the commit that
might have fixed it from a quick glance.
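
For reference, a minimal sketch of skipping a whole range in one step
instead of commit by commit. It assumes a git version whose "git bisect
skip" accepts revision ranges. The throwaway repository and the tags c1..c8
are hypothetical stand-ins for the kernel tree and the two USB commits
above.

```shell
#!/bin/sh
# Sketch only: build a small linear history, then skip a range while bisecting.
set -e
repo=$(mktemp -d)            # throwaway repository, not the kernel tree
cd "$repo"
git init -q
git config user.email demo@example.invalid
git config user.name demo

# Eight commits, tagged c1..c8 (hypothetical names for illustration).
for i in 1 2 3 4 5 6 7 8; do
    echo "$i" > file
    git add file
    git commit -q -m "c$i"
    git tag "c$i"
done

# c8 is known bad, c1 is known good.
git bisect start c8 c1

# Skip c4 through c6 inclusive. "A..B" excludes A itself, hence the "^".
git bisect skip c4^..c6

# Whatever git checks out next should lie outside the skipped range.
next=$(git log -1 --format=%s)
echo "next commit to test: $next"
```

In the kernel tree the equivalent would presumably be "git bisect skip
124d2553^..c68bb0d7". Note that in a history with merges, "A..B" means
everything reachable from B but not from A, which can cover more commits
than the two endpoints suggest.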

I do not even know whether it's ext4 related at all, but ext4 and readahead
were both in that backtrace.

So I just wanted to show that I am seriously working on tracking down that
likely Radeon KMS related freeze bug, and that it's time-consuming for me
due to having lots of unbootable kernels. I got another one of these with
"Destination address too large" before the InitRD even seems to have done
anything. I skipped this one commit as well, and now git bisect seems to
have taken me to a good one again, let's see. At least it hasn't frozen up
to now, and I'd better press send now ;-). But going by my bet on where the
offending commit might be, this should be a good one. I am learning a lot
about how to bisect a kernel right now ;).

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
