Re: Analyzed/Solved/Bisected: Booting 2.6.30-rc2-git7 very slow

From: Matthew Wilcox
Date: Wed May 27 2009 - 07:22:04 EST


On Tue, May 26, 2009 at 11:31:02PM -0700, Andrew Morton wrote:
> On Wed, 20 May 2009 03:22:28 -0700 (PDT) Martin Knoblauch <knobi@xxxxxxxxxxxx> wrote:
>
> >
> > ----- Original Message ----
> >
> > > From: Mike Galbraith <efault@xxxxxx>
> > > To: Martin Knoblauch <knobi@xxxxxxxxxxxx>
> > > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; viro@xxxxxxxxxxxxxxxxxx; rjw@xxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; tigran@xxxxxxxxxxxxxxxxxxxx
> > > Sent: Wednesday, May 6, 2009 10:37:45 AM
> > > Subject: Re: Analyzed/Solved: Booting 2.6.30-rc2-git7 very slow
> > >
> > > On Wed, 2009-05-06 at 00:55 -0700, Martin Knoblauch wrote:
> > >
> > > > just to bring this back to my problem :-)
> > >
> > > Good idea :-)
> > >
> > > > > Last week I reported that the "new" sysfs entry in /proc/mounts already comes out of initrd. Does this ring a bell?
> > > >
> > > > http://lkml.indiana.edu/hypermail/linux/kernel/0904.3/03048.html
> > >
> > > Nope, no bells.
> > >
> > > The only thing I can suggest is that you try a bisection.
> > >
> > > -Mike
> >
> > OK, so I finally managed to bisect the issue down to the following commit. Not much that I can say about it. Someone else suggested that it might all be a question of timing. Might very well be. I will try it out on a system with a different SCSI/RAID controller. The failing system has an "Smart Array 6i" (cciss). "cciss", "ext3" and "jbd" are all modules coming from initrd.
> >
> > |commit 1120f8b8169fb2cb51219d326892d963e762edb6
> > |Author: Stephen Hemminger <shemminger@xxxxxxxxxx>
> > |Date: Thu Dec 18 09:17:16 2008 -0800
> > |
> > | PCI: handle long delays in VPD access
> > |
> > | Accessing the VPD area can take a long time. The existing
> > | VPD access code fails consistently on my hardware. There are comments
> > |
> > | Change the access routines to:
> > | * use a mutex rather than spinning with IRQ's disabled and lock held
> > | * have a much longer timeout
> > | * call cond_resched while spinning
> > |
> > | Signed-off-by: Stephen Hemminger <shemminger@xxxxxxxxxx>
> > | Reviewed-by: Matthew Wilcox <willy@xxxxxxxxxxxxxxx>
> > | Signed-off-by: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> >
>
> <hello, any maintainers out there?>

This is the first I've seen of this report ...

> So afaict what's happening is that the above change caused one of your
> PCI devices to take a very long time to initialise, yes? Was it the
> CCISS driver?
>
> If you add "printk.time=y" to the kernel boot command line then you'll
> get timestamped boot messages which will make it easier to determine
> where the time was consumed. Adding "initcall_debug" to the boot line
> will help us delve further into the delay, assuming that the offending
> driver is built into vmlinux (which it might not be).
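
Once the log carries the "[seconds.microseconds]" prefix that printk.time=y
adds, the largest gaps between consecutive messages show where the boot
stalled. A minimal sketch of that search, assuming the log has been saved
as text; the function name largest_gaps and its parameters are illustrative
only and not part of any kernel tool:

```python
import re

# Matches the "[   12.345678] " prefix that printk.time=y adds to each line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gaps(lines, top=5):
    """Return up to `top` (gap_seconds, message) pairs, largest gap first.

    Each gap is the time between a message and the timestamped message
    before it, so a long gap points at whatever ran just before the
    message that is reported.
    """
    gaps = []
    prev = None
    for line in lines:
        m = STAMP.match(line)
        if not m:
            continue  # skip lines that carry no timestamp
        t = float(m.group(1))
        if prev is not None:
            gaps.append((t - prev, m.group(2)))
        prev = t
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]
```

Passing the lines of a saved dmesg capture to largest_gaps() lists the
messages that were preceded by the longest pauses. It only sees what was
logged, so a stall inside a module that prints nothing shows up against the
next message from whatever ran afterwards.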

The two message logs posted show NTP starting up within a second of
each other. What was the problem again?

--
Matthew Wilcox Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."