Re: aim7 scalability issue on 4 socket machine

From: Lee Schermerhorn
Date: Fri Sep 18 2009 - 09:15:29 EST

Next message: Tilman Schmidt: "Re: [stable] [patch 00/48] 2.6.27.32-stable review"
Previous message: Jan Beulich: "amd64_edac making improper assumptions?"
In reply to: Hugh Dickins: "Re: aim7 scalability issue on 4 socket machine"
Next in thread: Andi Kleen: "Re: aim7 scalability issue on 4 socket machine"

On Fri, 2009-09-18 at 09:12 +0200, Andi Kleen wrote:
> On Fri, Sep 18, 2009 at 07:53:58AM +0100, Hugh Dickins wrote:
> > On Thu, 17 Sep 2009, Andrew Morton wrote:
> > > On Fri, 18 Sep 2009 10:02:19 +0800 "Zhang, Yanmin" <yanmin_zhang@xxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > So, Yanmin, please retest with http://lkml.org/lkml/2009/9/13/25
> > > > > and let us know if that works as well for you - thanks.
> > > > I tested Lee's patch and it does fix the issue.
> >
> > Thanks for checking and reporting back, Yanmin.
> >
> > >
> > > Do we think we should cook up something for -stable?
> >
> > Gosh, I laughed at Lee (sorry!) for suggesting it for -stable:
> > is stable really for getting a better number out of a benchmark?
>
> When your system is large enough scalability problems (e.g.
> lock contention) can be a serious bug. i.e. when your workload
> is 150% slower than expected that can well be a show stopper.
>
> Admittedly the workload in this case was a benchmark, but it's
> not that far fetched to expect the same problem in a real application.
>
> We had a similar problem with the accounting lock some time
> ago, I think that patch also went in.
>
> So yes I think simple non intrusive fixes for serious scalability
> problems should be stable candidates.
>
> > > Either this is a regression or the workload is particularly obscure.
> >
> > I've not cross-checked descriptions, but assume Lee was actually
> > testing on exactly the same kind of upcoming Nehalem as Yanmin, and
> > that machine happens to have characteristics which show up badly here.
>
> AFAIK Lee usually tests on large IA64 boxes.

In this case, it's an x86_64 [DL785] platform--an 8 socket, 4 core
Shanghai in a glueless, "twisted ladder" config.

Lee
