Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to3.6-rc5 on AMD chipsets - bisected

From: Borislav Petkov
Date: Wed Sep 26 2012 - 13:16:52 EST


On Wed, Sep 26, 2012 at 04:23:26AM +0200, Mike Galbraith wrote:
> On Tue, 2012-09-25 at 20:42 +0200, Borislav Petkov wrote:
>
> > Right, so why did we need it all, in the first place? There has to be
> > some reason for it.
>
> Easy. Take two communicating tasks. Is an affine wakeup a good idea?
> It depends on how much execution overlap there is. Wake affine when
> there is overlap larger than cache miss cost, and you just tossed
> throughput into the bin.
>
> select_idle_sibling() was originally about shared L2, where any overlap
> was salvageable. On modern processors with no shared L2,

Oh, but we do have shared L2s in the Bulldozer uarch (a subset of the
modern AMD processors :)).

> you have to get past the cost, but the gain is still there. Intel
> wins with loads that AMD loses very bady on, so I can only guess that
> Intel must feed caches more efficiently. Dunno. It just doesn't matter
> though, point is that there is a win to be had in both cases, the
> breakeven just isn't at the same point.

Well, I guess selecting the proper core in the hierarchy depending on
the workload is one of those hard problems.

Teaching select_idle_sibling to detect the breakeven point and act
accordingly would be not that easy then...

Thanks.

--
Regards/Gruss,
Boris.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/