Re: Cache coherency... and locking

From: Peter Rival (frival@zk3.dec.com)
Date: Fri Jul 28 2000 - 09:10:04 EST


Oliver Xymoron wrote:

> On Fri, 28 Jul 2000, Peter Rival wrote:
>
> > Oliver Xymoron wrote:
> > >
> > > That's specific to MIPS. A general Linux NUMA model probably won't assume
> > > coherence.
> > >
> >
> > Why are we not going to assume coherence? What NUMA platforms do we intend
> > to support in the relevant future that are not CC-NUMA? AFAIK, the only NUMA
> > platforms we support so far are the SGI Origin 2X00 series (CC-NUMA) and the
> > Compaq AlphaServer GS* series (also CC-NUMA).
>
> Because cache coherence is both complex and expensive and treating a NUMA
> as a tight cluster rather than a single machine (as you mention) is
> probably a saner architecture.
>
> CC is primarily a big deal for user space, for apps pretending they're on
> SMP. We don't want to encourage this model as it's fundamentally a bad
> match for the hardware.
>
> As for the kernel, supporting the two models should be relatively easy
> compared to something like adding SMP support in the first place. We are
> already paying close attention to memory-sharing assumptions for
> performance reasons and the number of inter-processor primitives is small.
>
> > I just don't want generic code to have to pay too heavy a price for future
> > support of a platform we never support. And truth be told, at that point
> > we'd be better off just running with Larry's idea of multiple
> > semi-independent operating systems and handling the lack of cache coherence
> > in a sane and well-defined way.
>
> I think this even makes sense on machines with CC hardware. The break-even
> number of processors might be larger, but it's smaller than a medium-sized
> NUMA machine.
>

Agreed that it can make sense on CC-NUMA systems. I'm just not convinced that our
optimal starting place is waiting until we have all of our "tight cluster" ducks
(penguins? ;) in a row. You can make a significant impact on performance on
CC-NUMA systems merely by acknowledging that not all memory is created equal and
enabling the vm and scheduling subsystems to act on that knowledge. You help user
space even more by opening this knowledge up to it with a simple API that lets
those that really care tweak it a bit.
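
To make "not all memory is created equal" concrete, here is a minimal sketch of a
locality-aware page pick. It is a toy, not the kernel's zoned allocator: the names
(struct numa_node, node_distance, alloc_page_near, MAX_NODES) and the distance
values are all hypothetical, chosen only to show the idea of preferring the
cheapest node that still has free memory.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4  /* hypothetical: small fixed node count for the sketch */

/* Hypothetical per-node state: only a free page counter. */
struct numa_node {
    size_t free_pages;
};

/*
 * Hypothetical relative access cost from node i to node j.
 * 10 means local; larger means farther away. Values are illustrative.
 */
static const int node_distance[MAX_NODES][MAX_NODES] = {
    {10, 20, 30, 30},
    {20, 10, 30, 30},
    {30, 30, 10, 20},
    {30, 30, 20, 10},
};

/*
 * Pick the cheapest node, as seen from `home`, that still has a free
 * page. Take one page from it and return its index, or -1 if every
 * node is empty.
 */
int alloc_page_near(struct numa_node nodes[MAX_NODES], int home)
{
    int best = -1;

    assert(home >= 0 && home < MAX_NODES);

    for (int n = 0; n < MAX_NODES; n++) {
        if (nodes[n].free_pages == 0)
            continue;  /* nothing to hand out on this node */
        if (best < 0 || node_distance[home][n] < node_distance[home][best])
            best = n;
    }

    if (best >= 0)
        nodes[best].free_pages--;
    return best;
}
```

A caller running on node 0 gets local pages until node 0 runs dry, then falls back
to the next-cheapest node rather than to whichever node happens to come first. A
scheduler could consult the same distance table when deciding where to migrate a
task.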

Believe me, I do think that eventually the "tight cluster" architecture will be
the way to go, but I also understand that we have very, _very_ fast systems (both
ours and SGI's) that will be getting screwed until we have that architecture in
place if we don't at least try simple NUMA enablers. And really, trust me - doing
_something_ makes a _huge_ difference on NUMA hardware, and it doesn't even have
to be that big. I know, I watch it every day, and I see what happens when people
blindly assume that SMP scalability==NUMA scalability.

So what am I saying? Let's at least support CC-NUMA systems with a little
intelligence (i.e. the zoned allocator, given some better muscle-enhancing drugs)
and work full-bore on the "tight cluster" architecture. When that's ready, we
won't have to care (for the most part) anymore. But that's a long way off...

 - Pete
