One of the same group's papers did some work with this, too. Their
only interesting result was, IIRC, that the synchronization mattered
more than the cache. They found that a run-queue data structure owned
by each CPU had a slight edge; the competition for a single run
queue *did* impose a noticeable barrier.
> I get the general effect of 2 and 3 and a little bit of 4 on
> sparc64/SMP with this tiny hack I put into the idler loop:
I am a bit sheepish to ask, but can you show any rigor behind your
results? I ask because my understanding is that, at this level of
optimization, very little has been shown to be always true. Indeed,
intuition does not always grok silicon.
> if (current->need_resched != 0 ||
> ((p = init_task.next_run) != NULL &&
> (p->processor == smp_processor_id() ||
> (p->tss.flags & SPARC_FLAG_NEWCHILD) != 0)))
> schedule();
> It's a nice cheap heuristic. And actually I added it so that idlers
> didn't bang into the scheduler (grabbing locks, blowing dirty cache
> lines across the bus ping pong style, etc.) over and over when no new
> work was available. This nasty behavior is how ix86 SMP behaves
> currently until something equivalent is added to its idler loop.
Hmm.
> Later,
> David S. Miller
> davem@dm.cobaltmicro.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/