Re: efficient access to "rotational"; new fcntl?

From: Arjan van de Ven
Date: Sat Sep 19 2009 - 07:30:20 EST


On Sat, 19 Sep 2009 14:11:38 +0300
Avi Kivity <avi@xxxxxxxxxx> wrote:

> On 09/19/2009 12:19 PM, Arjan van de Ven wrote:
> >> However, sort *would* benefit, and some UCLA students implemented
> >> that for a term project. Unfortunately, the project is stalled
> >> because the implementation was not efficient enough, and no one
> >> has found the time to improve it since.
> >>
> > parallel sort... call me skeptical. My gut feeling is that you'll
> > get killed by communication overhead.
> > (sort seems to be more communication than raw cpu use)
> >
> >
>
> Why? a sort that fits in memory is purely cpu and memory access.

memory access == communication due to cache line bounces....

>
> Instead of O(N log N) you'd get K * O(N/K log N/K) followed by an
> O(N) merge. For large N and small K, you get a speedup of roughly K
> (since the O(N) merge is dominated by the preceding sort.

doing big-O arithmatic and then add constant divisions/multipliers is
rather.. dangerous ;)

You actually get K * C1 * O(N/K log N/K) + C2 * O(N)
where C1/C2 is the actual cost of the extra intra-cpu communication.
(and for most sorting, I suspect the communication costs dominate over
CPU cost)

I'd be happy to be proven wrong...


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/