Re: [PATCH] Use x86 SSE instructions for clear_page, copy_page

From: Andrey Panin
Date: Tue Aug 17 2004 - 03:13:33 EST


On Tue, Aug 17, 2004 at 09:27:51AM +0200, Arjan van de Ven wrote:
> On Tue, 2004-08-17 at 08:13, Jens Maurer wrote:
> > The attached patch (against kernel 2.6.8.1) enables using SSE
> > instructions for copy_page and clear_page.
> >
> > A user-space test on my Pentium III 850 MHz shows a 3x speedup for
> > clear_page (compared to the default "rep stosl"), and a 50% speedup
> > for copy_page (compared to the default "rep movsl"). For a Pentium-4,
> > the speedup is about 50% in both the clear_page and copy_page cases.
>
>
> we used to have code like this in 2.4 but it got removed: the
> non-temporal store code is faster in a microbenchmark but has the
> fundamental problem that it evicts the data from the cpu cache; the
> actual USE of the data thus is a LOT more expensive, result is that the
> overall system performance goes down ;(

Did the SSE clear_page() suffer from this issue too?
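
For reference, here is a minimal user-space sketch of the two ways of zeroing
a page that are being compared. It is not the code from the patch or from 2.4;
the function names, the PAGE_SIZE_BYTES constant and the 4 KiB page size are
assumptions made for illustration. The non-temporal variant uses the SSE
intrinsic _mm_stream_ps (the movntps instruction), which writes through
write-combining buffers and does not leave the zeroed lines in the cache, so a
later read of the page is likely to miss. On builds without SSE the sketch
falls back to memset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Assumed page size: 4 KiB, as on 32-bit x86. */
#define PAGE_SIZE_BYTES 4096

/* Baseline: ordinary stores, so the zeroed page ends up in the cache. */
static void clear_page_cached(void *page)
{
    memset(page, 0, PAGE_SIZE_BYTES);
}

/* Non-temporal variant (hypothetical name): streaming stores that
 * bypass the cache. The page must be 16-byte aligned. */
static void clear_page_nt(void *page)
{
    assert(((uintptr_t)page & 15) == 0);
#if defined(__SSE__)
    float *p = (float *)page;
    const __m128 zero = _mm_setzero_ps();

    /* Each _mm_stream_ps writes 16 bytes (four floats). */
    for (size_t i = 0; i < PAGE_SIZE_BYTES / sizeof(float); i += 4)
        _mm_stream_ps(p + i, zero);

    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();
#else
    /* No SSE available: fall back to ordinary stores. */
    memset(page, 0, PAGE_SIZE_BYTES);
#endif
}
```

Timing only the clear itself would favour clear_page_nt(); a benchmark that
also touches the page right after clearing it is closer to the case described
in the quoted text.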

--
Andrey Panin | Linux and UNIX system administrator
pazke@xxxxxxxxx | PGP key: wwwkeys.pgp.net
