Re: [PATCH] arm64: clear_page: use stnp non-temporal instruction for performance optimizing

From: Catalin Marinas
Date: Tue Nov 16 2021 - 13:17:20 EST


On Tue, Nov 16, 2021 at 11:08:14PM +0800, Guanghui Feng wrote:
> When clear page mem, there is no need to alloc cache for storing these
> mem value.

In theory, DC ZVA is supposed to trigger write streaming mode, so that
all writes go directly to memory and avoid cache allocation.
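
For reference, a minimal user-space sketch of a DC ZVA based page clear.
This is not the kernel's clear_page.S; the function name, the 4K page
size and the memset() fallback are assumptions for illustration only.
Whether the CPU actually enters write streaming mode is implementation
specific and cannot be seen from this code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, not taken from the patch */

/* Zero one page. 'page' must be SKETCH_PAGE_SIZE-aligned Normal memory. */
static void clear_page_sketch(void *page)
{
#if defined(__aarch64__)
	uint64_t dczid;

	/*
	 * DCZID_EL0: bit 4 (DZP) set means DC ZVA is prohibited,
	 * bits [3:0] (BS) hold log2 of the block size in 4-byte words.
	 */
	__asm__ volatile("mrs %0, dczid_el0" : "=r"(dczid));
	if (!(dczid & 0x10)) {
		size_t block = (size_t)4 << (dczid & 0xf);
		char *p = page;
		char *end = p + SKETCH_PAGE_SIZE;

		/* Each DC ZVA zeroes one naturally aligned block. */
		for (; p < end; p += block)
			__asm__ volatile("dc zva, %0" : : "r"(p) : "memory");
		return;
	}
#endif
	/* Fallback for other architectures or when DC ZVA is prohibited. */
	memset(page, 0, SKETCH_PAGE_SIZE);
}
```

The block size is read at run time because it differs between
implementations, which is also why results from a single CPU may not
carry over to others.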

> And the copy_page.S have used stnp instruction for optimizing.
> So I rewrite the clear_page.S with stnp. At the same time, I have tested it
> with stnp instruction which will get about twice the performance improvement.

On which CPU implementation? Is the same improvement seen on a wider
range of CPUs?

--
Catalin