Re: [linus:master] [iov_iter] c9eec08bac: vm-scalability.throughput -16.9% regression

From: Linus Torvalds
Date: Thu Nov 16 2023 - 11:48:47 EST


On Thu, 16 Nov 2023 at 10:44, Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> Reportedly and apparently, this pretty much addresses the issue at hand.
> However, I'd still like for the compiler to handle the small length
> cases by issuing plain MOVs instead of blindly doing "call memcpy".
>
> Lemme see how it would work with your patch...

Hmm. I know about the '-mstringop-strategy' flag because of the fairly
recently discussed bug where gcc would create a byte-by-byte copy in
some crazy circumstances with the address space attributes:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111657

But I incorrectly thought that "-mstringop-strategy=libcall" would
then *always* do library calls.

So I decided to test, and that shows that gcc still ends up doing the
"expand small constant size copies inline" even with that option, and
doesn't force library calls for those cases.
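
For anyone wanting to reproduce that, a minimal sketch of such a test
might look like the following. The file name, the struct and the function
names are made up for illustration, not the actual test case. Build it
with something like "gcc -O2 -mstringop-strategy=libcall -S copy.c" and
compare the assembly for the two functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte payload, just to get a small constant size. */
struct pair {
	uint64_t a, b;
};

/*
 * Constant 16-byte copy: per the observation above, gcc is expected
 * to expand this inline as plain moves, even with
 * -mstringop-strategy=libcall.
 */
void copy_small(struct pair *dst, const struct pair *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/*
 * Length only known at run time: this is the case where the
 * strategy flag decides between a library call and inline code.
 */
void copy_var(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}
```

The exact instructions will depend on the compiler version and on the
other flags in use, so treat the generated assembly as the authority.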

IOW, my assumption was just broken, and using
"-mstringop-strategy=libcall" may well be the right thing to do.

Of course, it's also possible that with all the function call overhead
introduced by the CPU mitigations on older CPUs, we should just say
"rep movsb" is always correct - if you have a new CPU with FSRM it's
good, and if you have an old CPU it's no worse than the horrendous CPU
mitigation overhead for function call/returns.

I really hate the mitigations. Oh well.

Anyway, maybe your patch is the RightThing(tm). Or maybe we should use
'rep_byte' instead of 'libcall'. Who knows..

Linus