Re: [PATCH v2 0/6] x86: Clean up fast syscall return validation

From: Linus Torvalds
Date: Sat Oct 07 2023 - 14:07:34 EST


On Sat, 7 Oct 2023 at 02:57, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
>
> Please note that 8-bit movb instruction in fact represents insert into
> word-mode register. The compiler does not know how this word-mode
> register will be used, so to avoid partial register stalls, it takes a
> cautious approach and (with -O2) moves constant to a register with a
> word-width instruction.

Yes. In this case I really think it's our kernel code that is disgusting.

People seem to think that because cpu_feature_enabled() uses static
jumps, it's somehow "free". Not so. The static jumps are often
horrendous for code quality, and they effectively make the compiler
blind to otherwise obvious optimizations.

We need to stop using those so blindly. I see code like this

return pgtable_l5_enabled() ? PT64_ROOT_5LEVEL : PT64_ROOT_4LEVEL;

and it just makes me sad. And yes, that virtual mask is disgusting. This:

#define __VIRTUAL_MASK_SHIFT (pgtable_l5_enabled() ? 56 : 47)

is *NOT* cheap. The end result is *not* a constant, it's a two-way
branch, and while the branch is static and the compiler may then
optimize each side of the branch with the constant in question, most
of the time it just results in *disgusting* code like this where gcc
just has to load a constant into a register, and treat it as a
variable.

With the code shuffling, it's possibly *more* expensive than just
loading from memory, in that you just change a D$ load into an I$ one
instead. I'm sure that looks great on benchmarks that stay entirely in
the I$, but ...

Something like using alternatives is *much* cheaper. Better yet is the
suggestion to simplify the conditionals entirely to not depend on the
virtual shift at all (ie our recent access_ok() simplifications that
were due to _another_ CPU feature)

> Also, the compiler is quite eager to CSE constants. When there are two
> or more uses of the same constant, it will move a constant into the
> register first.

Now, honestly, that may be a good thing for other architectures that
have issues with constants, but on x86 - and integer constants - it's
likely a bad thing unless the constant is particularly complicated.

Most x86 instructions will take the usual 8/32-bit constants (but if
we have arbitrary 64-bit ones, you tend to need them in registers).

Linus