[RFC][PATCH 00/34] [RFC] x86: Rework system-wide configuration masquerading as per-cpu data

From: Dave Hansen
Date: Thu Feb 22 2024 - 13:42:49 EST


tl;dr: This tries to be more regimented in how system-wide x86 processor
configuration data is initialized. It does that by moving fields out of
the per-cpu 'struct cpuinfo_x86' and into a new structure. It also
requires that the boot CPU set these data up *once* and then be left
alone.

This is a more comprehensive approach to avoid the random tinkering
in patches like these:

https://lore.kernel.org/all/20240131230902.1867092-1-pbonzini@xxxxxxxxxx/

--

'struct cpuinfo_x86' is a mess. At a high level, it can be thought
of as a CPU enumeration cache that (among other things) avoids using
CPUID constantly. There's a copy of it for each CPU, but often the
'boot_cpu_data' version is the only one that counts. Its fields are
written in basically random order, including a memcpy() into the
secondary processor copies from 'boot_cpu_data'.

This series focuses on a subset of the 'cpuinfo_x86' fields that must
be consistent across all CPUs:
* The virtual and physical sizes supported by the CPU
* The number of those physical bits that the caches comprehend
* The size of a cacheline that CLFLUSH works on
* The actual alignment that matters to the caches (can be different
from that CLFLUSH size)

There are a few random folks cc'd here because their subsystem reads
one or more of these things in an x86 #ifdef of some kind.

There's a mishmash of ways to obtain these from CPUID, plus fixups
for erratum and general processor wonkiness. There are also defaults
for these for processors that, for instance, don't even _have_ CPUID
support. I think virt/phys_bits can be set and overwritten no fewer
than five times!

Let's bring some order to the chaos.

First, create some abstractions so that there are no longer direct
explicit accesses to some 'boot_cpu_data' fields. This also provides
a way to do sanity checking so nothing consumes garbage.

Second, separate out the address configuration inputs from the
resulting values. The inputs are provided in early boot by the boot
processor and stashed in x86_addr_config.

Third, remove global, shared configuration from 'struct cpuinfo_x86'.
Put it in a new structure: 'struct x86_sys_config'.

This creates a simple set of rules:
1. The boot CPU populates 'bsp_addr_config' and nothing ever writes
to it again
2. get_cpu_address_sizes() reads 'bsp_addr_config' and writes
'x86_config'
3. 'bsp_addr_config' is never read again
4. 'x86_config' is never written again

The centralized helpers also now provide a chance of enforcing those
rules. Finish up the series by enforcing those rules and spitting out
a warning when they are violated. This warning mechanism works in
*very* early boot and is a little unusual. It could use some more
eyeballs (Subject: Enforce read-only x86_config state).

=== FAQ ===

== Why are both 'bsp_addr_config' and 'x86_config' needed? ==

Having defined, separate lifetimes helps enforce the new rules.

Once everything is in place 'bsp_addr_config' is only used at boot and
can be marked __init. Mostpost can help find users of it that leak to
after boot. Secondary CPU startup code can't be __init (because of
CPU hotplug) this helps ensure that 'bsp_addr_config' can only be
referenced from boot CPU code.

'x86_config' can also be __ro_after_init, which helps mitigate the
chance of it becoming tweaked after boot and growing into a little
mini version of 'boot_cpu_data' with all the same problems.

--

Ideas for future work (probably in another series):

* Consolidate defaults in x86_config accessors and
get_cpu_address_sizes() down to one copy
* Can we do something more aggressive with x86_clflush_size()? Is
it ever anything other than 64 bytes on 64-bit?

Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
Cc: Len Brown <lenb@xxxxxxxxxx>
Cc: Mikulas Patocka <mpatocka@xxxxxxxxxx>
Cc: Mike Snitzer <snitzer@xxxxxxxxxx>
Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: Juergen Gross <jgross@xxxxxxxx>