Re: [PATCH v3 0/2] arch_topology: Pre-allocate cacheinfo from primary CPU

From: Pierre Gondois
Date: Tue Apr 11 2023 - 08:36:27 EST


Hello Radu,
Some additional points:
1-
For the record, Will made a comment about adding weak functions
(cf. https://lore.kernel.org/all/20230327121734.GB31342@willie-the-truck/)
but I don't see how it could be done otherwise ...

2-
The patch-set needs to be rebased on top of v6.3-rc6,
otherwise there is a merge conflict.

3-
When trying the patch-set on an ACPI platform with no PPTT, it seems that
fetch_cache_info() is not called from init_cpu_topology() because
parse_acpi_topology() returns an error code. This result in a
'sleeping function called from invalid context' message. The following made
it work for me:

--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -838,7 +838,6 @@ void __init init_cpu_topology(void)
* don't use partial information.
*/
reset_cpu_topology();
- return;
}
for_each_possible_cpu(cpu) {

With 2 and 3 addressed:
Reviewed-by: Pierre Gondois <pierre.gondois@xxxxxxx>

Also maybe wait for Sudeep to have a look before sending a v4,
Regards,
Pierre


On 4/7/23 01:39, Radu Rendec wrote:
Commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU")
tries to build the cacheinfo from the primary CPU prior to secondary
CPUs boot, if the DT/ACPI description contains cache information.
However, if such information is not present, it still reverts to the old
behavior, which allocates the cacheinfo memory on each secondary CPU. On
RT kernels, this triggers a "BUG: sleeping function called from invalid
context" because the allocation is done before preemption is first
enabled on the secondary CPU.

The solution is to add cache information to DT/ACPI, but at least on
arm64 systems this can be avoided by leveraging automatic detection
(through the CLIDR_EL1 register), which is already implemented but
currently doesn't work on RT kernels for the reason described above.

This patch series attempts to enable automatic detection for RT kernels
when no DT/ACPI cache information is available, by pre-allocating
cacheinfo memory on the primary CPU.

The first patch adds an architecture independent infrastructure that
allows architecture specific code to take an early guess at the number
of cache leaves of the secodary CPUs, while it runs in preemptible
context on the primary CPU. At the same time, it gives architecture
specific code the opportunity to go back later, while it runs on the
secondary CPU, and reallocate the cacheinfo memory if the initial guess
proves to be wrong.

The second patch leverages the infrastructure implemented in the first
patch and enables early cache depth detection for arm64.

The patch series is based on an RFC patch that was posted to the
linux-arm-kernel mailing list and discussed with a smaller audience:
https://lore.kernel.org/all/20230323224242.31142-1-rrendec@xxxxxxxxxx/

Changes to v2:
* Address minor coding style issue (unbalanced braces).
* Move cacheinfo reallocation logic from detect_cache_attributes() to a
new function to improve code readability.
* Minor fix to cacheinfo reallocation logic to avoid a new detection of
the cache level if/when detect_cache_attributes() is called again.

Radu Rendec (2):
cacheinfo: Add arch specific early level initializer
cacheinfo: Add arm64 early level initializer implementation

arch/arm64/kernel/cacheinfo.c | 32 +++++++++++----
drivers/base/cacheinfo.c | 75 +++++++++++++++++++++++++----------
include/linux/cacheinfo.h | 2 +
3 files changed, 79 insertions(+), 30 deletions(-)