[RFC PATCH] arm64/arch_timer: register arch counter early

From: Eric Chanudet
Date: Mon Aug 28 2023 - 17:13:09 EST


I was looking at how long it takes to get from primary_entry to
time_init(), where sched_clock is initialized and timestamps become
available to printk. Patching cntvct_el0 reads into a recent
linux-next (next-20230821) with the default arm64 configuration, I found
the following:
* Qualcomm RideSX4 (sa8775p-ride, 36GB): ~700ms
* Ampere Altra MtSnow (96GB): ~1210ms

Narrowing it down a bit, most of this time is spent in:
start_kernel()
setup_arch()
paging_init()
map_mem()
// Mainly in for_each_mem_range(i, &start, &end)

From time_init(), each platform reports starting the init process after:
* Qualcomm RideSX4 (sa8775p-ride, 36GB): ~1100ms
* Ampere Altra MtSnow (96GB): ~600ms
So the timestamps do not account for a relatively significant slice of
the time spent initializing the kernel.

I found a recent similar thread[1], but I would rather account for time
spent solely in the kernel while using the arch counter.

IIUC arm64 can rely on its arch counter always being present. Is it
possible, and sane, to register it with sched_clock earlier, in
setup_arch()? It would look similar to what is done for SPARC64[2].

The following patch experiments with this. It lets the counter
re-register as sched_clock later (through time_init()->timer_probe())
and does not handle errata and other relevant situations:
- errata and workarounds in arm_arch_timer,
- cntvct vs cntpct (which shouldn't make a difference at this stage?),
- device-tree overrides for the frequency (?).

Alternatively, would it make more sense to capture a counter read early
on, for example close to primary_entry after jumping into the kernel,
and use it as the epoch for sched_clock_register()? Since this happens
so early, the counter should not have had time to overflow?

[1] https://lore.kernel.org/linux-arm-kernel/CAKZGPAOYPp3ANWfBWxcsT3TJdPt8jH-f2ZJzpin=UZ=-b_-QFg@xxxxxxxxxxxxxx/
[2] https://lore.kernel.org/all/1497300108-158125-7-git-send-email-pasha.tatashin@xxxxxxxxxx/

Signed-off-by: Eric Chanudet <echanude@xxxxxxxxxx>
---
arch/arm64/kernel/setup.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 417a8a86b2db..cbc51c42c9fd 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -32,6 +32,7 @@
#include <linux/sched/task.h>
#include <linux/scs.h>
#include <linux/mm.h>
+#include <linux/sched_clock.h>

#include <asm/acpi.h>
#include <asm/fixmap.h>
@@ -53,6 +54,9 @@
#include <asm/efi.h>
#include <asm/xen/hypervisor.h>
#include <asm/mmu_context.h>
+#include <asm/arch_timer.h>
+
+#include <clocksource/arm_arch_timer.h>

static int num_standard_resources;
static struct resource *standard_resources;
@@ -290,8 +294,23 @@ u64 cpu_logical_map(unsigned int cpu)
return __cpu_logical_map[cpu];
}

+static void __init early_sched_clock(void)
+{
+	u64 min_cycles;
+	u64 min_rollover_secs = 40ULL * 365 * 24 * 3600;
+	u32 rate;
+	int width;
+
+	rate = arch_timer_get_cntfrq();
+	min_cycles = min_rollover_secs * rate;
+	width = clamp_val(ilog2(min_cycles - 1) + 1, 56, 64);
+	sched_clock_register(__arch_counter_get_cntvct, width, rate);
+}
+
void __init __no_sanitize_address setup_arch(char **cmdline_p)
{
+	early_sched_clock();
+
setup_initial_init_mm(_stext, _etext, _edata, _end);

*cmdline_p = boot_command_line;
--
2.41.0