Re: [PATCH v2 07/12] clocksource: mips-gic-timer: Always use cluster 0 counter as clocksource

From: Marc Zyngier
Date: Mon Jun 27 2022 - 10:27:54 EST


On 2022-06-27 15:17, Dragan Mladjenovic wrote:
On 25-May-22 14:10, Dragan Mladjenovic wrote:
From: Paul Burton <paulburton@xxxxxxxxxx>

In a multi-cluster MIPS system we have multiple GICs - one in each
cluster - each of which has its own independent counter. The counters in
each GIC are not synchronised in any way, so they can drift relative to
one another through the lifetime of the system. This is problematic for
a clocksource which ought to be global.

Avoid problems by always accessing cluster 0's counter, using
cross-cluster register access. This adds overhead so we only do so on
systems where we actually have CPUs present in multiple clusters.
For now, be extra conservative and don't use the GIC counter for the
vDSO or sched_clock in this case.

Signed-off-by: Paul Burton <paulburton@xxxxxxxxxx>
Signed-off-by: Chao-ying Fu <cfu@xxxxxxxxxxxx>
Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@xxxxxxxxxx>

diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
index be4175f415ba..6632d314a2c0 100644
--- a/drivers/clocksource/mips-gic-timer.c
+++ b/drivers/clocksource/mips-gic-timer.c
@@ -170,6 +170,37 @@ static u64 gic_hpt_read(struct clocksource *cs)
 	return gic_read_count();
 }

+static u64 gic_hpt_read_multicluster(struct clocksource *cs)
+{
+	unsigned int hi, hi2, lo;
+	u64 count;
+
+	mips_cm_lock_other(0, 0, 0, CM_GCR_Cx_OTHER_BLOCK_GLOBAL);
+
+	if (mips_cm_is64) {
+		count = read_gic_redir_counter();
+		goto out;
+	}
+
+	hi = read_gic_redir_counter_32h();
+	while (true) {
+		lo = read_gic_redir_counter_32l();
+
+		/* If hi didn't change then lo didn't wrap & we're done */
+		hi2 = read_gic_redir_counter_32h();
+		if (hi2 == hi)
+			break;
+
+		/* Otherwise, repeat with the latest hi value */
+		hi = hi2;
+	}
+
+	count = (((u64)hi) << 32) + lo;
+out:
+	mips_cm_unlock_other();
+	return count;
+}
+
 static struct clocksource gic_clocksource = {
 	.name = "GIC",
 	.read = gic_hpt_read,
@@ -204,6 +235,11 @@ static int __init __gic_clocksource_init(void)
 	/* Calculate a somewhat reasonable rating value. */
 	gic_clocksource.rating = 200 + gic_frequency / 10000000;

+	if (mips_cps_multicluster_cpus()) {
+		gic_clocksource.read = &gic_hpt_read_multicluster;
+		gic_clocksource.vdso_clock_mode = VDSO_CLOCKMODE_NONE;
+	}
+
 	ret = clocksource_register_hz(&gic_clocksource, gic_frequency);
 	if (ret < 0)
 		pr_warn("Unable to register clocksource\n");
@@ -262,7 +298,8 @@ static int __init gic_clocksource_of_init(struct device_node *node)
 	 * stable CPU frequency or on the platforms with CM3 and CPU frequency
 	 * change performed by the CPC core clocks divider.
 	 */
-	if (mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) {
+	if ((mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) &&
+	     !mips_cps_multicluster_cpus()) {
 		sched_clock_register(mips_cm_is64 ?
 				     gic_read_count_64 : gic_read_count_2x32,
 				     64, gic_frequency);

Hi,

I was expecting some comments on this, but I'll ask first. We are now
taking the conservative approach of not using the GIC as sched_clock
in the multi-cluster case. Is this necessary, or can sched_clock
tolerate a fixed delta between clocks on different CPU clusters?

I don't think that's wise. We generally go to all sorts of
trouble to keep sched_clock() strictly identical between CPUs,
and there are tons of things that rely on this (the scheduler
itself, but also any sort of tracing...). You just have to grep
for the various use cases.

A consequence of the above is that the kernel can (and will)
snapshot a sched_clock value, and compare it to the value on
the current CPU. Imagine what happens if the difference is
negative...
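
To make the failure mode concrete, here is a minimal userspace sketch
of the "snapshot on one CPU, compare on another" pattern. It is not
kernel code: cluster_clock_ns(), elapsed_ns() and elapsed_signed_ns()
are hypothetical names, and each cluster's counter is simply modelled
as the true time plus a fixed per-cluster offset in nanoseconds.

```c
#include <stdint.h>

/*
 * Hypothetical model, not kernel code: a cluster's clock is the
 * "true" time plus a fixed per-cluster offset, both in nanoseconds.
 */
static uint64_t cluster_clock_ns(uint64_t true_ns, uint64_t cluster_offset_ns)
{
	return true_ns + cluster_offset_ns;
}

/* The usual "now - then" pattern, done in unsigned 64-bit arithmetic. */
static uint64_t elapsed_ns(uint64_t now_ns, uint64_t snapshot_ns)
{
	return now_ns - snapshot_ns;
}

/* The same difference reinterpreted as signed, to expose its sign. */
static int64_t elapsed_signed_ns(uint64_t now_ns, uint64_t snapshot_ns)
{
	return (int64_t)(now_ns - snapshot_ns);
}
```

With a snapshot taken at true time 1000 on a cluster that runs 500 ns
ahead (reading 1500) and a comparison at true time 1200 on a cluster
with no offset (reading 1200), elapsed_signed_ns() yields -300 and
elapsed_ns() wraps to a value just below 2^64, even though 200 ns
really elapsed. When both reads use the same offset, the result is the
expected 200.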

So I don't know what the deal is with the MIPS GIC, but if any
of the above can happen, you're doomed.

M.
--
Jazz is not dead. It just smells funny...