Re: [PATCH v2 net 5/7] net/sched: taprio: fix delayed switching to new schedule after timer expiry

From: Abdul Rahim, Faizal
Date: Wed Nov 15 2023 - 06:56:29 EST




On 9/11/2023 7:50 pm, Vladimir Oltean wrote:
On Tue, Nov 07, 2023 at 06:20:21AM -0500, Faizal Rahim wrote:
If a new GCL is triggered and the new admin base time falls before the
expiry of advance_timer (current running entry from oper),
taprio_start_sched() resets the current advance_timer expiry to the
new admin base time. However, upon expiry, advance_sched() doesn't
immediately switch to the admin schedule. It continues running entries
from the old oper schedule, and only switches to the new admin schedule
much later. Ideally, if the advance_timer is shorten to align with the
new admin base time, when the timer expires, advance_sched() should
trigger switch_schedules() at the beginning.

To resolve this issue, set the cycle_time_correction to a non-initialized
value in taprio_start_sched(). advance_sched() will use it to initiate
switch_schedules() at the beginning.

Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")

Did the commit you blame really introduce this issue, or was it your
rework to trigger switch_schedules() based on the correction?


Ohh actually this issue happens even without my whole patch set.

Signed-off-by: Faizal Rahim <faizal.abdul.rahim@xxxxxxxxxxxxxxx>
---
net/sched/sch_taprio.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index f18a5fe12f0c..01b114edec30 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1379,14 +1379,19 @@ static void setup_first_end_time(struct taprio_sched *q,
}
static void taprio_start_sched(struct Qdisc *sch,
- ktime_t start, struct sched_gate_list *new)
+ ktime_t new_base_time,
+ struct sched_gate_list *new)
{
struct taprio_sched *q = qdisc_priv(sch);
- ktime_t expires;
+ struct sched_gate_list *oper = NULL;
+ ktime_t expires, start;
if (FULL_OFFLOAD_IS_ENABLED(q->flags))
return;
+ oper = rcu_dereference_protected(q->oper_sched,
+ lockdep_is_held(&q->current_entry_lock));
+
expires = hrtimer_get_expires(&q->advance_timer);
if (expires == 0)
expires = KTIME_MAX;
@@ -1395,7 +1400,17 @@ static void taprio_start_sched(struct Qdisc *sch,
* reprogram it to the earliest one, so we change the admin
* schedule to the operational one at the right time.
*/
- start = min_t(ktime_t, start, expires);
+ start = min_t(ktime_t, new_base_time, expires);
+
+ if (expires != KTIME_MAX &&
+ ktime_compare(start, new_base_time) == 0) {
+ /* Since timer was changed to align to the new admin schedule,
+ * setting the variable below to a non-initialized value will

I find the wording "setting the variable below to a non-initialized value"
confusing. 0 is non-initialized? You're talking about a value different
than INIT_CYCLE_TIME_CORRECTION. What about "setting a specific cycle
correction will indicate ..."?


Sure

+ * indicate to advance_sched() to call switch_schedules() after
+ * this timer expires.
+ */
+ oper->cycle_time_correction = 0;

Why 0 and not ktime_sub(new_base_time, oper->cycle_end_time)? Doesn't
the precise correction value make a difference?


Negative correction and its calculation is a separate problem handled in different patch.

My intention is to highlight a specific issue and address it with a single patch. The core problem stemmed from the new admin schedule not making an immediate transition in advance_sched().

I'll rework this patch to focus specifically on resolving this problem while gradually aligning with the overall series. Importantly, I won't be removing anything from this patch in the process.

Is that okay ?

+ }
hrtimer_start(&q->advance_timer, start, HRTIMER_MODE_ABS);
}
--
2.25.1