[RFC PATCH 2/2] tracing: add sched_set_prio tracepoint

From: Julien Desfossez
Date: Fri May 27 2016 - 11:26:54 EST


This tracepoint tracks every priority change made by any site that can
modify this value. The affected call paths are sched_setscheduler,
sched_setattr, sched_process_fork and set_user_nice. The priority
inheritance mechanism of rt_mutex is also instrumented with this
tracepoint, even though it already has a dedicated tracepoint
(sched_pi_setprio).

This allows analysis of real-time scheduling delays per thread priority,
which cannot be performed accurately if we only trace the priority of
the currently scheduled processes.

Here is an example of the ftrace output when the priority of a running
process is changed:
  sys_sched_setscheduler(pid: 1c52, policy: 2, param: 7ffc22e20980)
  sched_set_prio: comm=burnP6 pid=7250 oldprio=120 newprio=39
  sys_sched_setscheduler -> 0x0
  sched_switch: prev_comm=chrt prev_pid=7268 prev_prio=120
      prev_state=R ==> next_comm=burnP6 next_pid=7250
      next_prio=39
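
For readers who want to post-process such a trace in user space, the
following is a minimal sketch (not part of this patch) of how a
sched_set_prio line in the format shown above could be parsed. The
struct prio_event type and the parse_set_prio() helper are hypothetical
names introduced only for illustration, and the sketch assumes that comm
contains no whitespace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding the fields of one sched_set_prio event. */
struct prio_event {
	char comm[16];	/* task name; the kernel's TASK_COMM_LEN is 16 */
	int pid;
	int oldprio;
	int newprio;
};

/*
 * Parse one line of trace text (hypothetical helper).
 * Returns 1 if the line carries a sched_set_prio event in the
 * "comm=... pid=... oldprio=... newprio=..." format, 0 otherwise.
 * Assumption: comm has no embedded whitespace, since %15s stops
 * at the first blank.
 */
static int parse_set_prio(const char *line, struct prio_event *ev)
{
	static const char tag[] = "sched_set_prio: ";
	const char *p = strstr(line, tag);

	if (!p)
		return 0;
	p += strlen(tag);

	return sscanf(p, "comm=%15s pid=%d oldprio=%d newprio=%d",
		      ev->comm, &ev->pid, &ev->oldprio, &ev->newprio) == 4;
}
```

An analysis tool could call such a helper on every line and keep the
last newprio seen for each pid, so that the priority of a task is known
even while the task is not scheduled.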

Signed-off-by: Julien Desfossez <jdesfossez@xxxxxxxxxxxx>
---
include/trace/events/sched.h | 21 ++++++++++++++++-----
kernel/sched/core.c | 1 +
2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 9b90c57..3b83ddb 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -407,11 +407,7 @@ DEFINE_EVENT(sched_stat_runtime, sched_stat_runtime,
TP_PROTO(struct task_struct *tsk, u64 runtime, u64 vruntime),
TP_ARGS(tsk, runtime, vruntime));

-/*
- * Tracepoint for showing priority inheritance modifying a tasks
- * priority.
- */
-TRACE_EVENT(sched_pi_setprio,
+DECLARE_EVENT_CLASS(sched_prio_template,

TP_PROTO(struct task_struct *tsk, int newprio),

@@ -436,6 +432,21 @@ TRACE_EVENT(sched_pi_setprio,
__entry->oldprio, __entry->newprio)
);

+/*
+ * Tracepoint for showing priority inheritance modifying a tasks
+ * priority.
+ */
+DEFINE_EVENT(sched_prio_template, sched_pi_setprio,
+ TP_PROTO(struct task_struct *tsk, int newprio),
+ TP_ARGS(tsk, newprio));
+
+/*
+ * Tracepoint for priority changes of a task.
+ */
+DEFINE_EVENT(sched_prio_template, sched_set_prio,
+ TP_PROTO(struct task_struct *tsk, int newprio),
+ TP_ARGS(tsk, newprio));
+
#ifdef CONFIG_DETECT_HUNG_TASK
TRACE_EVENT(sched_process_hang,
TP_PROTO(struct task_struct *tsk),
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6946b8f..45fbaab 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2232,6 +2232,7 @@ int sysctl_schedstats(struct ctl_table *table, int write,

static void sched_set_prio(struct task_struct *p, int prio)
{
+ trace_sched_set_prio(p, prio);
p->prio = prio;
}

--
1.9.1