Re: [patch v2 4/4] taskstats: Export "cdata_wait" CPU times withtaskstats

From: Michael Holzheu
Date: Tue Dec 07 2010 - 05:45:20 EST


On Wed, 2010-12-01 at 19:51 +0100, Oleg Nesterov wrote:
> On 11/29, Michael Holzheu wrote:
> >
> > * I left the struct signal locking in the patch, because I think
> > that in order to get consistent data this is necessary.
>
> OK, I guess you mean that we want to read utime/stime "atomically",
> and thus we need ->siglock to prevent the race with __exit_signal().

Yes.

But I changed my mind. If I understand the code right, currently no
locking is done for reading the tasks statistics in both the taskstats
and also in the procfs interface. So e.g. it is not guaranteed to get a
consistent set of data when reading /proc/<pid>/stat, because of the
race with account_user/system/whatever_time() and other accounting
functions.

I assume that has been made by intention. Or am I missing something?

Because you said that since 2.6.35 accessing the signal_struct is save,
I think now also for cdata it is not necessary to lock anything. As a
result of this and the discussion regarding the group leader and cdata I
would propose the following patch:
---
Subject: taskstats: Export "cdata_wait" CPU times with taskstats

From: Michael Holzheu <holzheu@xxxxxxxxxxxxxxxxxx>

Version 3
---------
* Remove signal struct locking because accessing signal_struct is save
since 2.6.35.
* Report cdata for all tasks not only for the thread group leader.
This is then the same behavior as for /proc/<tgid>/tasks/<tid>/stat.
* Add tgid to taskstats to allow userspace to group threads.

Version 2
---------
* Use thread_group_leader() instead of (tsk->pid == tsk->tgid)
* Use cdata_wait instead of cdata_acct
* I left the struct signal locking in the patch, because I think
that in order to get consistent data this is necessary. See also
do_task_stat() in fs/proc/array.c. One problem is that we
report wrong values (zero) for cdata, if lock_task_sighand()
fails. This will be the same behavior as for /proc/<pid>/stat.
But maybe we should somehow return to userspace that the
information is not correct. Any ideas?

Description
-----------
With this patch the cumulative CPU time and the tgid (thread group ID)
is added to "struct taskstats".

Signed-off-by: Michael Holzheu <holzheu@xxxxxxxxxxxxxxxxxx>
---
include/linux/taskstats.h | 10 +++++++++-
kernel/tsacct.c | 3 +++
2 files changed, 12 insertions(+), 1 deletion(-)

--- a/include/linux/taskstats.h
+++ b/include/linux/taskstats.h
@@ -33,7 +33,7 @@
*/


-#define TASKSTATS_VERSION 7
+#define TASKSTATS_VERSION 8
#define TS_COMM_LEN 32 /* should be >= TASK_COMM_LEN
* in linux/sched.h */

@@ -163,6 +163,14 @@ struct taskstats {
/* Delay waiting for memory reclaim */
__u64 freepages_count;
__u64 freepages_delay_total;
+ /* version 7 ends here */
+
+ __u32 ac_tgid; /* Thread group ID */
+
+ /* Cumulative CPU time of dead children */
+ __u64 ac_cutime; /* User CPU time [usec] */
+ __u64 ac_cstime; /* System CPU time [usec] */
+ /* version 8 ends here */
};


--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -71,6 +71,9 @@ void bacct_add_tsk(struct taskstats *sta
stats->ac_majflt = tsk->maj_flt;

strncpy(stats->ac_comm, tsk->comm, sizeof(stats->ac_comm));
+ stats->ac_cutime = cputime_to_usecs(tsk->signal->cdata_wait.utime);
+ stats->ac_cstime = cputime_to_usecs(tsk->signal->cdata_wait.stime);
+ stats->ac_tgid = tsk->tgid;
}




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/