Re: [PATCH][RESEND] Export per-tid and per-tgid cputime in nanoseconds.

From: Divyesh Shah
Date: Mon Mar 22 2010 - 14:57:42 EST


On Mon, Mar 1, 2010 at 10:05 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, 2010-03-01 at 07:44 -0800, Divyesh Shah wrote:
>> On Sat, Feb 27, 2010 at 3:43 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> >
>> > On Fri, 2010-02-26 at 18:03 -0800, Divyesh Shah wrote:
>> > > This can be used by applications to get finer granularity cputime usage on
>> > > platforms that use timestamp counters or HPET.
>> >
>> > I guess the patch looks good, I'm just not sure what HPET got to do with
>> > anything.. the scheduler certainly doesn't use HPET for timekeeping, its
>> > terribly slow to read.
>>
>> Yes you're right. Please ignore the HPET comment.
>>
>> >
>> > Also, it would be good to get some more justification than 'some
>> > applications can use this', which is basically a truism for any patch
>> > that adds a user interface.
>>
>> 1) This should be useful for a shared or virtualized environment where
>> the shared management infrastructure needs to charge each application
>> as accurately as possible for cpu usage.
>> 2) An application that services different users and does some
>> cpu-intensive work may want to measure cpu time spent for each user by
>> the serving thread.
>>
>> I think applications like web servers, remote database servers, etc.
>> fit into the second category.
>>
>> For units of work smaller than a jiffy, this really helps as some
>> threads could potentially hide from the jiffy based accounting.
>>
>> Please let me know if you want me to send the patch again with the
>> corrected description and added justification.
>
>
> Its all should and may, anything concrete?

Both use cases above are real at Google: we already use this interface
for the shared environment case and plan to use it for the second. So
they are indeed concrete use cases. It should also be useful to other
users in the Linux community with similar scenarios.

>
> Also, doesn't something like schedstat, task_delay_accounting, cpuacct-cgroup
> or some other stuff expose this number already?

I took a look at /proc/<pid>/schedstat. It does export this number
along with some other info, but it does not aggregate across threads
when the pid is a thread group leader, and it is less accurate than
cputime_ns because it does not include the cputime accumulated since
the last tick. I have a patch that adds both of these, but I am not
sure whether it would affect any userspace tools that depend on the
semantics of the existing interface. Please advise whether it makes
sense to change the schedstat interface or add this one.
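
To illustrate the aggregation gap, here is a minimal userspace sketch of
what a tool has to do today to get a per-tgid number from schedstat: walk
/proc/<pid>/task/ and sum the first field (on-cpu time in nanoseconds) of
each thread's schedstat file. The function names and the proc_root
parameter are made up for this example and are not part of the patch. It
only sees threads that are still alive, and it inherits the tick
granularity described above, so treat it as an approximation.

```python
import os


def parse_schedstat(text):
    """Parse one schedstat line into (run_ns, wait_ns, timeslices)."""
    run_ns, wait_ns, timeslices = (int(field) for field in text.split()[:3])
    return run_ns, wait_ns, timeslices


def thread_group_run_ns(pid, proc_root="/proc"):
    """Sum on-cpu nanoseconds over all live threads of a thread group.

    proc_root is a hypothetical knob for this sketch so the function can
    be pointed at a fake directory tree instead of the real /proc.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    total = 0
    for tid in os.listdir(task_dir):
        path = os.path.join(task_dir, tid, "schedstat")
        try:
            with open(path) as f:
                total += parse_schedstat(f.read())[0]
        except (FileNotFoundError, ProcessLookupError):
            # The thread exited between listdir() and open(); its time
            # is simply missing from this sum.
            continue
    return total
```

Calling thread_group_run_ns(os.getpid()) on a kernel that exposes
schedstat returns the summed value for the calling process.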

>
>