Re: [PATCH 4/5] perf_events: add cgroup support (v7)

From: Stephane Eranian
Date: Wed Jan 05 2011 - 03:32:45 EST


Li,

Thanks for your comment. Good catch on perf_detach_cgroup().
I will resubmit this patch (part 4/5).

On Wed, Jan 5, 2011 at 4:13 AM, Li Zefan <lizf@xxxxxxxxxxxxxx> wrote:
> Stephane Eranian wrote:
>> This kernel patch adds the ability to filter monitoring based on
>> container groups (cgroups). This is for use in per-cpu mode only.
>>
>> The cgroup to monitor is passed as a file descriptor in the pid
>> argument to the syscall. The file descriptor must be opened to
>> the cgroup name in the cgroup filesystem. For instance, if the
>> cgroup name is foo and cgroupfs is mounted in /cgroup, then the
>> file descriptor is opened to /cgroup/foo. Cgroup mode is
>> activated by passing PERF_FLAG_PID_CGROUP in the flags argument
>> to the syscall.
>>
>> For instance to measure in cgroup foo on CPU1 assuming
>> cgroupfs is mounted under /cgroup:
>>
>> struct perf_event_attr attr;
>> int cgroup_fd, fd;
>>
>> cgroup_fd = open("/cgroup/foo", O_RDONLY);
>> fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP);
>> close(cgroup_fd);
>>
>> Signed-off-by: Stephane Eranian <eranian@xxxxxxxxxx>
>
> For the cgroup part:
>
> Acked-by: Li Zefan <lizf@xxxxxxxxxxxxxx>
>
> a few comments below:
>
>>
> ...
>> +config CGROUP_PERF
>> +       bool "Enable perf_event per-cpu per-container group (cgroup) monitoring"
>> +       depends on PERF_EVENTS && CGROUPS
>
> Depending on CGROUPS is already implied by "if CGROUPS"
>
>> +       help
>> +         This option extends the per-cpu mode to restrict monitoring to
>> +         threads which belong to the cgroup specified and run on the
>> +         designated cpu.
>> +
>> +         Say N if unsure.
>> +
>>  menuconfig CGROUP_SCHED
>>         bool "Group CPU scheduler"
>>         depends on EXPERIMENTAL
> ...
>> +static inline int perf_cgroup_connect(int fd, struct perf_event *event,
>> +                                      struct perf_event_attr *attr,
>> +                                      struct perf_event *group_leader)
>> +{
>> +       struct perf_cgroup *cgrp;
>> +       struct cgroup_subsys_state *css;
>> +       struct file *file;
>> +       int ret = 0, fput_needed;
>> +
>> +       file = fget_light(fd, &fput_needed);
>> +       if (!file)
>> +               return -EBADF;
>> +
>> +       css = cgroup_css_from_dir(file, perf_subsys_id);
>> +       if (IS_ERR(css))
>> +               return PTR_ERR(css);
>> +
>> +       cgrp = container_of(css, struct perf_cgroup, css);
>> +       event->cgrp = cgrp;
>> +
>> +       /*
>> +        * all events in a group must monitor
>> +        * the same cgroup because a thread belongs
>> +        * to only one perf cgroup at a time
>> +        */
>> +       if (group_leader && group_leader->cgrp != cgrp) {
>> +               perf_detach_cgroup(event);
>
> This doesn't seem right, because we haven't got the cgroup
> refcount.
>
>> +               ret = -EINVAL;
>> +       } else {
>> +               /* must be done before we fput() the file */
>> +               perf_get_cgroup(event);
>> +       }
>> +       fput_light(file, fput_needed);
>> + Â Â return ret;
>> +}
>
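
To illustrate the refcount problem pointed out above, here is a small
user-space model. It is only a sketch, not the kernel code and not
necessarily the form the resubmitted patch will take. All names
(model_cgroup, model_event, connect_as_posted, connect_checked_first)
are hypothetical, and it assumes that perf_detach_cgroup() drops one
reference and clears event->cgrp.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel objects. */
struct model_cgroup { int refcount; };
struct model_event  { struct model_cgroup *cgrp; };

/* Models perf_get_cgroup(): take one reference. */
static void model_get_cgroup(struct model_event *event)
{
	event->cgrp->refcount++;
}

/* Models perf_detach_cgroup() (assumed): drop one reference, clear pointer. */
static void model_detach_cgroup(struct model_event *event)
{
	event->cgrp->refcount--;
	event->cgrp = NULL;
}

/* Ordering as in the quoted hunk: detach on mismatch before any get. */
static int connect_as_posted(struct model_event *event,
			     struct model_cgroup *cgrp,
			     const struct model_event *leader)
{
	event->cgrp = cgrp;
	if (leader && leader->cgrp != cgrp) {
		model_detach_cgroup(event); /* drops a reference never taken */
		return -EINVAL;
	}
	model_get_cgroup(event);
	return 0;
}

/* One possible reordering: validate first, attach and get only on success. */
static int connect_checked_first(struct model_event *event,
				 struct model_cgroup *cgrp,
				 const struct model_event *leader)
{
	if (leader && leader->cgrp != cgrp)
		return -EINVAL;
	event->cgrp = cgrp;
	model_get_cgroup(event);
	return 0;
}

/* Small self-check showing the difference between the two orderings. */
static void model_demo(void)
{
	struct model_cgroup foo = { 1 }, bar = { 1 };
	struct model_event leader = { &foo };
	struct model_event ev = { NULL };

	/* Mismatch with the posted ordering: bar loses a reference. */
	assert(connect_as_posted(&ev, &bar, &leader) == -EINVAL);
	assert(bar.refcount == 0);

	/* Mismatch with the check done first: refcount is untouched. */
	bar.refcount = 1;
	assert(connect_checked_first(&ev, &bar, &leader) == -EINVAL);
	assert(bar.refcount == 1);

	/* Matching cgroup: exactly one reference is taken. */
	assert(connect_checked_first(&ev, &foo, &leader) == 0);
	assert(foo.refcount == 2);
}
```

Calling model_demo() from a test harness exercises both orderings. In
the model, the error path of the posted ordering leaves the cgroup with
one reference fewer than it started with, which is the imbalance the
review comment describes.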