Re: [PATCH v2 1/2] perf: add container identifier entry in perf sample data

From: Hari Bathini
Date: Fri Sep 02 2016 - 09:55:59 EST




On Thursday 01 September 2016 02:39 PM, Peter Zijlstra wrote:
> On Tue, Aug 30, 2016 at 09:57:02PM +0530, Hari Bathini wrote:
> > Currently, there is no mechanism to filter events based on containers.
> > perf -G can be used, but it will not filter events for the containers
> > created after perf is invoked, making it difficult to assess/analyze
> > performance issues of multiple containers at once. This limitation can
> > be overcome if there is a standard kernel identifier for containers.
> >
> > This patch introduces a container identifier entry field in perf sample
> > data to identify or distinguish sample data of different containers. It
> > uses the cgroup namespace inode number of a given task as its container
> > identifier (cid). Alternatively, inode number of pid namespace can also
> > be used as cid. This patch assumes each container is created with its
> > own cgroup namespace.

Hi Peter,

> I'm thinking this value is mostly the same for tasks, just like COMM and

I think so, too. Namespaces aren't changed that often for tasks...

> MMAP. Could we therefore not emit (sideband) events whenever a task
> changes namespace and get the same information but with tons less data?

You mean, something like PERF_RECORD_NAMESPACE that
emits events on fork, clone, setns..?

> That also gives the possibility of recording all namespaces, not just
> the one.

True. If we record all namespaces, container identifier interpretation
can be left to the userspace to decide, which is much more flexible...
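
To illustrate what leaving the interpretation to userspace could look
like, here is a minimal Python sketch (not part of the patch) that reads
the inode number of every namespace of a task from /proc/<pid>/ns and
picks one of them as the container identifier. The helper names
(parse_ns_link, namespace_inodes, container_id) are made up for this
example; the only interface assumed is the /proc/<pid>/ns symlinks, whose
targets look like cgroup:[4026531835].

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "cgroup:[4026531835]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode number)."""
    m = _NS_LINK.match(target)
    if m is None:
        raise ValueError(f"unexpected namespace link target: {target!r}")
    return m.group("kind"), int(m.group("inode"))


def namespace_inodes(pid="self"):
    """Return {link name: inode number} for all namespaces of a task.

    Returns an empty dict where /proc/<pid>/ns is not available.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result
    for name in sorted(names):
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # skip entries we cannot read or do not understand
        result[name] = inode
    return result


def container_id(pid="self", key="cgroup"):
    """Hypothetical policy: use one namespace's inode number as the cid."""
    return namespace_inodes(pid).get(key)


if __name__ == "__main__":
    for name, inode in namespace_inodes().items():
        print(f"{name:20s} {inode}")
    print("cid (cgroup ns):", container_id())
```

Passing key="pid" instead would give the pid namespace based identifier
mentioned as the alternative in the patch description.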

Thanks
Hari