Re: [PATCH 0/3] enable memcg accounting for kernfs objects

From: Vasily Averin
Date: Tue Aug 16 2022 - 05:29:21 EST


On 8/11/22 06:19, Vasily Averin wrote:
> On 8/9/22 20:56, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Aug 09, 2022 at 07:49:34PM +0200, Michal Koutný wrote:
>>> On Tue, Aug 09, 2022 at 07:31:31AM -1000, Tejun Heo <tj@xxxxxxxxxx> wrote:
>>>> I'm not quite sure whether following the usual "charge it to the allocating
>>>> task's cgroup" is the best way to go about it. I wonder whether it'd be
>>>> better to attach it to the new cgroup's nearest ancestor with memcg enabled.
>>>
>>> See also
>>> https://lore.kernel.org/r/YnBLge4ZQNbbxufc@blackbook/
>>> and
>>> https://lore.kernel.org/r/20220511163439.GD24172@xxxxxxxxxxxxxxxxx/
>>
>> Ah, thanks. Vasily, can you please include some summary of the discussions
>> and the rationales for the path taken in the commit message?
>
> Dear Tejun,
> thank you for the feedback, I'll do it in next patch set iteration.
>
> However, I noticed another problem in neighborhood and I planned to
> add new patches into current patch set. One of the new patches is quite simple,
> however second one is quite complex and requires some discussion.

Summing up a private discussion with Tejun, Michal and Roman:
I'm going to create a few new patches:

1) adjust active memcg for objects allocated during creation of new cgroup
This patch will take the memcg from the parent cgroup and use it for accounting
all objects allocated during creation of the new cgroup.
For that it moves the set_active_memcg() calls from mem_cgroup_css_alloc()
to cgroup_mkdir() and adds the missing infrastructure.
This makes it predictable which memcg is used for object accounting,
and should simplify debugging of possible problems and corner cases.

2) memcg: enable kernfs accounting: nodes and iattr
Already discussed and approved patches.
These objects consume a significant part of memory in various scenarios,
including new cgroup creation and new net device creation.

3) adjust active memcg for simple_xattr accounting
sysfs and tmpfs are in-memory file systems;
for extended attributes they use the simple_xattr infrastructure.
The patch forces sys_set[f]xattr calls to account the xattr object
to a predictable memcg: for kernfs the memcg will be taken from the kernfs node,
for shmem -- from shmem_info.
As in case 1), this makes it clear which memcg is used
for object accounting and simplifies debugging of possible problems.

4) memcg: enable accounting for simple_xattr: names and values
This patch enables accounting for the objects described in the previous patch.

5) simple_xattrs: replace list with rb-tree
This significantly reduces the search time for existing entries.
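
To illustrate the lookup-cost argument in 5), here is a rough userspace
sketch, not kernel code. It compares a linear walk over the entries with a
name-keyed tree lookup. struct xattr_entry and all helper names are
hypothetical. The tree uses POSIX tsearch()/tfind() as a stand-in; glibc
implements these with a red-black tree, but POSIX does not require balancing,
and the actual patch would presumably use the kernel's own rbtree helpers.

```c
#include <assert.h>
#include <search.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for an xattr entry (not the kernel struct). */
struct xattr_entry {
	const char *name;
	const char *value;
};

/* Order entries by attribute name, as a name-keyed tree would. */
static int xattr_cmp(const void *a, const void *b)
{
	const struct xattr_entry *xa = a, *xb = b;

	return strcmp(xa->name, xb->name);
}

/* List-style lookup: compares against each entry in turn, O(n). */
static const struct xattr_entry *lookup_linear(const struct xattr_entry *tab,
					       size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return &tab[i];
	return NULL;
}

/* Insert every table entry into a tsearch() tree and return the root. */
static void *build_tree(struct xattr_entry *tab, size_t n)
{
	void *root = NULL;

	for (size_t i = 0; i < n; i++) {
		void *node = tsearch(&tab[i], &root, xattr_cmp);

		assert(node != NULL);	/* NULL means the node allocation failed */
	}
	return root;
}

/* Tree-style lookup: tfind() descends a binary search tree keyed by name. */
static const struct xattr_entry *lookup_tree(void *const *root, const char *name)
{
	struct xattr_entry key = { .name = name, .value = NULL };
	void *node = tfind(&key, root, xattr_cmp);

	/* tfind() returns a pointer to the stored key pointer, or NULL. */
	return node ? *(struct xattr_entry **)node : NULL;
}
```

Both lookups return the same entry for the same name; only the number of
string comparisons differs, which matters once an inode carries many xattrs.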

Additionally, Roman Gushchin is preparing a patch
"putting the kernfs_node reference earlier in the cgroup removal process".

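The "adjust active memcg" idea from 1) and 3) can be modelled in userspace
roughly as follows. This is only a sketch of the save/override/restore
pattern, not the kernel implementation: struct memcg, the model_* helpers and
the byte counts are all made up for illustration. The one assumption taken
from the kernel is that set_active_memcg() returns the previous value so the
caller can restore it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are the kernel definitions. */
struct memcg {
	const char *name;
	size_t charged;		/* bytes accounted to this group */
};

static struct memcg *current_memcg;	/* models the allocating task's memcg */
static struct memcg *active_memcg;	/* models the override set by the caller */

/* Install an override and return the previous one so it can be restored. */
static struct memcg *model_set_active_memcg(struct memcg *memcg)
{
	struct memcg *old = active_memcg;

	active_memcg = memcg;
	return old;
}

/* Charge an accounted allocation: the override wins over the task's memcg. */
static void model_charge(size_t bytes)
{
	struct memcg *target = active_memcg ? active_memcg : current_memcg;

	assert(target != NULL);
	target->charged += bytes;
}

/* Model of creating a child cgroup: all objects go to the parent's memcg. */
static void model_cgroup_mkdir(struct memcg *parent_memcg)
{
	struct memcg *old = model_set_active_memcg(parent_memcg);

	model_charge(128);	/* e.g. a kernfs node (size is made up) */
	model_charge(64);	/* e.g. other per-cgroup objects (size is made up) */

	model_set_active_memcg(old);	/* restore the previous override */
}
```

With the override in place, the charge lands on the parent's memcg no matter
which task calls mkdir, which is the predictability the patches aim for.
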
Thank you,
Vasily Averin