Re: [PATCH] mm/shmem: set default tmpfs size according to memcg limit

From: Yafang Shao
Date: Mon Nov 20 2017 - 08:05:12 EST


2017-11-20 20:39 GMT+08:00 Michal Hocko <mhocko@xxxxxxxxxx>:
> On Mon 20-11-17 20:16:15, Yafang Shao wrote:
>> 2017-11-20 20:04 GMT+08:00 Michal Hocko <mhocko@xxxxxxxxxx>:
>> > On Fri 17-11-17 09:49:54, Shakeel Butt wrote:
>> >> On Fri, Nov 17, 2017 at 9:41 AM, Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
>> > [...]
>> >> > Of course that is the best way.
>> >> > But we cannot ensure that all applications will do it.
>> >> > That's why I introduce a proper default value for them.
>> >> >
>> >>
>> >> I think we disagree on how to get a proper default value. Unless you
>> >> can restrict that all the memory allocated for a tmpfs mount will be
>> >> charged to a specific memcg, you should not just pick limit of the
>> >> memcg of the process mounting the tmpfs to set the default of tmpfs
>> >> mount. If you can restrict tmpfs charging to a specific memcg then the
>> >> limit of that memcg should be used to set the default of the tmpfs
>> >> mount. However this feature is not present in the upstream kernel at
>> >> the moment (We have this feature in our local kernel and I am planning
>> >> to upstream that).
>> >
>> > I think the whole problem is that containers pretend to be independent
>> > while they share a non-reclaimable resource. Fix this and you will not
>> > have a problem. I am afraid that the only real fix is to make tmpfs
>> > private per container instance and that is something you can easily
>> > achieve in the userspace.
>> >
>>
>> Agree with you.
>
> I suspect you misunderstood...
>
>> Introduce tmpfs stat in memory cgroup, something like
>> memory.tmpfs.limit
>> memory.tmpfs.usage
>>
>> IMHO this is the best solution.
>
> No, you misunderstood. I do not think that we want to split tmpfs out of
> the regular limit. We used to have something like that for user vs.
> kernel memory accounting in v1 and that turned out not to work well.
>

Understood.
That really doesn't work well.

> What you really want to do is to make a private mount per container to
> ensure that the resource is really _yours_

That is what I'm doing currently.
With a private mount, setting the default size according to the container
memory limit works well.
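
A rough userspace sketch of that sizing, for illustration only. It assumes
cgroup v1 with the memory controller visible at the path below, and it
assumes the size should be half of the limit, mirroring the tmpfs default
of half of physical RAM. The helper names and the path are hypothetical,
not part of the patch.

```python
import os

# Assumption: cgroup v1 memory controller as seen from inside the container.
MEMCG_LIMIT_PATH = "/sys/fs/cgroup/memory/memory.limit_in_bytes"


def read_memcg_limit(path=MEMCG_LIMIT_PATH):
    """Read the memcg hard limit in bytes from a cgroup v1 control file."""
    with open(path) as f:
        return int(f.read().strip())


def physical_ram_bytes():
    """Total physical RAM in bytes, as reported by sysconf."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def tmpfs_size_option(limit_bytes, phys_ram_bytes, divisor=2):
    """Build a tmpfs 'size=' mount option from a memcg limit.

    An unlimited memcg reports a very large value, so the limit is capped
    by physical RAM before dividing. divisor=2 is an assumption that
    mirrors the tmpfs default of half of RAM.
    """
    effective = min(limit_bytes, phys_ram_bytes)
    return "size=%d" % (effective // divisor)
```

The resulting string would be passed when creating the private mount, e.g.
mount -t tmpfs -o size=536870912 tmpfs <mountpoint> for a container limited
to 1 GiB.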

Thanks
Yafang