Re: [PATCHv2 0/2] remap_file_pages() decommission

From: Konstantin Khlebnikov
Date: Mon May 12 2014 - 10:59:21 EST


On Mon, May 12, 2014 at 4:43 PM, Kirill A. Shutemov
<kirill@xxxxxxxxxxxxx> wrote:
> On Fri, May 09, 2014 at 08:14:08AM -0700, Linus Torvalds wrote:
>> On Fri, May 9, 2014 at 7:05 AM, Kirill A. Shutemov
>> <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
>> >
>> > Hm. I'm confused here. Do we have any limits enforced per-user?
>>
>> Sure we do. See "struct user_struct". We limit max number of
>> processes, open files, signals etc.
>>
>> > I only see things like rlimits, which are copied from the parent.
>> > Is that what you want?
>>
>> No, rlimits are per process (although in some cases what they limit
>> are counted per user despite the _limits_ of those resources then
>> being settable per thread).
>>
>> So I was just thinking that if we raise the per-mm default limits,
>> maybe we should add a global per-user limit to make it harder for a
>> user to use tons and tons of VMAs.
>
> Here's the first attempt.
>
> I'm not completely happy about current_user(). It means we rely on the
> user of the mm owner task always being the same as the user of current.
> I'm not sure that's always the case.
>
> The other option is to make MM_OWNER always on and look up the proper
> user through task_cred_xxx(rcu_dereference(mm->owner), user).
>
> From 5ee6f6dd721ada8eb66c84a91003ac1e3eb2970a Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> Date: Mon, 12 May 2014 15:13:12 +0300
> Subject: [PATCH] mm: add per-user limit on mapping count
>
> We're going to increase the per-mm map_count limit. To avoid non-obvious
> memory abuse by creating a lot of VMAs, let's introduce a per-user limit.
>
> The limit is implemented as a sysctl. For now the value of the limit is
> pretty arbitrary -- 2^20.
>
> sizeof(vm_area_struct) with my kernel config (DEBUG_KERNEL=n) is 184
> bytes. That means that with this limit a user can use up to 184 MiB of
> RAM in VMAs.
>
> The limit does not apply to root (INIT_USER).

I don't like this.

Maybe we could just account VMAs in the OOM badness points and let the
OOM killer do its job?

--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -170,7 +170,9 @@ unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
 	 * task's rss, pagetable and swap space use.
 	 */
 	points = get_mm_rss(p->mm) + atomic_long_read(&p->mm->nr_ptes) +
-		 get_mm_counter(p->mm, MM_SWAPENTS);
+		 get_mm_counter(p->mm, MM_SWAPENTS) +
+		 (long)p->mm->map_count *
+			sizeof(struct vm_area_struct) / PAGE_SIZE;
 	task_unlock(p);
 
 	/*
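
For a rough sense of scale, the term added in the diff above can be
evaluated in userspace. This is only an illustrative sketch:
vma_badness_pages() is a hypothetical helper, not a kernel function, and
the sizes passed to it (184 bytes per vm_area_struct as quoted above, 4096
bytes per page) are assumptions that depend on config and architecture.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the kernel: the number of badness
 * points (pages) that map_count VMAs would contribute, computed in the
 * same order as the proposed term: multiply first, then divide, so the
 * result is truncated toward zero.
 *
 * vma_size and page_size are supplied by the caller because the real
 * sizeof(struct vm_area_struct) and PAGE_SIZE vary between builds.
 */
static unsigned long vma_badness_pages(unsigned long map_count,
				       unsigned long vma_size,
				       unsigned long page_size)
{
	assert(page_size != 0);
	return map_count * vma_size / page_size;
}
```

With those assumed sizes, vma_badness_pages(1UL << 20, 184, 4096) gives
47104 pages, which is the 184 MiB mentioned in the quoted changelog.
Because the division truncates, a process with 22 or fewer VMAs would
gain no points at all from this term.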