Re: memory-cgroup bug

From: azurIt
Date: Fri Nov 23 2012 - 04:21:33 EST


>Either use gdb YOUR_VMLINUX and disassemble mem_cgroup_handle_oom or
>use objdump -d YOUR_VMLINUX and copy out only mem_cgroup_handle_oom
>function.
If 'YOUR_VMLINUX' is supposed to be my kernel image:

# gdb vmlinuz-3.2.34-grsec-1
GNU gdb (GDB) 7.0.1-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
"/root/bug/vmlinuz-3.2.34-grsec-1": not in executable format: File format not recognized


# objdump -d vmlinuz-3.2.34-grsec-1
objdump: vmlinuz-3.2.34-grsec-1: File format not recognized


# file vmlinuz-3.2.34-grsec-1
vmlinuz-3.2.34-grsec-1: Linux kernel x86 boot executable bzImage, version 3.2.34-grsec (root@server01) #1, RO-rootFS, swap_dev 0x3, Normal VGA

I'm probably doing something wrong :)
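
Possibly the problem is just that vmlinuz is a compressed bzImage (as file reports above) and not an ELF file, so gdb and objdump cannot read it directly. The kernel source tree ships scripts/extract-vmlinux for unpacking such images. Below is a minimal Python sketch of the same idea, assuming the payload is gzip-compressed (other compressors are possible, depending on the kernel config); the function name is made up for illustration. An image unpacked this way normally carries no symbol table, so mem_cgroup_handle_oom would probably have to be located by its address from System.map.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"   # gzip header: magic bytes + deflate method
ELF_MAGIC = b"\x7fELF"


def extract_gzip_elf(image):
    """Return the first gzip payload in `image` that unpacks to an ELF file.

    Hypothetical helper: it only handles gzip-compressed kernels and
    returns None when no such payload is found.
    """
    pos = image.find(GZIP_MAGIC)
    while pos != -1:
        try:
            # wbits=31 tells zlib to expect a gzip header; bytes after the
            # end of the compressed stream are ignored by the decompressobj.
            data = zlib.decompressobj(31).decompress(image[pos:])
            if data.startswith(ELF_MAGIC):
                return data
        except zlib.error:
            pass  # false positive on the magic bytes, keep scanning
        pos = image.find(GZIP_MAGIC, pos + 1)
    return None


# Usage (file names taken from the session above, output name is arbitrary):
#   with open("vmlinuz-3.2.34-grsec-1", "rb") as f:
#       elf = extract_gzip_elf(f.read())
#   if elf is not None:
#       with open("vmlinux", "wb") as out:
#           out.write(elf)
```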



Luckily, it happened again, so I have more info.

- there weren't any OOM messages in the kernel log for that cgroup
- there were 16 processes in the cgroup
- the processes in the cgroup were together taking 100% of CPU (the cgroup was allowed to use only one core, so 100% of that core)
- memory.failcnt was growing fast
- oom_control:
oom_kill_disable 0
under_oom 0 (this kept flipping between 0 and 1)
- limit_in_bytes was set to 157286400
- contents of memory.stat (as you can see, the whole memory limit was used):
cache 0
rss 0
mapped_file 0
pgpgin 0
pgpgout 0
swap 0
pgfault 0
pgmajfault 0
inactive_anon 0
active_anon 0
inactive_file 0
active_file 0
unevictable 0
hierarchical_memory_limit 157286400
hierarchical_memsw_limit 157286400
total_cache 0
total_rss 157286400
total_mapped_file 0
total_pgpgin 10326454
total_pgpgout 10288054
total_swap 0
total_pgfault 12939677
total_pgmajfault 4283
total_inactive_anon 0
total_active_anon 157286400
total_inactive_file 0
total_active_file 0
total_unevictable 0
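
To double-check the claim that the whole limit was used, the numbers above can be cross-checked with a small sketch like the following. It assumes 4 KiB pages and that pgpgin/pgpgout count charge and uncharge events, so that their difference is the number of pages currently charged; the function names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def parse_memory_stat(text):
    """Parse 'key value' lines of memory.stat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def summarize(stats):
    """Compare the hierarchical charge with the hierarchical limit."""
    limit = stats["hierarchical_memory_limit"]
    charged = stats["total_cache"] + stats["total_rss"]
    charged_pages = stats["total_pgpgin"] - stats["total_pgpgout"]
    return {
        "limit_mib": limit / 2**20,
        "charged_bytes": charged,
        "at_limit": charged >= limit,
        "bytes_from_page_counters": charged_pages * PAGE_SIZE,
    }


# Relevant lines copied from the stat output above.
SAMPLE = """\
hierarchical_memory_limit 157286400
total_cache 0
total_rss 157286400
total_pgpgin 10326454
total_pgpgout 10288054
"""

if __name__ == "__main__":
    print(summarize(parse_memory_stat(SAMPLE)))
```

With the values above, the limit is 150 MiB, total_rss equals the limit, and (total_pgpgin - total_pgpgout) * 4096 gives the same 157286400 bytes.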


I also grabbed oom_adj, oom_score_adj and the stack of all processes; here they are:
http://www.watchdog.sk/lkml/memcg-bug.tar

Note that the stack is different for a few of the processes. The stacks of all processes were NOT changing; they stayed the same the whole time.
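
The check that the stacks stay the same between samples could be scripted roughly like this. It is only a sketch: it assumes /proc/<pid>/stack is readable (root, and a kernel built with stack trace support), and the cgroup path in the usage comment is hypothetical, as are the function names.

```python
import os


def pids_in_cgroup(tasks_path):
    """Return the PIDs listed in a cgroup v1 'tasks' file."""
    with open(tasks_path) as f:
        return [int(line) for line in f if line.strip()]


def read_stacks(pids, proc_root="/proc"):
    """Map each PID to the text of its kernel stack, or None if unreadable."""
    stacks = {}
    for pid in pids:
        path = os.path.join(proc_root, str(pid), "stack")
        try:
            with open(path) as f:
                stacks[pid] = f.read()
        except OSError:
            stacks[pid] = None  # process exited or permission denied
    return stacks


def group_by_stack(stacks):
    """Group PIDs that share an identical stack trace."""
    groups = {}
    for pid, stack in stacks.items():
        groups.setdefault(stack, []).append(pid)
    return groups


def unchanged_pids(before, after):
    """PIDs whose stack was readable and identical in both samples."""
    return sorted(
        pid for pid, stack in before.items()
        if stack is not None and after.get(pid) == stack
    )


# Usage (the cgroup name is a placeholder):
#   import time
#   pids = pids_in_cgroup("/cgroups/EXAMPLE/tasks")
#   first = read_stacks(pids)
#   time.sleep(5)
#   second = read_stacks(pids)
#   print(group_by_stack(second))
#   print(unchanged_pids(first, second))
```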

Btw, I don't know if it matters, but I have several cgroup subsystems mounted and I'm also using them. I was not activating the freezer in this case; I don't know whether the kernel can activate it automatically, and I didn't check whether the cgroup was frozen, but I suppose it wasn't:
none /cgroups cgroup defaults,cpuacct,cpuset,memory,freezer,task,blkio 0 0

Thank you.

azur