Re: Still OOM problems with 4.9er/4.10er kernels

From: Tetsuo Handa
Date: Thu Mar 23 2017 - 10:48:03 EST


On 2017/03/23 17:38, Mike Galbraith wrote:
> On Thu, 2017-03-23 at 08:16 +0100, Gerhard Wiesinger wrote:
>> On 21.03.2017 08:13, Mike Galbraith wrote:
>>> On Tue, 2017-03-21 at 06:59 +0100, Gerhard Wiesinger wrote:
>>>
>>>> Is this the correct information?
>>> Incomplete, but enough to reiterate cgroup_disable=memory
>>> suggestion.
>>>
>>
>> How to collect complete information?
>
> If Michal wants specifics, I suspect he'll ask. I posted only to pass
> along a speck of information, and offer a test suggestion.. twice.
>
> -Mike

Isn't the information Mike asked for something like the output of the command below

for i in `find /sys/fs/cgroup/memory/ -type f`; do echo ========== $i ==========; cat $i | xargs echo; done

plus a check of which cgroups the stalling tasks belong to? Also, Mike suggested
reproducing your problem with the cgroup_disable=memory kernel command line option
in order to determine whether memory cgroups are related to your problem.
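
A rough sketch of those two checks follows. It assumes cgroup v1 (lines like
"4:memory:/some/path" in /proc/<pid>/cgroup) and treats tasks in state D as
the stalling ones, which is only an approximation because a task looping in
reclaim may be runnable instead. The function names are made up for
illustration.

```shell
#!/bin/sh
# Print the memory cgroup path from a /proc/<pid>/cgroup style file.
# Assumes cgroup v1 format: "<id>:<controllers>:<path>".
memcg_of() {
    awk -F: '$2 ~ /(^|,)memory(,|$)/ { sub(/^[^:]*:[^:]*:/, ""); print }' "$1"
}

# Print "<pid> <comm> <memory cgroup>" for every task in state D
# (uninterruptible sleep). Tasks that exit while we scan are skipped.
list_stalled() {
    for d in /proc/[0-9]*; do
        state=$(awk '/^State:/ { print $2 }' "$d/status" 2>/dev/null)
        [ "$state" = "D" ] || continue
        printf '%s %s %s\n' "${d#/proc/}" \
            "$(cat "$d/comm" 2>/dev/null)" \
            "$(memcg_of "$d/cgroup" 2>/dev/null)"
    done
}

# Succeed if the given command line file (normally /proc/cmdline)
# contains cgroup_disable=memory.
memcg_disabled_on_cmdline() {
    grep -qw 'cgroup_disable=memory' "$1"
}
```

Running list_stalled as root while the stall is in progress, and
memcg_disabled_on_cmdline /proc/cmdline after rebooting with the option,
should be enough to tell whether the test ran in the intended configuration.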

I don't know whether Michal already knows what specific information to collect.
I think the memory allocation watchdog might give us some clues. It will give us
output like http://I-love.SAKURA.ne.jp/tmp/serial-20170321.txt.xz .

Can you afford to build kernels with the watchdog patch applied? The steps I used
for building kernels are shown below. (If you can't afford to build them but can
afford to try binary rpms, I can upload them.)

----------------------------------------
yum -y install yum-utils
wget https://dl.fedoraproject.org/pub/alt/rawhide-kernel-nodebug/SRPMS/kernel-4.11.0-0.rc3.git0.1.fc27.src.rpm
yum-builddep -y kernel-4.11.0-0.rc3.git0.1.fc27.src.rpm
rpm -ivh kernel-4.11.0-0.rc3.git0.1.fc27.src.rpm
yum-builddep -y ~/rpmbuild/SPECS/kernel.spec
patch -p1 -d ~/rpmbuild/SPECS/ << "EOF"
--- a/kernel.spec
+++ b/kernel.spec
@@ -24,7 +24,7 @@
 %global zipsed -e 's/\.ko$/\.ko.xz/'
 %endif

-# define buildid .local
+%define buildid .kmallocwd

 # baserelease defines which build revision of this kernel version we're
 # building. We used to call this fedora_build, but the magical name
@@ -1207,6 +1207,8 @@

 git am %{patches}

+patch -p1 < $RPM_SOURCE_DIR/kmallocwd.patch
+
 # END OF PATCH APPLICATIONS

 # Any further pre-build tree manipulations happen here.
@@ -1243,6 +1245,8 @@
 do
   cat $i > temp-$i
   mv $i .config
+  echo 'CONFIG_DETECT_MEMALLOC_STALL_TASK=y' >> .config
+  echo 'CONFIG_DEFAULT_MEMALLOC_TASK_TIMEOUT=30' >> .config
   Arch=`head -1 .config | cut -b 3-`
   make ARCH=$Arch listnewconfig | grep -E '^CONFIG_' >.newoptions || true
 %if %{listnewconfig_fail}
EOF
wget -O ~/rpmbuild/SOURCES/kmallocwd.patch 'https://marc.info/?l=linux-mm&m=148957858821214&q=raw'
rpmbuild -bb ~/rpmbuild/SPECS/kernel.spec
----------------------------------------
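
After installing and booting the resulting kernel, it may be worth confirming
that the two options appended in the spec patch survived into the final
config, since options unknown to Kconfig are dropped when the config is
regenerated (which would happen if kmallocwd.patch did not apply). A minimal
sketch, assuming the config is installed as /boot/config-<version> and using
a function name made up for illustration:

```shell
#!/bin/sh
# Succeed only if both watchdog options from the spec patch appear,
# as exact lines, in the given kernel config file.
check_kmallocwd_config() {
    cfg=$1
    for opt in CONFIG_DETECT_MEMALLOC_STALL_TASK=y \
               CONFIG_DEFAULT_MEMALLOC_TASK_TIMEOUT=30; do
        grep -qxF "$opt" "$cfg" || { echo "missing: $opt" >&2; return 1; }
    done
}
```

For example: check_kmallocwd_config "/boot/config-$(uname -r)"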