Re: [PATCH 0/5] Candidate fixes for premature OOM kills with node-lru v1

From: Mel Gorman
Date: Thu Jul 21 2016 - 05:15:52 EST


On Thu, Jul 21, 2016 at 04:07:14PM +0900, Minchan Kim wrote:
> Hi Mel,
>
> On Wed, Jul 20, 2016 at 04:21:46PM +0100, Mel Gorman wrote:
> > Both Joonsoo Kim and Minchan Kim have reported premature OOM kills on
> > a 32-bit platform. The common element is a zone-constrained high-order
> > allocation failing. Two factors appear to be at fault -- pgdat being
>
> Strictly speaking, my case is order-0 allocation failing, not high-order.
> ;)
>

I'll update the leader mail.

> > considered unreclaimable prematurely and insufficient rotation of the
> > active list.
> >
> > Unfortunately to date I have been unable to reproduce this with a variety
> > of stress workloads on a 2G 32-bit KVM instance. It's not clear why as
> > the steps are similar to what was described. It means I've been unable to
> > determine if this series addresses the problem or not. I'm hoping they can
> > test and report back before these are merged to mmotm. What I have checked
> > is that a basic parallel DD workload completed successfully on the same
> > machine I used for the node-lru performance tests. I'll leave the other
> > tests running just in case anything interesting falls out.
> >
> > The series is in three basic parts;
> >
> > Patch 1 does not account for skipped pages as scanned. This avoids the pgdat
> > being prematurely marked unreclaimable
> >
> > Patches 2-4 add per-zone stats back in. The actual stats patch is different
> > to Minchan's as the original patch did not account for unevictable
> > LRU which would corrupt counters. The second two patches remove
> > approximations based on pgdat statistics. It's effectively a
> > revert of "mm, vmstat: remove zone and node double accounting by
> > approximating retries" but different LRU stats are used. This
> > is better than a full revert or a reworking of the series as
> > it preserves history of why the zone stats are necessary.
> >
> > If this works out, we may have to leave the double accounting in
> > place for now until an alternative cheap solution presents itself.
> >
> > Patch 5 rotates inactive/active lists for lowmem allocations. This is also
> > quite different to Minchan's patch as the original patch did not
> > account for memcg and would rotate if *any* eligible zone needed
> > rotation which may rotate excessively. The new patch considers
> > the ratio for all eligible zones which is more in line with
> > node-lru in general.
> >
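
To make the Patch 5 behaviour concrete, here is a minimal userspace
sketch of "consider the ratio for all eligible zones". It is not the
patch itself. struct zone_lru, NR_ZONES, eligible_inactive_is_low() and
the ratio parameter are hypothetical names used only for illustration.
The point is that the inactive and active counts are summed over every
zone the allocation may use before the ratio is applied once, rather
than rotating as soon as any single zone looks unbalanced.

```c
#include <stdbool.h>

#define NR_ZONES 3	/* hypothetical layout: DMA, NORMAL, HIGHMEM */

/* Hypothetical per-zone LRU counters, in pages. */
struct zone_lru {
	unsigned long inactive;
	unsigned long active;
};

/*
 * Sum the LRU sizes of zones 0..highest_eligible and apply the ratio to
 * the totals. Returns true if the active list should be rotated.
 */
static bool eligible_inactive_is_low(const struct zone_lru zones[],
				     int highest_eligible,
				     unsigned long ratio)
{
	unsigned long inactive = 0, active = 0;
	int zid;

	for (zid = 0; zid <= highest_eligible && zid < NR_ZONES; zid++) {
		inactive += zones[zid].inactive;
		active += zones[zid].active;
	}

	return inactive * ratio < active;
}
```

With a large inactive list in the highest zone, a request that can use
all zones would not rotate, while a lowmem-only request over the same
node still could, because the highest zone's pages are left out of its
totals.
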
>
> I have now tested it and can confirm it works for me from the OOM point
> of view. IOW, I no longer see any OOM kills. But note that I tested it
> without [1/5], which has a problem I mentioned in that thread.
>
> If you want to merge [1/5], please resend an updated version, but
> I doubt we need it at this moment.

I'm currently looking at a version that counts skipped pages as a
partial scan unless the LRU has no eligible pages. I'll put that
patch at the end of the series so that it's easier to test in
isolation. I'm also trying to reproduce a case similar to Joonsoo's.
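
As a rough illustration of that accounting, here is a minimal userspace
sketch. It is not the patch itself. scan_credit(), SKIP_SCAN_SHIFT and
the divide-by-four factor are made up for illustration. Counting skipped
pages in full when nothing on the LRU is eligible is an assumption about
how the "unless" case would be handled.

```c
#include <stdbool.h>

/* Assumed scaling factor: skipped pages count as 1/4 of a scanned page. */
#define SKIP_SCAN_SHIFT 2

/*
 * Return how many pages to account as scanned for one isolation pass.
 * nr_taken:         pages isolated from an eligible zone
 * nr_skipped:       pages passed over because their zone is ineligible
 * lru_has_eligible: whether any eligible pages remain on the LRU
 */
static unsigned long scan_credit(unsigned long nr_taken,
				 unsigned long nr_skipped,
				 bool lru_has_eligible)
{
	/* Nothing eligible remains: treat the skipped pages as a full scan. */
	if (!lru_has_eligible)
		return nr_taken + nr_skipped;

	/* Otherwise skipped pages only contribute a fraction. */
	return nr_taken + (nr_skipped >> SKIP_SCAN_SHIFT);
}
```

For example, scan_credit(8, 24, true) gives 14 rather than 32, so a
pass that mostly skips ineligible pages pushes the pgdat towards
"unreclaimable" more slowly than a pass that really scanned 32 pages.
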

--
Mel Gorman
SUSE Labs