Re: [RFC PATCH 0/6] Configurable fair allocation zone policy v3

From: Michal Hocko
Date: Wed Dec 18 2013 - 11:21:20 EST


On Wed 18-12-13 10:18:46, Johannes Weiner wrote:
> On Wed, Dec 18, 2013 at 03:51:11PM +0100, Michal Hocko wrote:
> > On Tue 17-12-13 15:02:10, Johannes Weiner wrote:
> > [...]
> > > +pagecache_mempolicy_mode:
> > > +
> > > +This is available only on NUMA kernels.
> > > +
> > > +Per default, the configured memory policy is applicable to anonymous
> > > +memory, shmem, tmpfs, etc., whereas pagecache is allocated in an
> > > +interleaving fashion over all allowed nodes (hardbindings and
> > > +zone_reclaim_mode excluded).
> > > +
> > > +The assumption is that, when it comes to pagecache, users generally
> > > +prefer predictable replacement behavior regardless of NUMA topology
> > > +and maximizing the cache's effectiveness in reducing IO over memory
> > > +locality.
> >
> > Isn't page spreading (PF_SPREAD_PAGE) intended to do the same thing
> > semantically? The setting is per-cpuset rather than global which makes
> > it harder to use but essentially it tries to distribute page cache pages
> > across all the nodes.
> >
> > This is really getting confusing. We have zone_reclaim_mode to keep
> > memory local in general, pagecache_mempolicy_mode to keep page cache
> > local and PF_SPREAD_PAGE to spread the page cache around nodes.
>
> zone_reclaim_mode is a global setting to go through great lengths to
> stay on local nodes, intended to be used depending on the hardware,
> not the workload.
>
> Mempolicy on the other hand is to optimize placement for maximum
> locality depending on access patterns of a workload or even just the
> subset of a workload. I'm trying to change whether this applies to
> page cache (due to different locality / cache effectiveness tradeoff)
> and we want to provide pagecache_mempolicy_mode to revert in the field
> in case this is a mistake.
>
> PF_SPREAD_PAGE becomes implied per default and should eventually be
> removed.

I guess many workloads do not care about page cache locality, and the
default spreading would be OK for them. But what about those that do care?
Currently we have a per-process (in fact per-cpuset) flag, but this
changes it to all or nothing. Is this really a good step?
Btw. I do not mind having PF_SPREAD_PAGE enabled by default.
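
To make the concern concrete, here is a rough userspace model of the two
decision schemes as I read them. This is not kernel code: the enum, the
struct and both function names are made up for illustration, and the
"proposed" variant reflects my understanding of the patch rather than its
actual implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: where a new page cache page would be placed. */
enum pc_placement {
	PC_FOLLOW_MEMPOLICY,	/* placed according to the task's mempolicy */
	PC_SPREAD,		/* spread/interleaved over the allowed nodes */
};

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task_model {
	bool spread_page;	/* models the per-cpuset PF_SPREAD_PAGE flag */
};

/* Today: the per-cpuset flag decides, so each group of tasks can differ. */
enum pc_placement placement_current(const struct task_model *t)
{
	return t->spread_page ? PC_SPREAD : PC_FOLLOW_MEMPOLICY;
}

/*
 * Proposed (assumption): a single global knob decides for every task.
 * The per-task preference is no longer consulted.
 */
enum pc_placement placement_proposed(const struct task_model *t,
				     bool pagecache_mempolicy_mode)
{
	(void)t;
	return pagecache_mempolicy_mode ? PC_FOLLOW_MEMPOLICY : PC_SPREAD;
}
```

In this model placement_proposed() returns the same answer for every
task, which is the "all or nothing" I am worried about.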
--
Michal Hocko
SUSE Labs