Re: [RFC 1/5] memcg: synchronize per-zone iterator access by a spinlock

From: Kamezawa Hiroyuki
Date: Tue Nov 13 2012 - 19:03:45 EST


(2012/11/14 0:30), Michal Hocko wrote:
> The per-zone per-priority iterator is aimed at coordinating concurrent
> reclaimers on the same hierarchy (or the global reclaim when all
> groups are reclaimed) so that all groups get reclaimed as evenly as
> possible. iter->position holds the last css->id visited and
> iter->generation signals a completed tree walk (it is incremented
> when the walk finishes).
> Concurrent reclaimers are supposed to provide a reclaim cookie which
> holds the reclaim priority and the last generation they saw. If the
> cookie's generation doesn't match the iterator's view then another
> concurrent reclaimer has already done the job and the tree walk is
> done for that priority.
>
> This scheme works nicely in most cases but it is not race-free. Two
> racing reclaimers can see the same iter->position and so bang on the
> same group. The iter->generation increment is not serialized either,
> so a reclaimer can see an updated iter->position with an old
> generation, and the iteration might be restarted from the root of the
> hierarchy.
>
> The simplest way to fix this issue is to synchronise access to the
> iterator with a lock. This implementation uses a per-zone per-priority
> spinlock which serializes only directly racing reclaimers that use
> reclaim cookies, so the effect of the new locking should be really
> minimal.
>
> I have to note that I haven't seen this as a real issue so far. The
> primary motivation for the change is different. The following patch
> will change how the iterator is implemented: css->id iteration will
> be replaced by generic cgroup iteration, which requires storing a
> mem_cgroup pointer in the iterator. That in turn requires reference
> counting, so concurrent access will become a problem.
>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxx>

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
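
For readers following the thread, here is a minimal userspace sketch of
the scheme described in the changelog: position and generation are read
and updated under a single lock, and a reclaimer whose cookie carries a
stale generation stops walking. This is not the kernel code from the
patch. The names reclaim_iter, reclaim_cookie, iter_next and NR_GROUPS
are hypothetical, the groups are assumed to be plain ids 1..NR_GROUPS,
and a C11 atomic_flag stands in for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_GROUPS 4 /* hypothetical: group ids are 1..NR_GROUPS */

/* Hypothetical stand-in for one per-zone per-priority iterator. */
struct reclaim_iter {
	atomic_flag lock;        /* stands in for the kernel spinlock */
	int position;            /* last group id visited, 0 = none yet */
	unsigned int generation; /* incremented when a full walk completes */
};

/* Hypothetical reclaim cookie: remembers the generation the walk began in. */
struct reclaim_cookie {
	bool started;
	unsigned int generation;
};

static void iter_init(struct reclaim_iter *iter)
{
	atomic_flag_clear(&iter->lock);
	iter->position = 0;
	iter->generation = 0;
}

static void iter_lock(struct reclaim_iter *iter)
{
	while (atomic_flag_test_and_set_explicit(&iter->lock,
						 memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void iter_unlock(struct reclaim_iter *iter)
{
	atomic_flag_clear_explicit(&iter->lock, memory_order_release);
}

/*
 * Return the next group id to reclaim, or 0 when the walk for the
 * cookie's generation is over (finished by us or by another reclaimer).
 */
static int iter_next(struct reclaim_iter *iter, struct reclaim_cookie *cookie)
{
	int id = 0;

	assert(iter && cookie);

	iter_lock(iter);
	if (!cookie->started) {
		/* First call: record the generation we are joining. */
		cookie->started = true;
		cookie->generation = iter->generation;
	} else if (cookie->generation != iter->generation) {
		/* Somebody else completed the walk; nothing left to do. */
		goto out;
	}

	if (iter->position < NR_GROUPS) {
		id = ++iter->position;
	} else {
		/* Walk complete: reset and signal it via the generation. */
		iter->position = 0;
		iter->generation++;
	}
out:
	iter_unlock(iter);
	return id;
}
```

Because position and generation only change while the lock is held, two
reclaimers sharing the iterator can never be handed the same id, and
neither can observe a new position paired with an old generation. Each
reclaimer starts with a zero-initialized cookie and calls iter_next()
until it returns 0.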

