Re: [PATCH v3 5/5] psi: introduce psi monitor

From: Johannes Weiner
Date: Mon Jan 28 2019 - 16:26:37 EST


One thought on the v3 delta that I missed earlier:

On Thu, Jan 24, 2019 at 01:15:18PM -0800, Suren Baghdasaryan wrote:
> +/*
> + * psi_update_work represents slowpath accounting part while psi_group_change
> + * represents hotpath part. There are two potential races between them:
> + * 1. Changes to group->polling when slowpath checks for new stall, then hotpath
> + * records new stall and then slowpath resets group->polling flag. This leads
> + * to the exit from the polling mode while monitored state is still changing.
> + * 2. Slowpath overwriting an immediate update scheduled from the hotpath with
> + * a regular update further in the future and missing the immediate update.
> + * Both races are handled with a retry cycle in the slowpath:
> + *
> + * HOTPATH:                            | SLOWPATH:
> + *                                     |
> + * A) times[cpu] += delta              | E) delta = times[*]
> + * B) start_poll = (delta[poll_mask] &&|    if delta[poll_mask]:
> + *      cmpxchg(g->polling, 0, 1) == 0)| F)   polling_until = now + grace_period
> + *    if start_poll:                   |    if now > polling_until:
> + * C)   mod_delayed_work(1)            |      if g->polling:

With the polling flag being atomic now, this "if g->polling" line
isn't accurate anymore. Since this diagram is specifically about
memory ordering, it should move the g->polling load up to where
delta is read and then refer to unordered local variables down here.
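
To illustrate the ordering I mean, here is a rough userspace sketch
using C11 atomics instead of the kernel primitives. The names
(struct group_sketch, slowpath_step(), the enum) are made up for the
example and are not from the patch; the single times counter stands in
for times[*], and the real slowpath does more than this.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the psi group state; not the kernel struct. */
struct group_sketch {
	atomic_uint_fast64_t times;	/* stall time, advanced by the hotpath */
	atomic_int polling;		/* 1 while in polling mode */
	uint64_t last_times;		/* slowpath-private: previous sample */
	uint64_t polling_until;		/* slowpath-private: end of grace period */
};

enum slowpath_action { KEEP_POLLING, STAY_IDLE, TRY_EXIT_POLLING };

static enum slowpath_action slowpath_step(struct group_sketch *g,
					  uint64_t now, uint64_t grace_period)
{
	/*
	 * E) Sample all shared state in one place. The polling flag is
	 * loaded right next to the times read (seq_cst by default), so
	 * everything below works on plain locals.
	 */
	uint64_t times = atomic_load(&g->times);
	int polling = atomic_load(&g->polling);
	uint64_t delta = times - g->last_times;

	g->last_times = times;

	/* F) A new stall extends the grace period. */
	if (delta)
		g->polling_until = now + grace_period;

	if (now > g->polling_until) {
		/* Decision uses the local snapshot, not a fresh load. */
		if (polling)
			return TRY_EXIT_POLLING;
		return STAY_IDLE;
	}
	return KEEP_POLLING;
}
```

The point is only that the diagram's lower lines would then read
"if polling:" on a local, with the ordering-relevant load shown once
at step E.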