Re: [patch 0/5] optionally sync per-CPU vmstats counter on return to userspace

From: Marcelo Tosatti
Date: Fri Jul 02 2021 - 11:36:08 EST



Hi Frederic,

On Fri, Jul 02, 2021 at 02:30:32PM +0200, Frederic Weisbecker wrote:
> On Thu, Jul 01, 2021 at 06:03:36PM -0300, Marcelo Tosatti wrote:
> > The logic to disable vmstat worker thread, when entering
> > nohz full, does not cover all scenarios. For example, it is possible
> > for the following to happen:
> >
> > 1) enter nohz_full, which calls refresh_cpu_vm_stats, syncing the stats.
> > 2) app runs mlock, which increases counters for mlock'ed pages.
> > 3) start -RT loop
> >
> > Since refresh_cpu_vm_stats from nohz_full logic can happen _before_
> > the mlock, vmstat shepherd can restart vmstat worker thread on
> > the CPU in question.
> >
> > To fix this, optionally sync the vmstat counters when returning
> > to userspace, controllable by a new "vmstat_sync" isolcpus
> > flag (default off).
>
> Wasn't the plan for such finegrained isolation features to do it at
> the per task level using prctl()?

Yes, but it's orthogonal: once we integrate the finegrained isolation
interface, we will be able to use this code (to sync vmstat counters
on return to userspace) only when userspace informs the kernel that it
has entered isolated mode, so you don't incur the performance penalty of
frequent vmstat counter writes when not using isolated apps.
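
To illustrate the idea, here is a minimal userspace model, not the
patch itself. Every name in it (NR_CPUS_MODEL, cpu_isolated, mod_stat,
sync_stat, exit_to_user, worker_needed) is made up for the sketch and
is not the kernel's API. It only shows that a delta accumulated after
the last sync stays pending unless the return-to-userspace path folds
it, and that the fold is skipped for CPUs not marked isolated:

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4                     /* hypothetical CPU count for the model */

static long global_stat;                    /* stands in for one global vmstat counter */
static long cpu_diff[NR_CPUS_MODEL];        /* per-CPU deltas not yet folded */
static bool cpu_isolated[NR_CPUS_MODEL];    /* assumed: set once isolation is requested */

/* Counter update (e.g. mlock accounting): touches only the per-CPU delta. */
static void mod_stat(int cpu, long delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	cpu_diff[cpu] += delta;
}

/* Fold this CPU's pending delta into the global counter. */
static bool sync_stat(int cpu)
{
	long d;

	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	d = cpu_diff[cpu];
	if (!d)
		return false;           /* nothing pending, no global write */
	global_stat += d;
	cpu_diff[cpu] = 0;
	return true;
}

/* Model of the return-to-userspace path: sync only for isolated CPUs. */
static void exit_to_user(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	if (cpu_isolated[cpu])
		sync_stat(cpu);
}

/* What a shepherd-like check would see: pending work means a worker restart. */
static bool worker_needed(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	return cpu_diff[cpu] != 0;
}
```

In this model, calling mod_stat() and then exit_to_user() on an
isolated CPU leaves worker_needed() false for it, while a
non-isolated CPU keeps its pending delta and pays no extra global
write on every return to userspace.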

This is what the full task isolation patchset is doing
as well (CC'ing Alex BTW).

That approach will require modifying applications (and a new kernel
that exposes the interface).

But there is demand for fixing this now, for currently existing
binary-only applications.