Re: [PATCH v3 12/12] KVM: arm64: selftests: arch_timer: Support vCPU migration

From: Andrew Jones
Date: Tue Sep 07 2021 - 12:20:25 EST


On Tue, Sep 07, 2021 at 09:14:54AM -0700, Raghavendra Rao Ananta wrote:
> On Sun, Sep 5, 2021 at 11:39 PM Andrew Jones <drjones@xxxxxxxxxx> wrote:
> >
> > On Fri, Sep 03, 2021 at 01:53:27PM -0700, Raghavendra Rao Ananta wrote:
> > > On Fri, Sep 3, 2021 at 4:05 AM Andrew Jones <drjones@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Sep 01, 2021 at 09:14:12PM +0000, Raghavendra Rao Ananta wrote:
> > > > > Since the timer stack (hardware and KVM) is per-CPU, there
> > > > > are potential chances for races to occur when the scheduler
> > > > > decides to migrate a vCPU thread to a different physical CPU.
> > > > > Hence, include an option to stress-test this part as well by
> > > > > forcing the vCPUs to migrate across physical CPUs in the
> > > > > system at a particular rate.
> > > > >
> > > > > Originally, the bug for the fix with commit 3134cc8beb69d0d
> > > > > ("KVM: arm64: vgic: Resample HW pending state on deactivation")
> > > > > was discovered using arch_timer test with vCPU migrations and
> > > > > can be easily reproduced.
> > > > >
> > > > > Signed-off-by: Raghavendra Rao Ananta <rananta@xxxxxxxxxx>
> > > > > ---
> > > > > .../selftests/kvm/aarch64/arch_timer.c | 108 +++++++++++++++++-
> > > > > 1 file changed, 107 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/tools/testing/selftests/kvm/aarch64/arch_timer.c b/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > > > index 1383f33850e9..de246c7afab2 100644
> > > > > --- a/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > > > +++ b/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > > > @@ -14,6 +14,8 @@
> > > > > *
> > > > > * The test provides command-line options to configure the timer's
> > > > > * period (-p), number of vCPUs (-n), and iterations per stage (-i).
> > > > > + * To stress-test the timer stack even more, an option to migrate the
> > > > > + * vCPUs across pCPUs (-m), at a particular rate, is also provided.
> > > > > *
> > > > > * Copyright (c) 2021, Google LLC.
> > > > > */
> > > > > @@ -24,6 +26,8 @@
> > > > > #include <pthread.h>
> > > > > #include <linux/kvm.h>
> > > > > #include <linux/sizes.h>
> > > > > +#include <linux/bitmap.h>
> > > > > +#include <sys/sysinfo.h>
> > > > >
> > > > > #include "kvm_util.h"
> > > > > #include "processor.h"
> > > > > @@ -41,12 +45,14 @@ struct test_args {
> > > > > int nr_vcpus;
> > > > > int nr_iter;
> > > > > int timer_period_ms;
> > > > > + int migration_freq_ms;
> > > > > };
> > > > >
> > > > > static struct test_args test_args = {
> > > > > .nr_vcpus = NR_VCPUS_DEF,
> > > > > .nr_iter = NR_TEST_ITERS_DEF,
> > > > > .timer_period_ms = TIMER_TEST_PERIOD_MS_DEF,
> > > > > + .migration_freq_ms = 0, /* Turn off migrations by default */
> > > >
> > > > I'd rather we enable good tests like these by default.
> > > >
> > > Well, that was my original idea, but I was concerned about the ease
> > > for diagnosing
> > > things since it'll become too noisy. And so I let it as a personal
> > > preference. But I can
> > > include it back and see how it goes.
> >
> > Break the tests into two? One run without migration and one with. If
> > neither work, then we can diagnose the one without first, possibly
> > avoiding the need to diagnose the one with.
> >
> Right. I guess that's where the test's migration switch can come in handy.
> Let's turn migration ON by default. If we see a failure, we can turn
> OFF migration
> and run the tests. I'll try to include some verbose printing for diagnosability.
>

Sounds good. You can also consider using pr_debug if you feel the need to
balance verbosity with diagnostics.

Thanks,
drew