Re: [PATCH 2/2 v3] rseq/selftests: test MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ

From: Mathieu Desnoyers
Date: Wed Aug 12 2020 - 16:11:08 EST


----- On Aug 10, 2020, at 8:09 PM, Peter Oskolkov posk@xxxxxxxxxx wrote:

> Based on Google-internal RSEQ work done by
> Paul Turner and Andrew Hunter.
>
> This patch adds a selftest for MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ.
> The test quite often fails without the previous patch in this patchset,
> but consistently passes with it.
>
> v3: added rseq_offset_deref_addv() to x86_64 to make the test
> more explicit; on other architectures I kept using existing
> rseq_cmpeqv_cmpeqv_storev() as I have no easy way to test
> there. Added a comment explaining why the test works this way.
>
> Signed-off-by: Peter Oskolkov <posk@xxxxxxxxxx>
> ---
> .../selftests/rseq/basic_percpu_ops_test.c | 196 ++++++++++++++++++
> tools/testing/selftests/rseq/rseq-x86.h | 55 +++++
> 2 files changed, 251 insertions(+)
>
> diff --git a/tools/testing/selftests/rseq/basic_percpu_ops_test.c
> b/tools/testing/selftests/rseq/basic_percpu_ops_test.c
> index eb3f6db36d36..c9784a3d19fb 100644
> --- a/tools/testing/selftests/rseq/basic_percpu_ops_test.c
> +++ b/tools/testing/selftests/rseq/basic_percpu_ops_test.c
> @@ -3,16 +3,22 @@
> #include <assert.h>
> #include <pthread.h>
> #include <sched.h>
> +#include <stdatomic.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stddef.h>
> +#include <syscall.h>
> +#include <unistd.h>
>
> #include "rseq.h"
>
> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
>
> +#define MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ (1<<7)
> +#define MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ (1<<8)
> +

There is no need to define the membarrier commands here if we include
<linux/membarrier.h>.
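
As a minimal sketch of what relying on the UAPI header could look like: the
helper names sys_membarrier() and membarrier_cmd_supported() below are
hypothetical and are not part of the patch. The sketch only assumes that the
installed kernel headers provide <linux/membarrier.h> and __NR_membarrier.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/membarrier.h>	/* MEMBARRIER_CMD_* values from the UAPI header */
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: invoke membarrier through syscall(2).
 * The third argument is passed as 0; kernels that only take
 * (cmd, flags) ignore it.
 */
static int sys_membarrier(int cmd, unsigned int flags)
{
	return syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * Hypothetical helper: returns 1 if the running kernel reports support
 * for cmd, 0 otherwise (including when the membarrier syscall itself is
 * unavailable or blocked).
 */
int membarrier_cmd_supported(int cmd)
{
	/* MEMBARRIER_CMD_QUERY returns a bitmask of supported commands. */
	int mask = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (mask < 0)
		return 0;
	return (mask & cmd) != 0;
}
```

A test could then call membarrier_cmd_supported(MEMBARRIER_CMD_PRIVATE_EXPEDITED),
or the RSEQ variant once the installed headers carry the previous patch in this
series, before deciding whether to run.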

> struct percpu_lock_entry {
> intptr_t v;
> } __attribute__((aligned(128)));
> @@ -289,6 +295,194 @@ void test_percpu_list(void)
> assert(sum == expected_sum);
> }
>
> +struct test_membarrier_thread_args {
> + int stop;
> + intptr_t percpu_list_ptr;
> +};
> +
> +/* Worker threads modify data in their "active" percpu lists. */
> +void *test_membarrier_worker_thread(void *arg)
> +{
> + struct test_membarrier_thread_args *args =
> + (struct test_membarrier_thread_args *)arg;
> + const int iters = 10 * 1000 * 1000;
> + int i;
> +
> + if (rseq_register_current_thread()) {
> + fprintf(stderr, "Error: rseq_register_current_thread(...) failed(%d): %s\n",
> + errno, strerror(errno));
> + abort();
> + }
> +
> + /* Wait for initialization. */
> + while (!atomic_load(&args->percpu_list_ptr)) {}
> +
> + for (i = 0; i < iters; ++i) {
> + int ret;
> +
> + do {
> + int cpu = rseq_cpu_start();
> +#if defined(__x86_64__)
> + /* For x86_64, we have rseq_offset_deref_addv. */
> + ret = rseq_offset_deref_addv(&args->percpu_list_ptr,
> + 128 * cpu, 1, cpu);
> +#else
> + /*
> + * For other architectures, we rely on the fact that
> + * the manager thread keeps list_ptr alive, so we can
> + * use rseq_cmpeqv_cmpeqv_storev to make sure
> + * list_ptr we got outside of rseq cs is still
> + * "active".
> + */
> + struct percpu_list *list_ptr = (struct percpu_list *)
> + atomic_load(&args->percpu_list_ptr);
> +
> + struct percpu_list_node *node = list_ptr->c[cpu].head;
> + const intptr_t prev = node->data;
> +
> + ret = rseq_cmpeqv_cmpeqv_storev(&node->data, prev,
> + &args->percpu_list_ptr,
> + (intptr_t)list_ptr, prev + 1, cpu);
> +#endif

Please don't special-case the implementation of a test per architecture.

We should instead "skip" (or even fail) the test on architectures that do
not support this, as an incentive for architecture maintainers to implement
the missing APIs in the test.

One way to do this would be to define RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV in the
architecture header, and skip the test if the define is not present.
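
A rough sketch of that approach, under stated assumptions: the architecture
header (e.g. rseq-x86.h) would define RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV next to
its rseq_offset_deref_addv() implementation. The define is deliberately left
unset here, so this standalone example takes the skip path. The return-value
convention and the message text are illustrative only.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Assumption: an architecture header would provide
 *	#define RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV
 * It is not defined in this sketch, so the #else branch is compiled
 * unless the file is built with -DRSEQ_ARCH_HAS_OFFSET_DEREF_ADDV.
 */

/* Returns 1 if the test body ran, 0 if the test was skipped. */
int test_membarrier(void)
{
#ifdef RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV
	/* The manager and worker threads of the real test would go here. */
	printf("test_membarrier: running\n");
	return 1;
#else
	fprintf(stderr,
		"rseq_offset_deref_addv is not implemented on this architecture. "
		"Skipping membarrier test.\n");
	return 0;
#endif
}
```

With this shape the worker thread needs no per-architecture #if, and an
architecture opts in simply by adding the define alongside its implementation.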

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com