Re: [PATCH v6 4/5] selftests/resctrl: Cleanup properly when an error occurs in CAT test

From: Reinette Chatre
Date: Fri Feb 03 2023 - 13:24:53 EST


Hi Shaopeng,

On 1/30/2023 9:46 PM, Shaopeng Tan wrote:
> After creating a child process with fork() in CAT test, if an error
> occurs or a signal such as SIGINT is received, the parent process will
> be terminated immediately, and therefore the child process will not
> be killed and also resctrlfs is not unmounted.
>
> There is a signal handler registered in CMT/MBM/MBA tests, which kills
> child process, unmounts resctrlfs, cleans up result files, etc., if a
> signal such as SIGINT is received.
>
> Commonize the signal handler registered for CMT/MBM/MBA tests and reuse
> it in CAT too.
>
> To reuse the signal handler, make the child process in CAT wait to be
> killed by parent process in any case (an error occurred or a signal was
> received), and when killing child process use global bm_pid instead of
> local bm_pid.
>
> Also, since the MBM/MBA/CMT/CAT are run in order, unregister the signal
> handler at the end of each test so that the signal handler cannot be
> inherited by other tests.
>

Great changelog.

...

> @@ -181,28 +180,31 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> strcpy(param.filename, RESULT_FILE_NAME1);
> param.num_of_runs = 0;
> param.cpu_no = sibling_cpu_no;
> + } else {
> + ret = signal_handler_register();
> + if (ret) {
> + kill(bm_pid, SIGKILL);
> + goto out;
> + }
> }
>
> remove(param.filename);
>
> ret = cat_val(&param);
> - if (ret)
> - return ret;
> -
> - ret = check_results(&param);
> - if (ret)
> - return ret;
> + if (ret == 0)
> + ret = check_results(&param);
>
> if (bm_pid == 0) {
> /* Tell parent that child is ready */
> close(pipefd[0]);
> pipe_message = 1;
> if (write(pipefd[1], &pipe_message, sizeof(pipe_message)) <
> - sizeof(pipe_message)) {
> - close(pipefd[1]);
> + sizeof(pipe_message))
> + /*
> + * Just print the error message.
> + * Let while(1) run and wait for itself to be killed.
> + */
> perror("# failed signaling parent process");
> - return errno;
> - }
>
> close(pipefd[1]);
> while (1)
> @@ -222,9 +224,11 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> kill(bm_pid, SIGKILL);
> }
>
> + signal_handler_unregister();

I expected the code to be symmetrical but found that the signal
handler is registered by parent, but unregistered by both the parent
and the child. Could signal_handler_unregister() be moved to the
parent portion of the "if" statement above?
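
To illustrate the symmetry I have in mind, here is a rough standalone sketch, not the selftest code itself. The names child_pid, cleanup_handler(), handler_register(), handler_unregister() and run_parent_child() are hypothetical stand-ins for bm_pid, ctrlc_handler() and the helpers added by this patch. The child never touches the handler; only the parent registers and unregisters it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the global bm_pid. */
static pid_t child_pid;

/* Kill the child, then exit. Only async-signal-safe calls are used. */
static void cleanup_handler(int signum)
{
	(void)signum;
	if (child_pid > 0)
		kill(child_pid, SIGKILL);
	_exit(EXIT_FAILURE);
}

static int handler_register(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = cleanup_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGINT, &sa, NULL) ||
	    sigaction(SIGTERM, &sa, NULL) ||
	    sigaction(SIGHUP, &sa, NULL)) {
		perror("# sigaction");
		return -1;
	}
	return 0;
}

static void handler_unregister(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = SIG_DFL;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGINT, &sa, NULL) ||
	    sigaction(SIGTERM, &sa, NULL) ||
	    sigaction(SIGHUP, &sa, NULL))
		perror("# sigaction");
}

int run_parent_child(void)
{
	int ret;

	child_pid = fork();
	if (child_pid < 0)
		return -1;

	if (child_pid == 0) {
		/* Child: no handler involvement, just wait to be killed. */
		while (1)
			pause();
	}

	/* Parent: register, do the work, kill the child, unregister. */
	ret = handler_register();
	/* ... the measurement would run here ... */
	kill(child_pid, SIGKILL);
	waitpid(child_pid, NULL, 0);
	if (!ret)
		handler_unregister();
	return ret;
}
```

With this shape the register and unregister calls both sit in the parent branch, so it is easy to see that they pair up.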

...

> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -476,6 +476,45 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr)
> exit(EXIT_SUCCESS);
> }
>
> +/*
> + * Register CTRL-C handler for parent, as it has to kill
> + * child process before exiting
> + */
> +int signal_handler_register(void)
> +{
> + struct sigaction sigact;
> + int ret = 0;
> +
> + sigact.sa_sigaction = ctrlc_handler;
> + sigemptyset(&sigact.sa_mask);
> + sigact.sa_flags = SA_SIGINFO;
> + if (sigaction(SIGINT, &sigact, NULL) ||
> + sigaction(SIGTERM, &sigact, NULL) ||
> + sigaction(SIGHUP, &sigact, NULL)) {
> + perror("# sigaction");
> + ret = -1;
> + }
> + return ret;
> +}
> +
> +/*
> + * Reset signal handler to SIG_DFL.
> + * Non-Vaule return because the caller should keep

Typo in "Non-Vaule"

> + * the error code of other path even if sigaction fails.
> + */
> +void signal_handler_unregister(void)
> +{
> + struct sigaction sigact;
> +
> + sigact.sa_handler = SIG_DFL;
> + sigemptyset(&sigact.sa_mask);
> + if (sigaction(SIGINT, &sigact, NULL) ||
> + sigaction(SIGTERM, &sigact, NULL) ||
> + sigaction(SIGHUP, &sigact, NULL)) {
> + perror("# sigaction");
> + }
> +}
> +
> /*
> * print_results_bw: the memory bandwidth results are stored in a file
> * @filename: file that stores the results
> @@ -671,39 +710,28 @@ int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param)
>
> ksft_print_msg("Benchmark PID: %d\n", bm_pid);
>
> - /*
> - * Register CTRL-C handler for parent, as it has to kill benchmark
> - * before exiting
> - */
> - sigact.sa_sigaction = ctrlc_handler;
> - sigemptyset(&sigact.sa_mask);
> - sigact.sa_flags = SA_SIGINFO;
> - if (sigaction(SIGINT, &sigact, NULL) ||
> - sigaction(SIGTERM, &sigact, NULL) ||
> - sigaction(SIGHUP, &sigact, NULL)) {
> - perror("# sigaction");
> - ret = errno;
> - goto out;
> - }
> + ret = signal_handler_register();
> + if (ret)
> + goto out1;

Please do not use generic "out1" and "out2" goto labels. Could
you please change them to reflect what is done at that exit?
You could keep "out" but "out2" could be renamed to "unregister"
or something more appropriate.
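
As a rough sketch of the naming I mean (standalone, with hypothetical stub_*() helpers standing in for signal_handler_register(), signal_handler_unregister() and the kill()/umount cleanup), each label says what happens at that exit:

```c
/* Hypothetical state used only to make the cleanup order observable. */
static int handler_registered;
static int child_killed;

static int stub_register(int fail)
{
	if (fail)
		return -1;
	handler_registered = 1;
	return 0;
}

static void stub_unregister(void)
{
	handler_registered = 0;
}

static void stub_kill_child(void)
{
	child_killed = 1;
}

int run_measurement(int fail_register, int fail_measure)
{
	int ret;

	child_killed = 0;

	ret = stub_register(fail_register);
	if (ret)
		goto out;		/* nothing registered, skip unregister */

	if (fail_measure) {
		ret = -1;
		goto unregister;	/* handler must be reset on this path */
	}

	ret = 0;

unregister:
	stub_unregister();
out:
	stub_kill_child();
	return ret;
}
```

A reader can then tell from the goto alone which cleanup steps run on each error path.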

...

> @@ -761,7 +789,9 @@ int resctrl_val(char **benchmark_cmd, struct resctrl_val_param *param)
> }
> }
>
> -out:
> +out2:
> + signal_handler_unregister();
> +out1:
> kill(bm_pid, SIGKILL);
> umount_resctrlfs();
>

Thank you

Reinette