Re: [PATCH V4 00/20] The Runtime Verification (RV) interface

From: Song Liu
Date: Wed Jun 22 2022 - 03:24:19 EST


Hi Daniel,

On Thu, Jun 16, 2022 at 1:45 AM Daniel Bristot de Oliveira
<bristot@xxxxxxxxxx> wrote:
>
> Over the last years, I've been exploring the possibility of
> verifying the Linux kernel behavior using Runtime Verification.
>
> Runtime Verification (RV) is a lightweight (yet rigorous) method that
> complements classical exhaustive verification techniques (such as model
> checking and theorem proving) with a more practical approach for complex
> systems.
>
> Instead of relying on a fine-grained model of a system (e.g., a
> re-implementation at instruction level), RV works by analyzing the trace of the
> system's actual execution, comparing it against a formal specification of
> the system behavior.
>
> The usage of deterministic automata for RV is a well-established
> approach. In the specific case of the Linux kernel, you can check how
> to model complex behavior of the Linux kernel with this paper:
>
> DE OLIVEIRA, Daniel Bristot; CUCINOTTA, Tommaso; DE OLIVEIRA, Romulo Silva.
> *Efficient formal verification for the Linux kernel.* In: International
> Conference on Software Engineering and Formal Methods. Springer, Cham, 2019.
> p. 315-332.
>
> And how efficient is this approach here:
>
> DE OLIVEIRA, Daniel B.; DE OLIVEIRA, Romulo S.; CUCINOTTA, Tommaso. *A thread
> synchronization model for the PREEMPT_RT Linux kernel.* Journal of Systems
> Architecture, 2020, 107: 101729.
>
> tl;dr: it is possible to model complex behaviors in a modular way, with
> an acceptable overhead (even for production systems). See this
> presentation at 2019's ELCE: https://www.youtube.com/watch?v=BfTuEHafNgg
>
> Here I am proposing a more practical approach for the usage of deterministic
> automata for runtime verification, and it includes:
>
> - An interface for controlling the verification;
> - A tool and set of headers that enables the automatic code
> generation of the RV monitor (Monitor Synthesis);
> - Sample monitors to evaluate the interface;
> - A sample monitor developed in the context of the Elisa Project
> demonstrating how to use RV in the context of safety-critical
> systems.
>
> Given that RV is a tracing consumer, the code is being placed inside the
> tracing subsystem (Steven and I have been talking about it for a while).

This is interesting work!

I applied the series on top of commit 78ca55889a549a9a194c6ec666836329b774ab6d
in upstream. I then got compile/link errors with CONFIG_RV_MON_WIP and
CONFIG_RV_MON_SAFE_WTD. I was able to compile the kernel with these two
configs disabled. However, I hit an issue with monitors/wwnr/enable:

[root@eth50-1 ~]# cd /sys/kernel/debug/tracing/rv/
[root@eth50-1 rv]# cat available_monitors
wwnr
[root@eth50-1 rv]# echo wwnr > enabled_monitors
[root@eth50-1 rv]# cd monitors/
[root@eth50-1 monitors]# cd wwnr/
[root@eth50-1 wwnr]# ls
desc enable reactors
[root@eth50-1 wwnr]# cat enable
1
[root@eth50-1 wwnr]# echo 0 > enable <<< hangs

The last echo command hangs forever on a qemu vm. I haven't figured out why
this happens though.

I also have a more general question: can we do RV with BPF and simplify the
work? AFAICT, the idea of RV is to maintain a state machine based on events.
If something unexpected happens, call the reactor.
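
To make sure we are talking about the same thing, here is that idea as a
small userspace Python sketch. It is not kernel code and not taken from the
series. The state names, event names, and transition table are my reading of
wwnr, and starting every unseen task in not_running is an assumption.

```python
# Userspace sketch of a per-task deterministic automaton monitor.
# States, events and transitions are assumptions based on my reading
# of wwnr (wakeup while not running); this is not code from the series.

NOT_RUNNING, RUNNING = "not_running", "running"

# (state, event) -> next state. A missing pair is an unexpected event.
TRANSITIONS = {
    (NOT_RUNNING, "switch_in"): RUNNING,
    (NOT_RUNNING, "wakeup"): NOT_RUNNING,
    (RUNNING, "switch_out"): NOT_RUNNING,
}


class Monitor:
    def __init__(self, reactor):
        self.reactor = reactor  # called as reactor(pid, state, event)
        self.state = {}         # pid -> current state

    def handle(self, pid, event):
        """Advance the automaton for pid; return False on a violation."""
        # Assumption: a task we have not seen yet starts in not_running.
        cur = self.state.get(pid, NOT_RUNNING)
        nxt = TRANSITIONS.get((cur, event))
        if nxt is None:
            self.reactor(pid, cur, event)
            # Forget the task so it restarts from the initial state.
            self.state.pop(pid, None)
            return False
        self.state[pid] = nxt
        return True


# Example: the reactor only records the violation.
violations = []
mon = Monitor(lambda pid, state, event: violations.append((pid, state, event)))
for pid, event in [(1, "wakeup"), (1, "switch_in"), (1, "wakeup")]:
    mon.handle(pid, event)
print(violations)  # [(1, 'running', 'wakeup')]
```

The monitor itself is just a lookup table, per-task state, and a callback,
which is why I wonder whether BPF maps plus tracepoint programs could carry it.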

IIUC, BPF has most of these building blocks ready for use. With BPF, we
can ship many RV monitors without many kernel changes.

Here is my toy wwnr in bpftrace. The reactor is "print to console".
It runs on most systems with BPF and tracepoints enabled. I probably
missed some events; as a result, the script triggers the "reactor" a lot.

=============== 8< ======================
[root@ ~]# cat wwnr.bt
/*
 * task_state[pid]
 *   not_running = 1
 *   running     = 2
 */
tracepoint:sched:sched_switch
{
    if (args->prev_state == 0x0001 /* TASK_INTERRUPTIBLE */) {
        /* after first suspension */
        @task_state[args->prev_pid] = 1;
    } else {
        if (@task_state[args->prev_pid] == 1) {
            printf("Something wrong, call reactor\n");
        }
        @task_state[args->prev_pid] = 1;
    }
    @task_state[args->next_pid] = 2;
}

tracepoint:sched:sched_wakeup
{
    if (@task_state[args->pid] == 2) {
        printf("Something wrong, call reactor\n");
    }
    @task_state[args->pid] = 2;
}

[root@ ~]# bpftrace wwnr.bt
<<<< some print >>>>
=============== 8< ======================

Does this (BPF for RV) make any sense?

Thanks,
Song