Re: Instrumentation and RCU

From: Mathieu Desnoyers
Date: Tue Mar 10 2020 - 13:22:48 EST


----- On Mar 10, 2020, at 12:49 PM, paulmck paulmck@xxxxxxxxxx wrote:

> On Tue, Mar 10, 2020 at 11:13:27AM -0400, Mathieu Desnoyers wrote:
>>
>>
>> ----- On Mar 9, 2020, at 4:47 PM, paulmck paulmck@xxxxxxxxxx wrote:
>> [...]
>>
>> >
>> > Suppose that we had a variant of RCU that had about the same read-side
>> > overhead as Preempt-RCU, but which could be used from idle as well as
>> > from CPUs in the process of coming online or going offline? I have not
>> > thought through the irq/NMI/exception entry/exit cases, but I don't see
>> > why that would be a problem.
>> >
>> > This would have explicit critical-section entry/exit code, so it would
>> > not be any help for trampolines.
>> >
>> > Would such a variant of RCU help?
>> >
>> > Yeah, I know. Just what the kernel doesn't need, yet another variant
>> > of RCU...
>>
>> Hi Paul,
>>
>> I think that before introducing yet another RCU flavor, it's important
>> to take a step back and look at the tracer requirements first. If those
>> end up being covered by currently available RCU flavors, then why add
>> another?
>
> Well, we have BPF requirements as well.
>
>> I can start with a few use-cases I have in mind. Others should feel free
>> to pitch in:
>>
>> Tracing callsite context:
>>
>> 1) Thread context
>>
>> 1.1) Preemption enabled
>>
>> One tracepoint in this category is syscall enter/exit. We should introduce
>> a variant of tracepoints relying on SRCU for this use-case so we can take
>> page faults when fetching userspace data.
>
> Agreed, SRCU works fine for the page-fault case, as the read-side memory
> barriers are in the noise compared to page-fault overhead. Back in
> the day, there were light-weight system calls. Are all of these now
> converted to VDSO or similar?

There is a big difference between allowing page faults to happen and expecting
page faults to happen every time. I suspect many use-cases will end up having
a fast path that touches user-space data already in the page cache, and that
triggers page faults only on rare occasions.

Therefore, this might justify an SRCU variant with a low-overhead read side.
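
To make the argument concrete, here is a rough cost model in C. It is only a
sketch: the struct, the function names and any cycle counts plugged into it
are hypothetical and not measured on any kernel. It computes the average cost
of a traced event when only a fraction of events fault, and the share of that
cost spent in the read-side primitives.

```c
#include <assert.h>

/* Hypothetical per-event costs in cycles; illustrative only, not measured. */
struct probe_cost {
	double read_side;	/* SRCU read lock + unlock, including barriers */
	double copy_fast;	/* fetch of user data that is already resident */
	double fault;		/* extra cost when the fetch takes a page fault */
};

/* Average cost per traced event when a fraction fault_prob of events fault. */
static double avg_event_cost(const struct probe_cost *c, double fault_prob)
{
	assert(fault_prob >= 0.0 && fault_prob <= 1.0);
	return c->read_side + c->copy_fast + fault_prob * c->fault;
}

/* Fraction of the average per-event cost spent in the read-side primitives. */
static double read_side_share(const struct probe_cost *c, double fault_prob)
{
	return c->read_side / avg_event_cost(c, fault_prob);
}
```

With made-up costs of 40, 60 and 10000 cycles, read_side_share() is below 0.01
when every event faults (fault_prob = 1.0), but 0.4 when no event faults
(fault_prob = 0.0). The read-side barriers are in the noise only in the first
case.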

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com