Re: [PATCH v4 00/10] riscv: Allow userspace to directly access perf counters

From: Atish Patra
Date: Tue Jul 18 2023 - 14:45:37 EST


On Tue, Jul 18, 2023 at 10:06 AM Rémi Denis-Courmont <remi@xxxxxxxxxx> wrote:
>
> Hi,
>
> On Tuesday, 18 July 2023, 2:22:54 EEST, Atish Patra wrote:
> > > AFAIK, if the default settings break user space, the patchset is
> > > considered to break user space. That being the case, either this case is
> > > deemed special enough that breaking user space is OK, or it is not.
>
> > This case is a special case as the usage was incorrect in the first
> > place.
>
> I agree that it's not only insecure but also incorrect. However it mostly
> works. In fact I don't disagree with the change as such, but I think that the
> commit messages are misleading and confusing. For a start, in one place it
> says that it is not breaking user space and in another it says basically the
> opposite.
>

Agreed. We will improve the commit message to clarify that. That's also the
reason I started this whole thread :)

> (Unfortunately, not everybody agrees with the change. I can't seem to get
> FFmpeg's checkasm tool fixed:
> http://ffmpeg.org/pipermail/ffmpeg-devel/2023-July/312245.html )
>

Why can't rdtime (the equivalent of rdtsc) be used instead of rdcycle?
What does it use on x86? x86 also doesn't allow reading the cycle counter
by default.

The perf syscall overhead is a one-time setup cost at the start of the
application.
For counting the cycles before/after a loop, perf still provides
direct CSR access in user mode.
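
To make the setup-once, read-directly pattern concrete, here is a minimal
sketch. The helper names (cycle_counter_open, cycle_counter_read) are
hypothetical; only perf_event_open(2) and the perf mmap page fields are real
ABI. It assumes that on RISC-V a reported index of 1 corresponds to the cycle
CSR, and it falls back to read(2) whenever the kernel does not grant direct
access or the build is not for 64-bit RISC-V.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper type: not part of any kernel or libc API. */
struct cycle_counter {
    int fd;                             /* perf event fd, -1 on failure */
    struct perf_event_mmap_page *page;  /* NULL if the mmap failed */
};

/* One-time setup: this is where the syscall overhead is paid. */
static struct cycle_counter cycle_counter_open(void)
{
    struct cycle_counter c = { -1, NULL };
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.exclude_kernel = 1;            /* count user-mode cycles only */
    attr.exclude_hv = 1;

    /* pid = 0, cpu = -1: follow the calling thread on any CPU. */
    c.fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
    if (c.fd < 0)
        return c;

    void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_READ,
                   MAP_SHARED, c.fd, 0);
    if (p != MAP_FAILED)
        c.page = p;
    return c;
}

/* Hot path: no syscall when the kernel grants direct counter access. */
static uint64_t cycle_counter_read(const struct cycle_counter *c)
{
    uint64_t value = 0;

#if defined(__riscv) && __riscv_xlen == 64
    volatile struct perf_event_mmap_page *pc = c->page;
    if (pc) {
        uint32_t seq, idx, width;
        uint64_t raw = 0;
        int64_t offset;
        int direct;

        do {                            /* seqlock: retry if the page changed */
            seq = pc->lock;
            __atomic_thread_fence(__ATOMIC_SEQ_CST);
            idx = pc->index;
            offset = pc->offset;
            width = pc->pmc_width;
            /* Assumption: index 1 means counter 0, i.e. the cycle CSR. */
            direct = pc->cap_user_rdpmc && idx == 1 &&
                     width >= 1 && width <= 64;
            if (direct)
                __asm__ volatile("rdcycle %0" : "=r"(raw));
            __atomic_thread_fence(__ATOMIC_SEQ_CST);
        } while (pc->lock != seq);

        if (direct) {
            /* Sign-extend the raw value from the counter width. */
            int64_t ext = (int64_t)(raw << (64 - width)) >> (64 - width);
            return (uint64_t)(offset + ext);
        }
    }
#endif
    /* Fallback: one syscall per read. */
    if (c->fd >= 0 &&
        read(c->fd, &value, sizeof(value)) != (ssize_t)sizeof(value))
        value = 0;
    return value;
}
```

A benchmark would call cycle_counter_open() once at startup and then
cycle_counter_read() before and after the loop being measured.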

> Also this is not the first time somebody argues that an API should be removed
> because it's broken. That's not special.
>
> > Any application that genuinely requires rdcycle can always get
> > it now via the perf interface.
>
> Sure. But the question is whether it breaks user space and if so, whether
> that's acceptable. Existing apps that call RDCYCLE will now fail, presumably
> receive SIGILL(?).
>

Yes. With this change, it will receive SIGILL if the default is NO ACCESS.
You can change the sysctl parameter to enable direct access and make it
work, though.
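
As a rough illustration, an application could check the current setting
before deciding whether to execute rdcycle directly. This sketch assumes the
sysctl is exposed as /proc/sys/kernel/perf_user_access and that its values
mean 0 (no access), 1 (access via perf) and 2 (legacy direct access); both
the path and the value meanings should be checked against the final version
of the series. The function names are hypothetical.

```c
#include <stdio.h>

/* Assumed value meanings; verify against the merged documentation. */
static const char *perf_user_access_name(int mode)
{
    switch (mode) {
    case 0:  return "disabled";
    case 1:  return "perf";
    case 2:  return "legacy";
    default: return "unknown";
    }
}

/* Returns the sysctl value, or -1 if it is missing or unreadable. */
static int read_perf_user_access(void)
{
    /* Assumed path: not every architecture or kernel provides this file. */
    FILE *f = fopen("/proc/sys/kernel/perf_user_access", "r");
    int mode = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

Only the "legacy" setting would let an unmodified binary execute rdcycle
without going through perf first.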

> > If the insecure and incorrect behavior is allowed, we suspect the user
> > space behavior will never be fixed as it is hard to put a future flag
> > date in these cases.
>
> For better or worse, I can only agree there. But then adding an option to
> preserve the broken behaviour is begging for people to (ab)use it, and indeed
> never fix the problem, and never be able to remove the option.
>

x86 still carries that option. So I think once we go down that path, it
will be very difficult to remove it.

> > > If it is not OK, then the only way out that I can think of, consists of
> > > trapping and emulating the counters, returning the same sanitised values
> > > that Linux perf would return. Then you can add a kernel config option to
> > > disable that trap-and-emulation code in the future.
> > What do you mean by "sanitised" value ?
>
> I mean whatever avoids creating a security issue. Presumably report the number
> of cycles spent in the calling thread and in user mode since the first time
> that the process called RDCYCLE?
>
> Maybe it's not reasonable for complexity or performance reasons, but then IMO,
> it deserves a little bit better explaining in the commit message.
>

Yes. I believe the complexity and throwaway code (assuming we should
stop doing that in the long run)
are not worth it given that we have a perfectly valid interface via
perf without any performance sacrifice.
RISC-V is not the first architecture to do this. It is disabled by default
on ARM64/x86 as well.

If the application usage were legal and we had years of software
development relying on it, it might have
made sense (e.g. x86 legacy usage). However, RISC-V is still young enough to
avoid those pitfalls.

> --
> 雷米‧德尼-库尔蒙
> http://www.remlab.net/
>
>
>


--
Regards,
Atish