Re: [PATCH v2] RAS: Fix the trace_show() function to output trace_count

From: Borislav Petkov
Date: Mon Oct 17 2022 - 15:40:25 EST


On Mon, Oct 17, 2022 at 04:09:23PM +0000, Luck, Tony wrote:
> Agreed. It needs the user to interpret the answer. The filename would lead
> them to think "1" means the daemon is active, but it's actually just a count
> of how many times the file is concurrently open (which includes the
> "cat" process reading the file).

Yap, exactly.


> Should have thought of this earlier ... changing user space semantics
> is hard.

AFAIR, at the time we cared only about there being at least one
consumer... thus the binary test, is there at least one or not:

	if (!ras_userspace_consumers()) {
		print_extlog_rcd(NULL, tmp, cpu);
		goto out;
	}


> How about:
>
> seq_printf(m, "%d\n", atomic_read(&trace_count) - 1);
>
> with a comment that users reading the file only want to know if anyone
> else has it open?

Yeah, doesn't work either:

# tail -f /sys/kernel/debug/ras/daemon_active &
[1] 3019
1
tail: /sys/kernel/debug/ras/daemon_active: file truncated
1
# cat /sys/kernel/debug/ras/daemon_active
2
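
IOW, the counter cannot tell a real consumer from any other opener. A
minimal userspace sketch of that, not the kernel code: open_count and the
model_*() helpers are made-up stand-ins for trace_count and for the open,
release and show paths of the debugfs file.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's trace_count. */
static atomic_int open_count;

/* Every open of the file bumps the count, no matter who the opener is. */
static void model_open(void)
{
	atomic_fetch_add(&open_count, 1);
}

/* Every close drops it again. */
static void model_release(void)
{
	atomic_fetch_sub(&open_count, 1);
}

/* The binary test: is there at least one opener or not? */
static bool model_has_consumers(void)
{
	return atomic_load(&open_count) != 0;
}

/* What a reader would see if the raw count were printed. */
static int model_show_raw(void)
{
	return atomic_load(&open_count);
}

/* What a reader would see with the proposed "- 1" for the reader itself. */
static int model_show_minus_one(void)
{
	return atomic_load(&open_count) - 1;
}
```

With one background opener (think tail -f) plus the reader itself,
model_show_raw() gives 2 and model_show_minus_one() gives 1, although no
RAS daemon is involved at all. And model_has_consumers() is already true
with that single background opener.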



We really need something to say, "I really am a RAS events consumer and
not some random file opener."

OTOH, if one does that on one's system, then one has herself to blame
when errors don't get logged and disappear. I mean, why would one even
do that?!

Then again, I've seen weirder stuff so...

Question is, what is your goal with this?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette