[RFC PATCH V2 1/1] rasdaemon: Fix poll() on per_cpu trace_pipe_raw blocks indefinitely

From: shiju.jose
Date: Sat Feb 04 2023 - 14:34:02 EST


From: Shiju Jose <shiju.jose@xxxxxxxxxx>

The error events are not received in the rasdaemon since kernel 6.1-rc6.
This issue is firstly detected and reported, when testing the CXL error
events in the rasdaemon.

Debugging showed, poll() on trace_pipe_raw in the ras-events.c do not
return and this issue is seen after the commit
42fb0a1e84ff525ebe560e2baf9451ab69127e2b ("tracing/ring-buffer: Have
polling block on watermark").

This also verified using a test application for poll()
and select() on trace_pipe_raw.

There is also a bug reported on this issue,
https://lore.kernel.org/all/31eb3b12-3350-90a4-a0d9-d1494db7cf74@xxxxxxxxxx/

This issue occurs for the per_cpu case, which calls the
ring_buffer_poll_wait(), in kernel/trace/ring_buffer.c, with the
buffer_percent > 0 and then wait until the percentage of pages are
available.The default value set for the buffer_percent is 50 in the
kernel/trace/trace.c. However poll() does not return even met the percentage
of pages condition.

As a fix, rasdaemon set buffer_percent as 0 through the
/sys/kernel/debug/tracing/instances/rasdaemon/buffer_percent, then the
task will wake up as soon as data is added to any of the specific cpu
buffer and poll() on per_cpu/cpuX/trace_pipe_raw does not block
indefinitely.

Dependency on the kernel RFC patch
tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw

Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>

Changes:
RFC V1 -> RFC V2
1. Rename the patch header subject.
2. Changes for the backward compatability to the old kernels.
---
ras-events.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/ras-events.c b/ras-events.c
index 3691311..e505a0e 100644
--- a/ras-events.c
+++ b/ras-events.c
@@ -383,6 +383,8 @@ static int read_ras_event_all_cpus(struct pthread_data *pdata,
int warnonce[n_cpus];
char pipe_raw[PATH_MAX];
int legacy_kernel = 0;
+ int fd;
+ char buf[10];
#if 0
int need_sleep = 0;
#endif
@@ -402,6 +404,26 @@ static int read_ras_event_all_cpus(struct pthread_data *pdata,
return -ENOMEM;
}

+ /* Fix for poll() on the per_cpu trace_pipe and trace_pipe_raw blocks
+ * indefinitely with the default buffer_percent in the kernel trace system,
+ * which is introduced by the following change in the kernel.
+ * https://lore.kernel.org/all/20221020231427.41be3f26@xxxxxxxxxxxxxxxxxx/T/#u.
+ * Set buffer_percent to 0 so that poll() will return immediately
+ * when the trace data is available in the ras per_cpu trace pipe_raw
+ */
+ fd = open_trace(pdata[0].ras, "buffer_percent", O_WRONLY);
+ if (fd >= 0) {
+ /* For the backward compatabilty to the old kernel, do not return
+ * if fail to set the buffer_percent.
+ */
+ snprintf(buf, sizeof(buf), "0");
+ size = write(fd, buf, strlen(buf));
+ if (size <= 0)
+ log(TERM, LOG_WARNING, "can't write to buffer_percent\n");
+ close(fd);
+ } else
+ log(TERM, LOG_WARNING, "Can't open buffer_percent\n");
+
for (i = 0; i < (n_cpus + 1); i++)
fds[i].fd = -1;

--
2.25.1