Re: [PATCH v2] tracing: Fix wasted memory in saved_cmdlines logic

From: Steven Rostedt
Date: Mon Feb 12 2024 - 12:23:51 EST


On Tue, 13 Feb 2024 00:40:38 +0900
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx> wrote:

> > Now, instead of saving only 128 comms by default, by using this wasted
> > space at the end of the structure it can save over 8000 comms and even
> > saves space by removing the need for allocating the other array.
>
> This looks good to me. So this will allocate the saved_cmdlines in page-size
> array instead of the power of two.

Once the size is greater than page-size, it still is a power of two ;-)

But I think you meant that it will be a reminder of extra data, and not a
power of two. Looking at the code, it didn't need to be a power of two as
the saved names were just in a ring buffer anyway. The recording of a PID
must be a power of two, as that is masked.

Basically we have three arrays. One for the PIDS that hold the index into the
COMM array. The COMM array really has no limitation. The PID array is
PID_MAX_DEFAULT. But PIDs can be more than PID_MAX_DEFAULT as that's just
the "default" setting. User space can increase it. I need to add a comment
to how this works, as there are three arrays:

unsigned map_pid_to_cmdline[PID_MAX_DEFAULT + 1];

/* The next two are allocated based on cmdline_num size */
unsigned map_cmdline_to_pid[cmdline_num];
char saved_cmdlines[cmdline_num][TASK_COMM_LEN];

map_pid_to_cmdline[task->pid & (PID_MAX_DEFAULT - 1)] = index into the other two arrays;

[ Although I'm not sure I need that "+1" in the array. This code is ancient
and I need to re-examine it. ]

To get back to the PID, you need the map_cmdline_to_pid[idx] which contains
the full PID. Really, that array is only needed if pid max is greater than
PID_MAX_DEFAULT (which I find is the case for most distros now).

To assign a new comm, we basically do:

idx = map_pid_to_cmdline[pid & (PID_MAX_DEFAULT - 1)];
if (idx == -1) {
idx = // get next location in COMM ring buffer
map_pid_to_cmdline[pid & (PID_MAX_DEFAULT - 1)] = idx;
}
map_cmdline_to_pid[idx] = pid;
strcpy(saved_cmdlines[idx], task->comm);

And Mathieu pinged me about the non-commented use of that NO_CMDLINE_MAP
which is just -1. I have it as INT_MAX, but really maybe it should be -1U,
and add a comment around the memset().

>
> Acked-by: Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>

Thanks, but Linus already pulled this. But I have some patches to clean
this up a bit more.

-- Steve