Re: [PATCH v9 7/8] kallsyms: add /proc/kallmodsyms for text symbol disambiguation

From: Nick Alcock
Date: Mon Nov 14 2022 - 11:59:07 EST


On 13 Nov 2022, Luis Chamberlain said:

> On Wed, Nov 09, 2022 at 01:41:31PM +0000, Nick Alcock wrote:
>> This helps disambiguate symbols with identical names when some are in
>> built-in modules and some are not, but if symbols are still ambiguous,
>> {object file names} are added as needed to disambiguate them.
>
> *Why* would we ever want to trouble ourselves with expanding all this
> data into the kernel? The commit log makes a poor effort to describe
> any value-add doing this could ever have.

Er... the cover letter says:

> The whole point of symbols is that their names are unique: you can look up a
> symbol and get back a unique address, and vice versa. Alas, because
> /proc/kallsyms (rightly) reports all symbols, even hidden ones, it does not
> really satisfy this requirement. Large numbers of symbols are duplicated
> many times (just search for __list_del_entry!), and while usually these are
> just out-of-lined things defined in header files and thus all have the same
> implementation, it does make it needlessly hard to figure out which one is
> which in stack dumps, when tracing, and such things. Right now the kernel
> has no way at all to tell these apart, and nor has the user: their address
> differs and that's all. Which module did they come from? Which object
> file? We don't know. Figuring out which is which when tracing needs a
> combination of guesswork and luck. In discussions at LPC it became clear
> that this is not just annoying me but Steve Rostedt and others, so it's
> probably desirable to fix this.

This *is* the motivation. Previous iterations of this series only added
module names, but that doesn't disambiguate all symbols, and only
*partially* disambiguating symbols isn't really much use. If all symbols
can be completely unambiguously identified (via a triple of (name,
module, translation unit)) and mapped to a single address, you can be
sure that you can unambiguously cite a single such triple and get a
single address back, and vice versa: e.g. trace output could finally
give you names that you could be sure came from one specific place, and
thus often with one particular caller, even if that symbol appears in
fifty different places in the kernel with callers in fifty different
translation units that do quite different things.
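
To make the triple idea concrete, here is a rough sketch in Python. It
does not parse /proc/kallmodsyms (the format is still open to change);
it only shows why a name alone is ambiguous while a (name, module,
translation unit) triple is not. Every address, module name and object
file name below is invented for illustration.

```python
from collections import defaultdict

# Hypothetical symbol records: (address, name, module, translation unit).
# None of these values come from a real kernel; they are made up.
SYMBOLS = [
    (0xffffffff81001000, "__list_del_entry", "vmlinux", "fs/inode.o"),
    (0xffffffff81002000, "__list_del_entry", "vmlinux", "mm/slab.o"),
    (0xffffffff81003000, "__list_del_entry", "ext4", "fs/ext4/super.o"),
    (0xffffffff81004000, "schedule", "vmlinux", "kernel/sched/core.o"),
]


def index_by_name(symbols):
    """Map each bare name to every address carrying it (may be many)."""
    by_name = defaultdict(list)
    for addr, name, _module, _unit in symbols:
        by_name[name].append(addr)
    return dict(by_name)


def index_by_triple(symbols):
    """Map each (name, module, unit) triple to exactly one address."""
    by_triple = {}
    for addr, name, module, unit in symbols:
        key = (name, module, unit)
        if key in by_triple:
            # A repeated triple would mean the disambiguation failed.
            raise ValueError(f"triple is not unique: {key}")
        by_triple[key] = addr
    return by_triple


by_name = index_by_name(SYMBOLS)
by_triple = index_by_triple(SYMBOLS)

# Because the triples are unique, the mapping can be inverted:
# one address gives back exactly one triple.
by_addr = {addr: key for key, addr in by_triple.items()}
```

Looking up "__list_del_entry" in by_name yields three candidate
addresses, whereas by_triple and by_addr give a one-to-one mapping in
both directions, which is the property a tracer would want.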

(Plus, with notational additions in tracers, you could in future use
this to trace, say, only *one* instance of __list_del_entry, rather than
being forced to either trace all of them or none, or guess which entry
was which and do a tiresome binary search of repeated traces to get the
right one after lots of trials.)

(And also it's not actually that much data any more: 10KiB or so. :) )

I can add some of this to the commit log too if you like. (As noted in
earlier messages -- which you haven't yet had time to read -- I was
trying to keep that sort of duplication down, perhaps unwisely.)

>> I am not wedded to the name or format of /proc/kallmodsyms, but felt it
>> best to split it out of /proc/kallsyms to avoid breaking existing
>> kallsyms parsers.
>
> I'd like much more review from parties other than Oracle on this then.

Well, yes. That's what these postings are all about. If I was supposed
to get review from someone else as well, I'm happy to add those people
to the Cc: of future iterations.