Re: [PATCH RESEND v3 13/17] EDAC/mc: Add MC unique index allocation procedure

From: Serge Semin
Date: Wed Oct 12 2022 - 18:30:53 EST


On Wed, Oct 12, 2022 at 10:33:35PM +0200, Borislav Petkov wrote:
> On Wed, Oct 12, 2022 at 11:01:54PM +0300, Serge Semin wrote:
> > The unified approach makes code indeed more readable in the platform
> > drivers and safer since they didn't have to bother with more coding.
> > See for instance the drivers with the static variable-based IDs
> > allocation.
>
> Which drivers? Concrete examples please.

See below.

>
> > Have you read it yourself?
>
> Yes. I even have improved it over the years.
>
> > Here is a short excerpt from there:
> > "Once the problem is established, describe what you are actually doing
> > about it in technical detail. It's important to describe the change
> > in plain English for the reviewer to verify that the code is behaving
> > as you intend it to."
>
> Maybe that part can be misunderstood: "describe what you're doing about
> it". That doesn't mean the text should explain what you're adding and
> how stuff is defined: "It's defined by the EDAC_AUTO_MC_NUM macro." I
> can see that from the diff.
>
> So let me try to explain to you what I'm expecting from commit messages
> in the EDAC tree:
>
> The commit message should explain *why* a change is being done so that,
> months, years from now, when you've gone on to do something else, people
> doing git archeology can actually figure out *why* this change was done.
>
> And the explanation in that commit message should be *complete* and
> should contain *all* necessary information to understand why this change
> was done.

The level of completeness is relative to each person. In all the
years I've been submitting patches to the kernel I can't even
remember the last request to elaborate on my logs. By no means do I
want to say they were perfect. I could just have been too immersed in
the problem and so thought that the provided text was descriptive
enough, especially for the subsystem maintainer. So simply asking for
more details would have been more than enough.

>
> Your commit message is not explaining the problem.
>
> "In case of the unique index allocation it's not that optimal to always
> rely on the low-level device drivers (platform drivers)"
>
> That's your statement. That needs to have exact details so that people
> can look at that commit message, look at the code which *you* point them
> to in it and go, aha, that is the problem.
>
> "because they get to start to implement either the same design pattern
> (for instance global static MC counter) or may end-up with having
> non-unique index eventually at runtime."
>
> Who are they, exact pointers please.

So you need more details. You should have just asked. I can't read
your mind after all. IMO the description was detailed enough to
understand the problem. Anyway, as I already said, the current MC
indexing approach isn't that optimal (it always relies on the
low-level driver to allocate the index) because it causes the same ID
allocation pattern to be re-implemented in the drivers. The most
obvious example is the drivers with the static variable-based ID
allocation. It doesn't seem like these drivers care about the order
in which the DDR devices are detected. If so, then the automatic ID
allocation is perfect for them: they can just pass the
EDAC_AUTO_MC_NUM id to the edac_mc_alloc() method and drop the static
variable-based pattern, thus getting smaller and more readable driver
code. Moreover, the variable increment isn't atomic, so the ID
allocation algorithm there is prone to races should the device probes
run concurrently.
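
To illustrate the point, here is a minimal user-space sketch (not the
actual patch and not any real driver's code) contrasting the static
counter pattern with a centralized allocator. All the demo_* names and
the DEMO_MAX_MC limit are hypothetical. In the kernel the centralized
part would presumably be built on the IDA helpers (ida_alloc() and
ida_free()) rather than on a hand-rolled bitmap.

```c
#include <stdatomic.h>

#define DEMO_MAX_MC 32	/* hypothetical limit, for this sketch only */

/* Pattern re-implemented in drivers: a plain static counter. */
static int demo_static_mc_idx;

static int demo_static_alloc(void)
{
	/*
	 * Non-atomic read-modify-write: two concurrent probes may
	 * both read the same value and get the same index. Freed
	 * indexes are never reused either.
	 */
	return demo_static_mc_idx++;
}

/* Centralized alternative: one atomic bitmap of used indexes. */
static atomic_uint demo_used_mask;

static int demo_auto_alloc(void)
{
	for (int id = 0; id < DEMO_MAX_MC; id++) {
		unsigned int bit = 1u << id;

		/* fetch_or returns the old mask: the bit was free if clear */
		if (!(atomic_fetch_or(&demo_used_mask, bit) & bit))
			return id;
	}

	return -1;	/* no free index left */
}

static void demo_auto_free(int id)
{
	/* Clear the bit so the index can be handed out again */
	atomic_fetch_and(&demo_used_mask, ~(1u << id));
}
```

With the second variant a driver no longer carries its own counter,
and an index released on device removal can be handed out again.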

Last but not least, there is currently no way to assign the
controller IDs by means of the DTS file. The suggested patch provides
such functionality by means of the DT aliases.
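
A rough user-space sketch of the intended selection order follows. It
is an illustration only, not the code from the patch: the sketch_*
names, SKETCH_MAX_MC, SKETCH_NO_ALIAS and the value picked for
SKETCH_AUTO_MC_NUM (standing in for EDAC_AUTO_MC_NUM) are all
assumptions. The alias_id argument models what a lookup of the "mc"
alias (for instance via of_alias_get_id()) would return.

```c
#include <stdbool.h>

#define SKETCH_AUTO_MC_NUM	(-1)	/* hypothetical stand-in for EDAC_AUTO_MC_NUM */
#define SKETCH_NO_ALIAS		(-1)	/* hypothetical "no DT alias found" marker */
#define SKETCH_MAX_MC		32	/* hypothetical limit */

static bool sketch_used[SKETCH_MAX_MC];

/* Reserve one exact index; fail if out of range or already taken. */
static int sketch_reserve(int id)
{
	if (id < 0 || id >= SKETCH_MAX_MC || sketch_used[id])
		return -1;

	sketch_used[id] = true;
	return id;
}

static int sketch_mc_alloc_idx(int requested, int alias_id)
{
	/* The driver asked for a fixed index: keep the old behaviour. */
	if (requested != SKETCH_AUTO_MC_NUM)
		return sketch_reserve(requested);

	/* An "mc<N>" alias is defined in the DTS: use N. */
	if (alias_id != SKETCH_NO_ALIAS)
		return sketch_reserve(alias_id);

	/* Otherwise take the next free index. */
	for (int id = 0; id < SKETCH_MAX_MC; id++)
		if (!sketch_used[id])
			return sketch_reserve(id);

	return -1;
}
```

So a driver passing the auto value gets the alias index if one exists
and the lowest free index otherwise, while drivers passing an explicit
index are unaffected.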

If this describes the problem better, I'll add the text to the patch
log on the next patchset re-spin.

>
> "The suggested implementation is based on the kernel IDA infrastructure
> exposed by the lib/idr.c driver with API described in linux/idr.h header
> file."
>
> That doesn't matter one bit for the change you're doing. You could have
> added it under the "---" line.

Ok. I'll drop it from the log.

>
> "A new special MC index is introduced here. It's defined by the
> EDAC_AUTO_MC_NUM macro with a value specifically chosen as the least
> probable value used for the real MC index. In case if the EDAC_AUTO_MC_NUM
> index is specified by the EDAC LLDD, the MC index will be either retrieved
> from the MC device OF-node alias index ("mc[:number:]") or automatically
> generated as the next-free MC index found by the ID allocation procedure."
>
> Some of that paragraph should go over the function as a comment - not in
> the commit message as it pertains to what the function does and it would
> make a *lot* more sense there when someone tries to figure out what the
> function does instead of in the commit message.
>

IMO the function isn't complex enough to warrant a comment there. The
semantics can be easily inferred from the implementation.

> So, I'm still not convinced why do some EDAC drivers need unique MC
> identifiers, why the current scheme doesn't work and where it doesn't
> work.

Once again: I didn't say it didn't work. I said it wasn't optimal.
Though in some circumstances it can misbehave for some drivers.
Please see my response above.

-Sergey

>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette