RE: [PATCH v15-RFC 0/8] Add support for Sub-NUMA cluster (SNC) systems

From: Luck, Tony
Date: Mon Feb 12 2024 - 11:53:03 EST


> >> I actually had specific points that this response also ignores.
> >> Let me repeat and highlight the same points:
> >>
> >> 1) You claim that this series "removes the need for separate domain
> >> lists" ... but then this series does just that (create a separate
> >> domain list), but in an obfuscated way (duplicate the resource to
> >> have the monitoring domain list in there).
> >
> > That was poorly worded on my part. I should have said "removes the
> > need for separate domain lists within a single rdt_resource".
> >
> > Adding an extra domain list to a resource may be the start of a slippery
> > slope. What if there is some additional "L3"-like resctrl operation that
> > acts at the socket level (Intel has made products with multiple L3
> > instances per socket before). Would you be OK add a third domain
> > list to every struct rdt_resource to handle this? Or would it be simpler
> > to just add a new rdt_resource structure with socket scoped domains?
>
> This should not be about what is simplest to patch into current resctrl.

I wanted to offer this in case Boris also thought that the previous version
was too much churn to support an obscure Intel-only (so far) feature.

But if you are going to Nack this new version on the grounds that it muddies
the water about usage of the rdt_domain structure, then I will abandon it.

> There is no need to support a new domain list for a new scope. The domain
> lists support the functionality: control or monitoring. If control has
> socket scope the existing implementation supports that.
> If there is another operation supported by a resource apart from
> control or monitoring then we can consider how to support it when
> we know what it is. That would also be a great point to decide if
> the same data structure should just grow to support an operation that
> not all resources may support. That may depend on the amount of data
> needed to support this hypothetical operation.
>
> >
> >> 2) You claim this series "reduces amount of code churn", but this is
> >> because this series keeps using the same original data structures
> >> for separate monitoring and control usages. The previous series made
> >> an effort to separate the structures for the different usages
> >> but this series does not. What makes it ok in this series to
> >> use the same data structures for different usages?
> >
> > Legacy resctrl has been using the same rdt_domain structure for both
> > usages since the dawn of time. So it has been OK up until now.
>
> This is not the same.
>
> Legacy resctrl uses the same data structure in the same list for both control
> and monitoring usages so it is fine to have both monitoring and control data
> in the data structure.
>
> What you are doing in both solutions is to place the same data structure
> in separate lists for control and monitoring usages. In the one list only the
> control data is used, on the other only the monitoring data is used.
>
> >> Additionally:
> >>
> >> Regarding "Vast amounts of that just added "_mon" or "_ctrl" to structure
> >> or variable names." ... that is because the structures are actually split,
> >> no? It is not just renaming for unnecessary churn.
> >
> > Perhaps not "unnecessary" churn. But certainly a lot of code change for
> > what I perceive as very little real gain.
>
> ok. There may be little gain wrt saving space. One complication with
> this single data structure is that its content may only be decided based
> on which list it is part of. It should be obvious to developers when
> which members are valid. Perhaps this can be addressed with clear
> documentation of the data structures.
>
> >
> >> What is the benefit of keeping the data structures to be shared
> >> between monitor and control usages?
> >
> > Benefit is no code changes. Cost is continuing to waste memory with
> > structures that are slightly bigger than they need to be.
> >
> >> If there is a benefit to keeping these data structures, why not just
> >> address this aspect in previous solution?
> >
> > The previous solution evolved to splitting these structures. But this
> > happened incrementally (remember that at an early stage the monitor
> > structures all got the "_mon" addition to their names, but the control
> > structures kept the original names). Only when I got to the end of this
> > process did I look at the magnitude of the change.
>
> Not answering my question.

I'm not exactly sure what "aspect" you thought could be addressed in the
previous series. But the point is moot now. This diversion from the
series has come to a dead end, and I hope that Boris will look at v14
(either before the next group of ARM patches, or after).

-Tony