Re: [RFC PATCH v3 0/4] Node Weights and Weighted Interleave

From: Michal Hocko
Date: Thu Nov 02 2023 - 05:30:58 EST


On Thu 02-11-23 14:21:49, Huang, Ying wrote:
> Michal Hocko <mhocko@xxxxxxxx> writes:
>
> > On Tue 31-10-23 12:22:16, Johannes Weiner wrote:
> >> On Tue, Oct 31, 2023 at 04:56:27PM +0100, Michal Hocko wrote:
> > [...]
> >> > Is there any specific reason for not having a new interleave interface
> >> > which defines weights for the nodemask? Is this because the policy
> >> > itself is very dynamic or is this more driven by simplicity of use?
> >>
> >> A downside of *requiring* weights to be paired with the mempolicy is
> >> that it's then the application that would have to figure out the
> >> weights dynamically, instead of having a static host configuration. A
> >> policy of "I want to be spread for optimal bus bandwidth" translates
> >> between different hardware configurations, but optimal weights will
> >> vary depending on the type of machine a job runs on.
> >
> > I can imagine this could be achieved by numactl(8) so that the process
> > management tool could set this up for the process at start up. Sure,
> > it wouldn't be very dynamic after that, and that is why I was asking
> > about how dynamic the situation might be in practice.
> >
> >> That doesn't mean there couldn't be usecases for having weights as
> >> policy as well in other scenarios, like you allude to above. It's just
> >> so far such usecases haven't really materialized or been spelled out
> >> concretely. Maybe we just want both - a global default, and the
> >> ability to override it locally. Could you elaborate on the 'get what
> >> you pay for' usecase you mentioned?
> >
> > This is more or less just an idea that came first to my mind when
> > hearing about bus bandwidth optimizations. I suspect that sooner or
> > later we will learn about usecases where the optimization function
> > maximizes not only bandwidth but also cost for that bandwidth. Consider
> > a hosting system serving different workloads, each paying for a
> > different QoS.
>
> I don't think a pure software solution can enforce the memory bandwidth
> allocation. For that, we will need something like MBA (Memory Bandwidth
> Allocation) as in the following URL,
>
> https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-memory-bandwidth-allocation.html
>
> At least, something like MBM (Memory Bandwidth Monitoring) as in the
> following URL will be needed.
>
> https://www.intel.com/content/www/us/en/developer/articles/technical/introduction-to-memory-bandwidth-monitoring.html
>
> The interleave solution helps cooperative workloads only.

Enforcement is an orthogonal thing IMO. We are talking about a best
effort interface.
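
To make the host-level weights idea discussed above concrete, here is a
minimal userspace sketch of weighted interleaving. It is an illustration
only, not the code from the patch series: the function name
weighted_interleave, the host_weights table, and the rule "place
`weight` consecutive pages on a node before moving to the next" are all
assumptions made for this example.

```python
from itertools import islice


def weighted_interleave(weights):
    """Yield node ids forever, visiting each node `weight` times per round.

    `weights` maps a node id to a positive integer weight (hypothetical
    host-wide configuration, not a real kernel interface).
    """
    # Nodes with a zero weight are skipped entirely in this sketch.
    nodes = [(node, w) for node, w in sorted(weights.items()) if w > 0]
    if not nodes:
        raise ValueError("at least one node needs a positive weight")
    while True:
        for node, weight in nodes:
            for _ in range(weight):
                yield node


# Hypothetical host configuration: node 0 gets three pages for every
# one page placed on node 1.
host_weights = {0: 3, 1: 1}

# Placement of the first eight pages of an interleaved allocation.
placement = list(islice(weighted_interleave(host_weights), 8))
print(placement)
```

Nothing here enforces bandwidth shares. A task only gets the ratio if it
allocates through the policy, which is the best-effort behaviour
described above.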

--
Michal Hocko
SUSE Labs