Re: [PATCH v5 15/24] x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()

From: Reinette Chatre
Date: Thu Aug 24 2023 - 19:06:00 EST


Hi James,

On 8/24/2023 9:56 AM, James Morse wrote:
> Hi Reinette,
>
> On 09/08/2023 23:37, Reinette Chatre wrote:
>> On 7/28/2023 9:42 AM, James Morse wrote:
>>> Depending on the number of monitors available, Arm's MPAM may need to
>>> allocate a monitor prior to reading the counter value. Allocating a
>>> contended resource may involve sleeping.
>>>
>>> add_rmid_to_limbo() calls resctrl_arch_rmid_read() for multiple domains;
>>> the allocation should be valid for all domains.
>>>
>>> __check_limbo() and mon_event_count() each make multiple calls to
>>> resctrl_arch_rmid_read(); to avoid extra work on contended systems,
>>> the allocation should remain valid across multiple invocations of
>>> resctrl_arch_rmid_read().
>>>
>>> Add arch hooks for this allocation, which need calling before
>>> resctrl_arch_rmid_read(). The allocated monitor is passed to
>>> resctrl_arch_rmid_read(), then freed again afterwards. The helper
>>> can be called on any CPU, and can sleep.
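
To make sure I have the shape of the API right, this is what I understand
the hooks to look like (just a sketch, with the signatures approximated
from this series):

	void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, int evtid);
	void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid,
				       void *arch_mon_ctx);

	/* The allocated context is threaded through the counter read: */
	int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
				   u32 closid, u32 rmid,
				   enum resctrl_event_id eventid, u64 *val,
				   void *arch_mon_ctx);
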
>
>> Looking at the error paths, all the errors are silent failures.
>
> Yeah, I don't really expect this to ever fail. The memory arm64 needs to allocate is
> smaller than a pointer - if that fails, I think there are bigger problems. The hardware
> resource is something the call will wait for.
>
> As you note, it's hard to propagate an unlikely error back from here.
>
>
>> On the
>> failure in mon_event_read() this could potentially be handled by setting
>> the "err" field in struct rmid_read ... at least then the caller can print
>> an error instead of displaying a zero count to the user.
>
> Sure, that covers the one a human being might see.

Right.
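
Something like this in mon_event_read() is what I had in mind (just a
sketch; I am assuming the context helper names and an arch_mon_ctx
member in struct rmid_read from this series):

	rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
	if (IS_ERR(rr->arch_mon_ctx)) {
		/* Let the caller print an error instead of a zero count. */
		rr->err = -EINVAL;
		return;
	}

	/* ... existing read of the counter ... */

	resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);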

>> The other failures are harder to handle though.
>
> I don't think the silent failure is such a bad thing. For the limbo handler, no RMID moves
> between the lists until the handler is able to make progress.

OK, so we need to ensure that the limbo handler is still rescheduled
when such a failure is encountered.
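
If I follow the existing flow, the current re-arm check should cover
that, as long as a failed allocation leaves the RMIDs on the limbo list
(a sketch of the tail of cqm_handle_limbo(); has_busy_rmid() and the
delayed work are the existing x86 code):

	__check_limbo(d, false);

	/*
	 * If __check_limbo() bailed out early because no monitor
	 * context could be allocated, the RMIDs stay on the limbo
	 * list, so this existing check still re-arms the handler.
	 */
	if (has_busy_rmid(r, d))
		schedule_delayed_work_on(cpu, &d->cqm_limbo, delay);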

> For the overflow handler, it's possible an overflow will get missed (I still have an
> overflow interrupt I can use here). But I don't think this will be the biggest problem on
> a machine that is struggling to allocate 4 bytes.

As I now (I think) better understand, for MPAM it is 4 bytes of memory as
well as the reservation of a hardware resource. Could something go wrong
when attempting to find an available hardware resource, which, as you
state later, is indeed scarce? I wonder if it would not be helpful to at
least have resctrl log an error from the places where it is not possible
to propagate one.
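
For example, in __check_limbo() (a sketch; the ratelimited message is
just a suggestion):

	arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, QOS_L3_OCCUP_EVENT_ID);
	if (IS_ERR(arch_mon_ctx)) {
		pr_warn_ratelimited("Failed to allocate monitor context: %ld\n",
				    PTR_ERR(arch_mon_ctx));
		return;
	}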

>> Considering that these contexts are allocated and
>> freed so often, why not allocate them once (perhaps in struct rdt_hw_domain?)
>> on driver load with clear error handling?
>
> Because the resource they represent is scarce. You may have 100 control or monitor groups,
> but only 10 hardware monitors. The hardware monitor has to be allocated and programmed
> before it can be read.

I think I misunderstood what "context" is when I wrote the above. I
was thinking about memory allocation that can be done early and
neglected to connect the "context" to an actual hardware resource.

> This works well for the llc_occupancy counter, but not for bandwidth counters, which with
> the current 'free running' ABI have to all be allocated and programmed at the beginning of
> time. If there are enough monitors to do that, the MPAM driver will, and these
> allocate/free calls will just look up the pre-allocated/pre-programmed monitor.
> Doing the allocation like this keeps that logic in the mpam driver, and allows concurrent
> access to resctrl_arch_rmid_read(), which is something any future PMU support will need.
>
> I don't have any numbers on how many monitors any platform is going to have, but I'm
> confident there are some that won't have enough for each control-group or monitor-group to
> have one.

Right. My question was not relevant to what this change does. Sorry for the noise.

Reinette