Re: [PATCH v5 15/24] x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()

From: James Morse
Date: Thu Aug 24 2023 - 12:57:58 EST


Hi Reinette,

On 09/08/2023 23:37, Reinette Chatre wrote:
> On 7/28/2023 9:42 AM, James Morse wrote:
>> Depending on the number of monitors available, Arm's MPAM may need to
>> allocate a monitor prior to reading the counter value. Allocating a
>> contended resource may involve sleeping.
>>
>> add_rmid_to_limbo() calls resctrl_arch_rmid_read() for multiple domains,
>> the allocation should be valid for all domains.
>>
>> __check_limbo() and mon_event_count() each make multiple calls to
>> resctrl_arch_rmid_read(), to avoid extra work on contended systems,
>> the allocation should be valid for multiple invocations of
>> resctrl_arch_rmid_read().
>>
>> Add arch hooks for this allocation, which need calling before
>> resctrl_arch_rmid_read(). The allocated monitor is passed to
>> resctrl_arch_rmid_read(), then freed again afterwards. The helper
>> can be called on any CPU, and can sleep.

> Looking at the error paths all the errors are silent failures.

Yeah, I don't really expect this to ever fail. The memory arm64 needs to allocate is
smaller than a pointer - if that fails, I think there are bigger problems. The hardware
resource is something the call will wait for.

As you note, it's hard to propagate an unlikely error back from here.


> On the
> failure in mon_event_read() this could potentially be handled by setting
> the "err" field in struct rmid_read ... at least then the caller can print
> an error instead of displaying a zero count to the user.

Sure, that covers the one a human being might see.


> The other failures are harder to handle though.

I don't think the silent failure is such a bad thing. For the limbo handler, no RMID moves
between the lists until the handler is able to make progress.
For the overflow handler, its possible an overflow will get missed (I still have an
overflow interrupt I can use here). But I don't think this will be the biggest problem on
a machine that is struggling to allocate 4 bytes.


> Considering that these contexts are allocated and
> freed so often, why not allocate them once (perhaps in struct rdt_hw_domain?)
> on driver load with clear error handling?

Because the resource they represent is scarce. You may have 100 control or monitor groups,
but only 10 hardware monitors. The hardware monitor has to be allocated and programmed
before it can be read.
This works well for the llc_occupancy counter, but not for bandwidth counters, which with
the current 'free running' ABI have to all be allocated and programmed at the beginning of
time. If there are enough monitors to do that - the MPAM driver will, and these
allocate/free calls will just be looking up the pre-allocated/pre-programmed monitor.
Doing the allocation like this keeps that logic in the mpam driver, and allows concurrent
access to resctrl_arch_rmid_read(), which is something any future PMU support will need.

I don't have any numbers how many monitors any platform is going to have, but I'm
confident there are some that won't have enough for each control-group or monitor-group to
have one.


Thanks,

James