Re: [PATCH v2 4/9] perf affinity: Add infrastructure to save/restore affinity

From: Alexey Budankov
Date: Thu Oct 24 2019 - 04:46:16 EST


On 24.10.2019 1:37, Andi Kleen wrote:
> On Wed, Oct 23, 2019 at 09:08:47PM +0300, Alexey Budankov wrote:
>> On 23.10.2019 20:19, Andi Kleen wrote:
>>> On Wed, Oct 23, 2019 at 07:16:13PM +0300, Alexey Budankov wrote:
>>>>
>>>> On 23.10.2019 17:52, Andi Kleen wrote:
>>>>> On Wed, Oct 23, 2019 at 04:30:49PM +0200, Jiri Olsa wrote:
>>>>>> On Wed, Oct 23, 2019 at 06:02:35AM -0700, Andi Kleen wrote:
>>>>>>> On Wed, Oct 23, 2019 at 11:59:11AM +0200, Jiri Olsa wrote:
>>>>>>>> On Sun, Oct 20, 2019 at 10:51:57AM -0700, Andi Kleen wrote:
>>>>>>>>
>>>>>>>> SNIP
>>>>>>>>
>>>>>>>>> +}
>>>>>>>>> diff --git a/tools/perf/util/affinity.h b/tools/perf/util/affinity.h
>>>>>>>>> new file mode 100644
>>>>>>>>> index 000000000000..e56148607e33
>>>>>>>>> --- /dev/null
>>>>>>>>> +++ b/tools/perf/util/affinity.h
>>>>>>>>> @@ -0,0 +1,15 @@
>>>>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>>>>> +#ifndef AFFINITY_H
>>>>>>>>> +#define AFFINITY_H 1
>>>>>>>>> +
>>>>>>>>> +struct affinity {
>>>>>>>>> + unsigned char *orig_cpus;
>>>>>>>>> + unsigned char *sched_cpus;
>>>>>>>>
>>>>>>>> why not use cpu_set_t directly?
>>>>>>>
>>>>>>> Because it's too small in glibc (only 1024 CPUs) and perf already
>>>>>>> supports more.
>>>>>>
>>>>>> nice, we're using it all over the place.. how about using bitmap_alloc?
>>>>>
>>>>> Okay.
>>>>>
>>>>> The other places are mainly in perf record, from Alexey's recent affinity
>>>>> changes. These probably need to be fixed.
>>>>>
>>>>> +Alexey
>>>>
>>>> Although the issue does look generic for both stat and record modes,
>>>> have you already observed record startup overhead in any of your setups?
>>>> I would prefer to reproduce the overhead first, to have a stable use case
>>>> for evaluation and then, possibly, improvement.
>>>
>>> What I meant is that the cpu_set usages you added in
>>>
>>> commit 9d2ed64587c045304efe8872b0258c30803d370c
>>> Author: Alexey Budankov <alexey.budankov@xxxxxxxxxxxxxxx>
>>> Date: Tue Jan 22 20:47:43 2019 +0300
>>>
>>> perf record: Allocate affinity masks
>>>
>>> need to be fixed to allocate dynamically, or at least use MAX_NR_CPUS, to
>>> support systems with >1024 CPUs. That's an independent functionality
>>> problem.
>>
>> Oh, it is clear now. Thanks for pointing this out. To move from
>> cpu_set_t to the new custom struct affinity type, its API needs to be
>> extended with mask operations similar to the ones that cpu_set_t provides:
>> CPU_ZERO(), CPU_SET(), CPU_EQUAL(), CPU_OR().
>>
>> For example, it could be like: affinity__mask_zero(), affinity__mask_set(),
>> affinity__mask_equal(), affinity__mask_or(). Then the collecting part
>> of record could also be moved to the struct affinity type and overcome the
>> >1024 CPUs limitation.
>
> Not sure you need to use my library, except perhaps the get_cpu_set_size()
> function. It is somewhat specialized.

Ok, I see.

>
> For everything else you can use the normal Linux bitmap functions,
> or call the system call directly.

Thanks,
Alexey

>
> -Andi
>
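
For reference, a minimal sketch of reading the current affinity without
the fixed 1024-CPU cpu_set_t, assuming glibc's dynamically sized CPU set
macros (CPU_ALLOC() and friends) are acceptable. This is not code from
the patch series: read_affinity() and MAX_PROBE_CPUS are hypothetical
names used only for illustration, and the thread itself suggests bitmap
helpers or a direct system call as alternatives.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical upper bound so the probing loop always terminates. */
#define MAX_PROBE_CPUS (1 << 20)

/*
 * Hypothetical helper: read the calling thread's affinity into a
 * dynamically sized mask. On success returns the mask (release it with
 * CPU_FREE()) and stores its size in bytes in *setsize. Returns NULL on
 * allocation failure or an unexpected error.
 */
static cpu_set_t *read_affinity(size_t *setsize)
{
	int ncpus = 1024;	/* start at the static cpu_set_t capacity */

	while (ncpus <= MAX_PROBE_CPUS) {
		cpu_set_t *set = CPU_ALLOC(ncpus);

		if (!set)
			return NULL;

		*setsize = CPU_ALLOC_SIZE(ncpus);
		CPU_ZERO_S(*setsize, set);

		if (sched_getaffinity(0, *setsize, set) == 0)
			return set;

		CPU_FREE(set);

		/* EINVAL means the kernel's mask is larger than our buffer. */
		if (errno != EINVAL)
			return NULL;

		ncpus *= 2;
	}

	return NULL;
}
```

The returned mask and size can later be passed back to
sched_setaffinity(0, size, set) to restore the original affinity.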