Re: [RFC PATCH v2 11/35] arch_topology: Make register_cpu_capacity_sysctl() tolerant to late CPUs

From: Jonathan Cameron
Date: Thu Nov 30 2023 - 11:46:19 EST


On Fri, 20 Oct 2023 14:44:35 +0100
"Russell King (Oracle)" <linux@xxxxxxxxxxxxxxx> wrote:

> On Thu, Sep 14, 2023 at 01:01:26PM +0100, Jonathan Cameron wrote:
> > On Wed, 13 Sep 2023 16:37:59 +0000
> > James Morse <james.morse@xxxxxxx> wrote:
> >
> > > register_cpu_capacity_sysctl() adds a property to sysfs that describes
> > > the CPUs capacity. This is done from a subsys_initcall() that assumes
> > > all possible CPUs are registered.
> > >
> > > With CPU hotplug, possible CPUs aren't registered until they become
> > > present, (or for arm64 enabled). This leads to messages during boot:
> > > | register_cpu_capacity_sysctl: too early to get CPU1 device!
> > > and once these CPUs are added to the system, the file is missing.
> > >
> > > Move this to a cpuhp callback, so that the file is created once
> > > CPUs are brought online. This covers CPUs that are added late by
> > > mechanisms like hotplug.
> > > One observable difference is the file is now missing for offline CPUs.
> > >
> > > Signed-off-by: James Morse <james.morse@xxxxxxx>
> > > ---
> > > If the offline CPUs thing is a problem for the tools that consume
> > > this value, we'd need to move cpu_capacity to be part of cpu.c's
> > > common_cpu_attr_groups.
> >
> > I think we should do that anyway and then use an is_visible() if we want to
> > change whether it is visible in offline cpus.
> >
> > Dynamic sysfs file creation is horrible - particularly when done
> > from an totally different file from where the rest of the attributes
> > are registered. I'm curious what the history behind that is.
> >
> > Whilst here, why is there a common_cpu_attr_groups which is
> > identical to the hotpluggable_cpu_attr_groups in base/cpu.c?
>
> Looking into doing this, the easy bit is adding the attribute group
> with an appropriate .is_visible dependent on cpu_present(), but we
> need to be able to call sysfs_update_groups() when the state of the
> .is_visible() changes.
Hi Russell,

Sorry, somehow I missed this completely until you referred back to it :(

This is pretty much what I was thinking so thanks for doing it.

>
> Given the comment in sysfs_update_groups() about "if an error occurs",
> rather than making this part of common_cpu_attr_groups, would it be
> better that it's part of its own set of groups, thus limiting the
> damage from a possible error? I suspect, however, that any error at
> that point means that the system is rather fatally wounded.
>
> This is what I have so far to implement your idea, less the necessary
> sysfs_update_groups() call when we need to change the visibility of
> the attributes.

Fwiw (and I think you shouldn't add this to the critical path for your
main series for obvious reasons), I think you are right that it makes
sense to do this in a separate group, but that if we were going to see
an error I'd 'hope' we shouldn't see anything that hasn't occurred
when groups were originally added. Maybe that's overly optimistic.

Sorry again for lack of reply before now and thanks for pointing this
out. I'd love to see this posted after the ARM vCPU HP stuff is in.

Jonathan


>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 9ccb7daee78e..06c9fc6620d2 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -215,43 +215,24 @@ static ssize_t cpu_capacity_show(struct device *dev,
> return sysfs_emit(buf, "%lu\n", topology_get_cpu_scale(cpu->dev.id));
> }
>
> -static void update_topology_flags_workfn(struct work_struct *work);
> -static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
> -
> static DEVICE_ATTR_RO(cpu_capacity);
>
> -static int cpu_capacity_sysctl_add(unsigned int cpu)
> -{
> - struct device *cpu_dev = get_cpu_device(cpu);
> -
> - if (!cpu_dev)
> - return -ENOENT;
> -
> - device_create_file(cpu_dev, &dev_attr_cpu_capacity);
> -
> - return 0;
> -}
> -
> -static int cpu_capacity_sysctl_remove(unsigned int cpu)
> +static umode_t cpu_present_attrs_visible(struct kobject *kobi,
> + struct attribute *attr, int index)
> {
> - struct device *cpu_dev = get_cpu_device(cpu);
> -
> - if (!cpu_dev)
> - return -ENOENT;
> -
> - device_remove_file(cpu_dev, &dev_attr_cpu_capacity);
> + struct device *dev = kobj_to_dev(kobj);
> + struct cpu *cpu = container_of(dev, struct cpu, dev);
>
> - return 0;
> + return cpu_present(cpu->dev.id) ? attr->mode : 0;
> }
>
> -static int register_cpu_capacity_sysctl(void)
> -{
> - cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "topology/cpu-capacity",
> - cpu_capacity_sysctl_add, cpu_capacity_sysctl_remove);
> +const struct attribute_group cpu_capacity_attr_group = {
> + .is_visible = cpu_present_attrs_visible,
> + .attrs = cpu_capacity_attrs
> +};
>
> - return 0;
> -}
> -subsys_initcall(register_cpu_capacity_sysctl);
> +static void update_topology_flags_workfn(struct work_struct *work);
> +static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
>
> static int update_topology;
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index a19a8be93102..954b045705c2 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -192,6 +192,9 @@ static const struct attribute_group crash_note_cpu_attr_group = {
> static const struct attribute_group *common_cpu_attr_groups[] = {
> #ifdef CONFIG_KEXEC
> &crash_note_cpu_attr_group,
> +#endif
> +#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
> + &cpu_capacity_attr_group,
> #endif
> NULL
> };
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index e117c06e0c6b..745ad21e3dc8 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -30,6 +30,8 @@ struct cpu {
> struct device dev;
> };
>
> +extern const struct attribute_group cpu_capacity_attr_group;
> +
> extern void boot_cpu_init(void);
> extern void boot_cpu_hotplug_init(void);
> extern void cpu_init(void);
>