Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after soft remove event

From: Prarit Bhargava
Date: Wed Sep 21 2016 - 09:39:44 EST

On 09/21/2016 09:04 AM, Borislav Petkov wrote:
> On Wed, Sep 21, 2016 at 07:39:31AM -0400, Prarit Bhargava wrote:
>> The information in /sys/devices/system/cpu/cpuX/topology
>> directory is useful for userspace monitoring applications and in-tree
>> utilities like cpupower & turbostat.
>>
>> When down'ing a CPU the /sys/devices/system/cpu/cpuX/topology directory is
>> removed during the CPU_DEAD hotplug callback in the kernel. The problem
>> with this model is that the CPU has not been physically removed and the
>> data in the topology directory is still valid. IOW, the cpu is still
>> present but the kernel has removed the topology directory making it
>> very difficult to determine exactly where the cpu is located.
>
> So I'm afraid I still don't understand what the problem here is.
>
> And the commit message of 2/2 doesn't make it any clearer. Can you
> please give a concrete example what the problem is and what you're
> trying to achieve.

Sorry, I'll try again....

Right now, to take thread 29 offline on your system you would do:

echo 0 > /sys/devices/system/cpu/cpu29/online

From the userspace side, this results in the removal of the

/sys/devices/system/cpu/cpu29/topology

directory.

This is not the right thing to do [1]. The topology directory should exist as
long as the thread is present in the system. The thread (and its core) is
still physically there; it is just not available to the scheduler. The
topology of the thread does not change when it is soft offlined this way.

turbostat was modified to deal with the missing topology directory, and the
in-tree utility cpupower prints out significantly less information when a
thread is offline. ISTR a powertop bug due to hotplug too. This makes these
monitoring utilities a problem for users who offline threads in order to run
only one thread per core.
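
To illustrate what userspace has to do today, here is a minimal Python sketch
of a reader that must tolerate the topology directory vanishing. It is not
taken from turbostat or cpupower; the function name read_topology and its
root parameter are made up for illustration. The only things it relies on
are the existing core_id and physical_package_id files in the topology
directory.

```python
from pathlib import Path

# Default sysfs location. The 'root' parameter below is hypothetical and
# exists only so the sketch can be pointed at a test directory.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_topology(cpu, root=SYSFS_CPU):
    """Return the package and core id of a CPU, or None if the
    cpuX/topology directory does not exist (e.g. the CPU is offline)."""
    topo = Path(root) / f"cpu{cpu}" / "topology"
    if not topo.is_dir():
        # This is the case every monitoring tool has to special-case today.
        return None
    result = {}
    for name in ("physical_package_id", "core_id"):
        result[name] = int((topo / name).read_text().strip())
    return result
```

With the current behaviour, read_topology(29) returns None as soon as cpu29
is offlined, even though the thread has not moved.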

The patchset does two things. The first patch unifies the topology.c and cpu.c
code. The second patch introduces a config option that ties the lifetime of
the topology directory to the thread's device struct, so the directory exists
as long as that struct exists in the device subsystem.

This now means that

echo 0 > /sys/devices/system/cpu/cpu29/online

will result in the thread's topology directory staying around until the struct
device associated with it is destroyed upon a physical socket hotplug event.
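
One rough way to observe the difference from userspace is to list the CPUs
that are offline but still have a topology directory. The Python sketch
below is only an illustration; the function name and the root parameter are
hypothetical, and it assumes that a cpuX directory without an 'online' file
cannot be offlined and therefore counts as online.

```python
import re
from pathlib import Path


def offline_cpus_with_topology(root="/sys/devices/system/cpu"):
    """Return the sorted numbers of CPUs whose 'online' file reads 0
    but whose topology directory is still present."""
    found = []
    for cpu_dir in Path(root).glob("cpu[0-9]*"):
        match = re.fullmatch(r"cpu(\d+)", cpu_dir.name)
        if match is None:
            continue
        online = cpu_dir / "online"
        # Assumption: no 'online' file means the CPU cannot be offlined.
        if not online.is_file() or online.read_text().strip() != "0":
            continue
        if (cpu_dir / "topology").is_dir():
            found.append(int(match.group(1)))
    return sorted(found)
```

Without the patchset this list should stay empty on x86 after offlining a
thread; with the new config option enabled, the offlined thread should show
up in it.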

This patchset will allow cleanups to turbostat, and make fixes to cpupower
*much* easier.

[1] I cannot say with any certainty that other arches do or do not require this
change. That is the only reason the change is restricted to x86 right now.

P.

>
> Thanks.
>