Re: WMI probe failure when reprobing

From: Armin Wolf
Date: Wed Jul 26 2023 - 05:37:01 EST


Am 26.07.23 um 10:01 schrieb Hans de Goede:

Hi,

On 7/25/23 20:01, Armin Wolf wrote:
Am 25.07.23 um 17:07 schrieb Hans de Goede:

Hi Armin,

On 7/22/23 02:09, Armin Wolf wrote:
Hello,

i just noticed that under some circumstances, ACPI WMI devices might fail to reprobe
when being manually unbound and later rebound.
Example:

1. ACPI WMI device #1 binds and registers WMI device with GUID
"05901221-D566-11D1-B2F0-00A0C9062910", resulting in the device
being named "05901221-D566-11D1-B2F0-00A0C9062910".
2. ACPI WMI device #2 binds and registers WMI device with GUID
"05901221-D566-11D1-B2F0-00A0C9062910", resulting in the device
being named "05901221-D566-11D1-B2F0-00A0C9062910-1".
3. ACPI WMI device #1 is manually unbound and later rebound,
now the WMI device with GUID "05901221-D566-11D1-B2F0-00A0C9062910"
is being named "05901221-D566-11D1-B2F0-00A0C9062910-1" too, since
device naming depends on the number of GUIDs currently known to
the WMI subsystem.
4. A WMI device named "05901221-D566-11D1-B2F0-00A0C9062910-1" already
exists, causing the registration of the new WMI device to fail.

I thought about some possible ways to solve this naming issue:

- symlinks to simulate old WMI devices names, new WMI device names similar to "wmiX" with X being a global unique id
- no symlinks, new WMI device names similar to "wmiX" with X being a global unique id
- use global id instead of GUID number

The first approach has full sysfs backward compatibility but i do not know how to create symlinks inside the "devices"
directory. The second approach is the easiest and cleanest one, but provides no sysfs backward compatibility. The last
approach provides only limited sysfs backward compatibility and only for programs which can handle "<GUID>-X" WMI device
names.

Currently, there is one single stable sysfs ABI entry concerning the WMI subsystem (for wmi-bmof), and two testing
sysfs ABI entries (dell-wmi-privacy, sbl-fw-update). I do not know of any userspace programs relying on these ABIs,
but i suspect there might be a couple of scripts which might be affected.

Which approach should i take to solve this problem?
The standard approach to get reliable unique ids in the kernel is to use
something like this:

#include <linux/idr.h>

static DEFINE_MUTEX(ida_lock);

struct guid_data {
    guid_t guid;
    struct ida ida;
    struct list_head list;
};

int guid_init() {
    ida_init(&guid_data->ida);
}

int wmi_create_device()
{
    int index;
    ...
    mutex_lock(&ida_lock);
    index = ida_alloc(&guid_data->ida, GFP_KERNEL);
    mutex_unlock(&ida_lock);
    if (index < 0)
        return index;

    // store index for use on acpi_wmi_remove
    wmi_block->index = index;
    // use index to generate name, don't add -%d for index==0
    ...
}

static void wmi_dev_release(struct device *dev)
{
         struct wmi_block *wblock = dev_to_wblock(dev);

    mutex_lock(&ida_lock);
    ida_free(&guid_data->ida, wblock->index);
    mutex_unlock(&ida_lock);
         kfree(wblock);
}


This is going to need a linked-list of struct guid_data
structs and a new wmi_get_guid_data() function which
takes a new global mutex to protect the list and then
first walks that list looking for a guid match

If nothing is found kzalloc a new struct, init
the ida struct and add it to the list before releasing
the mutex protecting the list.

At the end of wmi_get_guid_data() return the found
or created struct guid_data or NULL on kzalloc error.

And in wmi_create_device() and wmi_dev_release()
use this to get a struct_guid_data matching the wblock
GUID so that we have 1 ida struct per GUID.

I would not worry about releasing the struct guid_data
if somehow the last wblock with that GUID disappears
chances are we are going to need it again soon and
the ida id-array will be empty then so we will start
with a clean-slate if we just re-use the old one
when a new wblock with the same GUID shows up again.

###

Not the prettiest with the need to have a new linked
lists of structs to get a per GUID ida, but it nicely
matches how most subsystems do this so I think it is
best.

I hope this small sketch of what a solution for this
could look like is useful.

Regards,

Hans

Would it be feasible to use the duplicate GUID allowlist instead?
Yes that is a good idea, if you make it non const you can just
store the ida struct there. You'll likely need to add some macro
to init the entries then which also inits the ida struct.

Since the issue does only exists on GUIDs which can be duplicated, only WMI devices handling those
need a unique id after the GUID, the rest keeps the classic WMI device name.
ack.

This would also allow for individual WMI drivers to provide backwards compatibility in case userspace
needs the old WMI device name for the WMI devices they control. If a WMI driver knows that userspace
can handle the new WMI device name for his GUIDs, then they can just add them to the allowlist.
For backward compat I would still omit the "-%d" suffix for the wblock device which gets index 0, that should still be the same one as before.

Regards,

Hans


There might be a misunderstanding here, i meant to propose to avoid this one-ida-for-each-guid thing and instead
change the WMI device naming depending on whether the GUID is inside the allowlist:

- WMI driver cannot handle duplicate GUIDs and userspace requires old WMI device naming -> GUID not in allowlist, old WMI device name
- WMI driver cannot handle duplicate GUIDs and userspace does not require old WMI device naming -> GUID not in allowlist, old WMI device name
- WMI driver can handle duplicate GUIDs and userspace does require old WMI device naming -> GUID not in allowlist, old WMI device name
- WMI driver can handle duplicate GUIDs and userspace does not require old WMI device naming -> GUID in allowlist, new WMI device name

The old WMI device name would be a simple GUID, while the new WMI device name would consist of the GUID and an unique id, which is always present.
This would allow us to avoid another list of GUIDs while making sure that backwards compatibility for userspace is preserved.

This approach is simpler and faster than the list-based approach, and it would allow us to turn the allowlist into and denylist in the future.
What do you think of this approach?

Armin Wolf