Re: A udev rule to serve the change event of ACPI container?

From: joeyli
Date: Wed Jun 28 2017 - 23:57:58 EST


Hi YASUAKI,

Thanks for your response.

On Wed, Jun 28, 2017 at 03:53:16PM -0400, YASUAKI ISHIMATSU wrote:
>
> On 06/26/2017 02:26 AM, joeyli wrote:
> > Hi all,
> >
> > If ACPI receives an ejection request for an ACPI container, the kernel
> > emits a KOBJ_CHANGE uevent when it finds online child devices
> > below the ACPI container.
> >
> > Based on the description of kernel patch caa73ea15, user space
> > is expected to offline all devices below the container and the
> > container itself. Then, user space can finalize the removal of
> > the container with the help of its ACPI device object's eject
> > attribute in sysfs.
> >
> > That means that the kernel relies on user space to perform the offline
> > and ejection jobs for the ACPI container and its child devices. The
> > discussion is here:
> > https://lkml.org/lkml/2013/11/28/520
> >
> > The mail loop didn't explain why the userspace is responsible for
> > the whole container offlining. Is it possible to do that transparently
> > from the kernel? What's the difference between offlining memory and
> > processors which happens without any cleanup and container which
> > does essentially the same except it happens at once?
>
> We don't know what devices are mounted on the container device. I think
> the devices mounted on the container device differ between vendors' servers.
>
> If a memory device is mounted on the container, memory offline easily fails.
> Other devices may have other concerns. So the following udev rule you
> wrote does not work correctly.
>

IMHO, if memory hot-remove (offline/ejection) has a problem, then we
should report the issue and fix it in the mm subsystem. Michal Hocko is
working hard on this. I think the same applies to the CPU and IO subsystems.

The current kernel cannot complete the container hot-remove job without
userspace's involvement. So I sent the udev rule as an example of how to
respond to the change uevents.
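
For illustration only, the userspace sequence described in the caa73ea15
changelog (offline the children, offline the container, then write to the
ACPI device object's eject attribute) could be sketched roughly as follows.
This is not the udev rule itself. The function names and the way the sysfs
directories are passed in are my assumptions for the sketch; how a helper
finds the child devices of a given container is deliberately left out.

```python
import os


def set_offline(dev_dir):
    """Write 0 to <dev_dir>/online; return True if the device ends up offline.

    Assumption: dev_dir is the sysfs directory of a device that may expose
    an 'online' attribute (CPU, memory block, container, ...).
    """
    path = os.path.join(dev_dir, "online")
    if not os.path.exists(path):
        return True  # no 'online' attribute, nothing to offline
    with open(path) as f:
        if f.read().strip() == "0":
            return True  # already offline
    try:
        with open(path, "w") as f:
            f.write("0")
    except OSError:
        return False  # the kernel refused, e.g. memory offline failed
    with open(path) as f:
        return f.read().strip() == "0"


def offline_and_eject(children, container_dir, acpi_dir):
    """Offline all children, then the container, then request ejection.

    children:      sysfs directories of the devices below the container
    container_dir: sysfs directory of the container device
    acpi_dir:      sysfs directory of the container's ACPI device object
    All three are hypothetical inputs supplied by the caller.
    """
    for child in children:
        if not set_offline(child):
            return False  # stop early, do not eject a busy container
    if not set_offline(container_dir):
        return False
    with open(os.path.join(acpi_dir, "eject"), "w") as f:
        f.write("1")
    return True
```

If any child refuses to go offline, the sketch stops before touching the
eject attribute, so the container stays in place and the failure can be
reported against the subsystem that refused.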

> I think we need to change offline processing for each device. So currently
> the userspace is responsible for the whole container offlining.
>

It depends on what the expectation of the device offline function in
the kernel is. If a subsystem supports offline in the kernel, then it
should not affect running user space applications. Otherwise the issue
should be fixed.

Thanks a lot!
Joey Lee