Re: Upcoming: Notifications, FS notifications and fsinfo()

From: Miklos Szeredi
Date: Mon Apr 06 2020 - 05:22:47 EST


On Fri, Apr 3, 2020 at 5:01 PM Lennart Poettering <mzxreary@xxxxxxxxxxx> wrote:
>
> On Fr, 03.04.20 13:48, Miklos Szeredi (miklos@xxxxxxxxxx) wrote:
>
> > > > Does that make any sense?
> > >
> > > When all mounts in the init mount namespace are unmounted and all
> > > remaining processes killed we switch root back to the initrd, so that
> > > even the root fs can be unmounted, and then we disassemble any backing
> > > complex storage if there is, i.e. lvm, luks, raid, …
> >
> > I think it could be done the other way round, much simpler:
> >
> > - switch back to initrd
> > - umount root, keeping the tree intact (MNT_DETACH)
> > - kill all remaining processes, wait for all to exit
>
> Nah. What I wrote above is drastically simplified. It's IRL more
> complex. Specific services need to be killed between certain mounts
> are unmounted, since they are a backend for another mount. NFS, or
> FUSE or stuff like that usually has some processes backing them
> around, and we need to stop the mounts they provide before these
> services, and then the mounts these services reside on after that, and
> so on. It's a complex dependency tree of stuff that needs to be done
> in order, so that we can deal with arbitrarily nested mounts, storage
> subsystems, and backing services.

That still doesn't explain why you need to keep track of all mounts in
the system.

If you are aware of the dependency, then you need to keep track of
that particular mount. If not, then why?
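
To make the ordering constraint from the quoted text concrete, here is
a minimal sketch of it as a topological sort. It is an illustration
only, not systemd's implementation, and the unit names ("/mnt/fuse",
"fuse-daemon", "/usr") are hypothetical. Only the mounts that appear
in a known dependency are in the graph.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def teardown_order(must_stop_first):
    """Return one valid teardown order.

    must_stop_first maps each unit (a mount point or a service) to the
    set of units that have to be stopped or unmounted before it.
    """
    return list(TopologicalSorter(must_stop_first).static_order())


# Hypothetical example: a FUSE mount is served by a daemon whose binary
# lives on /usr, which in turn sits on the root fs.
example = {
    "fuse-daemon": {"/mnt/fuse"},  # unmount the FUSE mount before killing its backend
    "/usr": {"fuse-daemon"},       # stop the daemon before unmounting what it runs from
    "/": {"/usr"},                 # root goes last
}

if __name__ == "__main__":
    print(teardown_order(example))
```

A mount that nothing in the graph refers to does not show up here at
all, which is the case the question above is about.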

What I'm starting to see is that there's a fundamental conflict
between how systemd people want to deal with new mounts and how some
other people want to use mounts (i.e. tens of thousands of mounts in
an automount map).

I'm really curious how much the mount notification ring + per mount
query (any implementation) can help that use case.

> Anyway, this all works fine in systemd, the dependency logic is
> there. We want a more efficient way to watch mounts, that's
> all. Subscribing and constantly reparsing /proc/self/mountinfo is
> awful, that's all.

I'm not sure that is all. To handle storms of tens of thousands of
mounts, my guess is that the fundamental way of dealing with these
changes will need to be updated in systemd.
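
For reference, the "subscribe and reparse" pattern under discussion
looks roughly like the sketch below. It assumes the documented
behaviour that poll(2) on /proc/self/mountinfo reports POLLPRI/POLLERR
when the mount table changes; the function names are made up for
illustration and this is not systemd's code. Note that every wakeup
rereads and reparses the whole table, so N mounts arriving one by one
cost on the order of N^2 line parses.

```python
import select


def parse_mountinfo(text):
    """Return {mount_id: (parent_id, mount_point, fstype)} for each line."""
    mounts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Fields before " - " are per-mount, fields after are per-superblock.
        left, _, right = line.partition(" - ")
        fields = left.split()
        fstype = right.split()[0]
        mounts[int(fields[0])] = (int(fields[1]), fields[4], fstype)
    return mounts


def diff_mounts(old, new):
    """Return (added_ids, removed_ids) between two parsed tables.

    Simplification: a mount whose ID stays the same but whose
    attributes change is not reported.
    """
    return sorted(new.keys() - old.keys()), sorted(old.keys() - new.keys())


def watch(path="/proc/self/mountinfo"):
    """Yield (added_ids, removed_ids) after each mount table change."""
    with open(path) as f:
        state = parse_mountinfo(f.read())
        poller = select.poll()
        poller.register(f.fileno(), select.POLLPRI | select.POLLERR)
        while True:
            poller.poll()   # blocks until the kernel flags a change
            f.seek(0)       # rewind and reread the *entire* table
            fresh = parse_mountinfo(f.read())
            yield diff_mounts(state, fresh)
            state = fresh
```

The event itself carries no information about what changed, so the
diff against the previous snapshot is the only way to find out.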

Thanks,
Miklos