Re: [PATCH 1/2] kernfs: add kernfs_ops.free operation to free resources tied to the file

From: Suren Baghdasaryan
Date: Wed Jun 28 2023 - 16:13:18 EST


On Wed, Jun 28, 2023 at 11:42 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Jun 28, 2023 at 11:18:20AM -0700, Suren Baghdasaryan wrote:
> > On Wed, Jun 28, 2023 at 11:02 AM Tejun Heo <tj@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Jun 28, 2023 at 07:35:20PM +0200, Christian Brauner wrote:
> > > > > To summarize my understanding of your proposal, you suggest adding new
> > > > > kernfs_ops for the case you marked (1) and change ->release() to do
> > > > > only (2). Please correct me if I misunderstood. Greg, Tejun, WDYT?
> > > >
> > > > Yes. I can't claim to know all the intricate implementation details of
> > > > kernfs ofc but this seems sane to me.
> > >
> > > This is going to be massively confusing for vast majority of kernfs users.
> > > The contract kernfs provides is that you can tell kernfs that you want out
> > > and then you can do so synchronously in a finite amount of time (you still
> > > have to wait for in-flight operations to finish but that's under your
> > > control). Adding an operation which outlives that contract as something
> > > usual to use is guaranteed to lead to obscure future crashes. For a
> > > temporary fix, it's fine as long as it's marked clearly but please don't
> > > make it something seemingly widely useable.
> > >
> > > We have a long history of modules causing crashes because of this. The
> > > severing semantics is not there just for fun.
> >
> > I'm sure there are reasons things are working as they do today. Sounds
> > like we can't change the ->release() logic from what it is today...
> > Then the question is how do we fix this case needing to release a
> > resource which can be released only when there are no users of the
> > file? My original suggestion was to add a kernfs_ops operation which
> > would indicate there are no more users but that seems to be confusing.
> > Are there better ways to fix this issue?
>
> Just make sure that you really only remove the file when all users are
> done with it? Do you have control of that from the driver side?

I'm a bit confused. In my case it's not a driver, it's the cgroup
subsystem and the issue is not that we are removing the file while
there are other users. The issue is that kernfs today has no operation
which is called when the last user is gone. I need such an operation
to be able to free the resources knowing that no users are left.

>
> But, why is this kernfs file so "special" that it must have this special
> construct? Why not do what all other files that handle polling do and
> just remove and get out of there when done?

AFAIU all other files that handle polling rely on f_op->release()
being called after all the users are gone, so they can safely
free their resources. However, kernfs can call ->release() while there
are still active users of the file. I can't use that operation for
resource cleanup, so I was suggesting adding a new operation
which would be called only after the last fput() and would guarantee
no users. Again, I'm not an expert in this, so there might be a better
way to handle it. Please advise.
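
To make the distinction concrete, here is a rough userspace model of the
two lifetimes. It is only a sketch: every name in it (model_file,
model_open, model_remove, model_fput) is hypothetical and none of it is
actual kernfs code. It assumes the release-style hook runs once, either at
file removal or at the last reference drop, whichever comes first, while
the proposed free-style hook runs only when the last reference is gone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are NOT kernfs or VFS APIs. */
struct model_file {
	int refs;              /* open references, dropped by model_fput() */
	bool released;         /* the ->release()-like hook has run */
	bool freed;            /* the proposed ->free()-like hook has run */
	int users_at_release;  /* references still held when release ran */
};

/* Run the release-style hook at most once, recording who is still around. */
static void model_release_once(struct model_file *f)
{
	if (!f->released) {
		f->released = true;
		f->users_at_release = f->refs;
	}
}

/* A user opens the file and takes a reference. */
static void model_open(struct model_file *f)
{
	f->refs++;
}

/* File removal: release runs now, even if references remain. */
static void model_remove(struct model_file *f)
{
	model_release_once(f);
}

/* Reference drop: the free-style hook runs only when refs reaches zero. */
static void model_fput(struct model_file *f)
{
	assert(f->refs > 0);
	if (--f->refs == 0) {
		model_release_once(f);  /* no-op if removal already ran it */
		f->freed = true;        /* safe point to free per-file resources */
	}
}
```

In this model, removing the file with two opens outstanding records
users_at_release == 2, which is the situation where freeing a resource
from the release-style hook would be unsafe; freed only becomes true
after the second model_fput().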
Thanks,
Suren.

>
> thanks,
>
> greg k-h