Re: [RFC] situation with fput() locking (was Re: [PULL REQUEST] :ima-appraisal patches)

From: Al Viro
Date: Fri Apr 20 2012 - 04:09:46 EST

On Thu, Apr 19, 2012 at 07:58:57PM -0700, Linus Torvalds wrote:
> On Thu, Apr 19, 2012 at 7:54 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Umm... ?I really wonder if we *want* filp_close() under any kind of
> > locks. ?You are right - it should not be deferred. ?I haven't finished
> > checking the callers of that puppy, but if we really do it while holding
> > any kind of lock, we are asking for trouble. ?So I'd rather switch
> > filp_close() to use of fput_nodefer() if that turns out to be possible.
> Ok, fair enough, looks like a reasonable plan to me.

Actually, scratch that; I think I have a better variant
* switch to fget_light/fput_light where possible; it's not needed
for the rest, but is useful anyway
* move the guts of filp_close() (everything prior to fput() it does
in the end) into a new helper, turning filp_close() into a couple of calls
(inlined, at that). Equivalent transformation.
* add fput_nodefer(); does the same thing fput() does now. Make
fput() defer the call of __fput(). Add a filp_close_nodefer() - same as
filp_close(), but the second call in it is fput_nodefer(), not fput(). And
switch the callers that are known to be done outside of any locks to
filp_close_nodefer(). Switch accept4()/socketpair() to fput_nodefer()
(for freshly created struct file in case of cleanup on failure exit).

Note that we do *not* need to bother with fput_light() - even if it does
fput(), that fput() won't usually be the final one. We'd need it to
race with close() or dup2() from another thread for that to happen. So
we only need to review the callers of filp_close() and there are much
fewer of those.

We also get something else out of that - AFAICS, the kludge in __scm_destroy()
can be killed after that. We did it to prevent recursion on fput(), right?
Now that recursion will be gone...

I'd probably not bother with trying to be clever in doing the deferral itself -
the point is to make it rare, so it's not a hot path anyway. We can play
with per-CPU lists, etc., but in this case dumber is better.

Comments? I'm half-asleep right now, so I might be missing something
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at