Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush

From: Tycho Andersen
Date: Tue Aug 29 2023 - 13:42:51 EST


On Mon, Aug 21, 2023 at 05:31:48PM +0200, Miklos Szeredi wrote:

(Apologies for the delay, I have been away without cell signal for
some time.)

> > I think the idea is that they're saving snapshots of their own threads
> > to the fs for debugging purposes.
>
> This seems a fairly special situation. Have they (whoever they may
> be) thought about fixing this in their server?

Sorry, "we" here is an internal team at my employer, Netflix. We can't
use imap clients on our corporate e-mails, whee.

> > Whether this is a sane thing to do or not, it doesn't seem like it
> > should deadlock pid ns destruction.
>
> True. So the suggested solution is to allow wait_event_killable() to
> return if a terminal signal is pending in the exiting state and only
> in that case turn the flush into a background request? That would
> still allow for regressions like the one reported, but that would be
> much less likely to happen in real life. Okay, I said this for the
> original solution as well, so this may turn out to be wrong as well.

I wonder if there's room here for a completion that doesn't use the
wait primitives. Something like an atomic + queuing in task_work()
would both fix this bug and not exhibit this regression, IIUC.

> Anyway, I'd prefer if this was fixed in the server code, as it looks
> fairly special and adding complexity to the kernel for this case might
> not be justifiable. But I'm also open to suggestions on fixing this
> in the kernel in a not too complex manner.

I don't think this is specific to the server-accessing-its-own-file
case. My reproducer uses that only because I didn't fully understand
the bug at the time. I believe that *any* task that is killed with an
inflight fuse request will exhibit this. We have also seen it, though
fairly rarely, on another fuse fs we use throughout the fleet, lxcfs
(https://github.com/lxc/lxcfs), which doesn't really do anything
strange and is mounted from the host's pid ns.

Tycho