Re: [PATCH 2/3] enhanced syscall ESTALE error handling (v2)

From: Miklos Szeredi
Date: Mon Feb 04 2008 - 12:38:54 EST


> > In FUSE interrupts are sent to userspace, and the filesystem decides
> > what to do with them. So it is entirely possible and valid for a
> > filesystem to ignore an interrupt. If an operation was non-blocking
> > (such as one returning an error), then there would in fact be no
> > purpose in checking interrupts.
> >
> >
>
> Why do you think that it is valid to ignore pending signals?
> You seem to be asserting that it okay for processes to hang,
> uninterruptibly, when accessing files on fuse mounted file
> systems?
>
> Perhaps the right error to return when there is a signal
> pending is EINTR and not ESTALE or some other error? There
> has to be some way for the application to detect that its
> system call was interrupted due to a signal pending.

Traditionally a lot of filesystem related system calls are not
interruptible, and for good reason. For example what happens, if an
app receives a signal, while the filesystem is performing a rename()
request? It would be very confusing if the call returned EINTR, but
the rename would successfully complete regardless.

We had a related problem with the open(O_CREAT) call in fuse, which
was interruptible between the creation and the actual open because of
a design mistake. So it could return EINTR, after the file was
created, and this broke a real world application (don't have details
at hand, but could dig them out if you are interested).

I don't know what NFS does, but returning EINTR without actually
canceling an operation in the server is generally not a good idea.

> > So while sending a signal might reliably work in NFS to break out of
> > the loop, it does not necessarily work for other filesystems, and fuse
> > may not be the only one affected.
> >
> >
>
> Have you noticed another one? I would be happy to chat with the
> developers for that file system to see if this support would
> negatively impact them.

Oh, I have no idea. And I wouldn't want to do a full audit of all the
filesystems to find out. But if you do, please go ahead.

> > A few solutions come to mind, perhaps the best is to introduce a
> > kernel internal errno value (ERETRYSTALE), that forces the relevant
> > system calls to be retried.
> >
> > NFS could transform ESTALE errors to ERETRYSTALE and get the desired
> > behavior, while other filesystems would not be affected.
>
> We don't need more error numbers, we've got plenty already. :-)

That's a rather poor excuse against a simple solution which would
spare us some backward compatibility problems.

> Do you have anything more specific about any real problems?
> I see lots of "mays" and "coulds", but I don't see anything
> that I can do to make this support better.

Implement the above suggestion? Or something else.

Otherwise I have to NAK this patch due to the possibility of it
breaking existing fuse installations.

Thanks,
Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/