[RESEND4, PATCH 0/2] fuse: don't stuck clients on retrieve_notify with size > max_write

From: Kirill Smelkov
Date: Wed Mar 27 2019 - 06:44:21 EST


Miklos,

On Thu, Mar 14, 2019 at 01:45:20PM +0300, Kirill Smelkov wrote:
> Miklos,
>
> On Thu, Feb 28, 2019 at 02:47:57PM +0300, Kirill Smelkov wrote:
> > On Thu, Feb 28, 2019 at 09:10:15AM +0100, Miklos Szeredi wrote:
> > > On Wed, Feb 27, 2019 at 9:39 PM Kirill Smelkov <kirr@xxxxxxxxxx> wrote:
> > >
> > > > I more or less agree with this statement. However, can we please make the
> > > > breakage explicitly visible with an error instead of exhibiting it
> > > > via harder-to-debug hangs/deadlocks? For example, sys_read < max_write
> > > > -> error instead of getting stuck. And if notify_retrieve requests a
> > > > buffer larger than max_write -> error or cut to max_write, but don't
> > > > return OK when we know we will never send what was requested to the
> > > > filesystem even if it uses max_write-sized reads. What is the point of
> > > > breaking in a hard-to-diagnose way when we can make the breakage show
> > > > itself explicitly? Would a patch for such behaviour be accepted?
> > >
> > > Sure, if it only adds a couple of lines. Adding more than, say, ten
> > > lines for such a non-bug fix is definitely excessive.
> >
> > Ok, thanks. Please consider applying the following patch. (It's a bit of
> > a pity to hear the problem is not considered to be a bug, but anyway).
> >
> > I will also send the second patch as a separate mail, since I could not
> > make `git am --scissors` apply several patches extracted from one
> > mail successfully.
>
> [...]
>
> On Thu, Mar 07, 2019 at 12:34:21PM +0300, Kirill Smelkov wrote:
> > Ping. Miklos, is there anything wrong with this patch and its
> > second counterpart?
>
> As discussed, here are those patches. The first one caps a notify_retrieve
> request to max_write and is one line only. The second one returns an error to
> the filesystem server if it is buggy and does sys_read with buffer size <
> max_write. It is 2 lines of code and 7 lines of comments.
>
> I still think that the patches fix real bugs. It is a bug if server behaviour
> is slightly non-conforming, or merely borderline or questionable,
> and instead of getting a plain error from the kernel, the whole system gets
> stuck. It is a bug because the bug amplification factor here is at least one
> order of magnitude instead of staying ~1x.
>
> I'm sending the patches for the third time already, but have not received any
> feedback. Could you please have a look?

It has been about a month since we agreed on the approach and I first
posted the patches that follow it:

https://lwn.net/ml/linux-fsdevel/20190228114757.GA2796@xxxxxxxxxxxxxxxxxxx/

Since then the patches have been resent several times without any
feedback from you.

Is there anything wrong with the patches? Could you please have a look?
I understand everyone is busy, but a month seems too long, and I wonder
whether my mails were classified as spam or otherwise lost on your
side.

Thanks in advance,
Kirill

Kirill Smelkov (2):
  fuse: retrieve: cap requested size to negotiated max_write
  fuse: require /dev/fuse reads to have enough buffer capacity as negotiated

 fs/fuse/dev.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
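
For readers skimming the thread, the two checks described above can be
sketched roughly as follows. This is a user-space illustration only, not
the actual fs/fuse/dev.c change: the struct and function names are
hypothetical, the -EINVAL return value is an assumption (the thread only
says "error"), and a real check would presumably also account for request
header sizes, which is omitted here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for per-connection state; only max_write matters here. */
struct conn_sketch {
	size_t max_write;	/* value negotiated between kernel and server */
};

/*
 * Idea of patch 1: never promise the server more than max_write bytes
 * for a notify_retrieve request; cap the requested size instead.
 */
static size_t retrieve_cap_sketch(const struct conn_sketch *c, size_t requested)
{
	return requested < c->max_write ? requested : c->max_write;
}

/*
 * Idea of patch 2: a read on the device with a buffer smaller than
 * max_write gets an explicit error instead of leaving requests stuck.
 * Returns 0 if the buffer is acceptable, a negative errno otherwise.
 * (-EINVAL is an assumption made for this sketch.)
 */
static int dev_read_check_sketch(const struct conn_sketch *c, size_t bufsize)
{
	if (bufsize < c->max_write)
		return -EINVAL;
	return 0;
}
```

With max_write = 4096 in this sketch, a retrieve request for 8192 bytes is
cut to 4096, and a read with a 4095-byte buffer fails right away rather
than hanging.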

--
2.21.0.392.gf8f6787159