fuse readdirplus skip one entry when interrupted by signal

From: Jakob Unterwurzacher
Date: Tue Oct 24 2017 - 14:10:59 EST


A user running a Haskell program [1] noticed a problem with fuse's
readdirplus: when it is interrupted by a signal, it skips one
directory entry.

The problem is most apparent with Haskell, as its runtime uses
SIGVTALRM to interrupt its own green threads.

A minimal reproducer in C, "ls-count.c", is available [2]. The problem
has been reproduced against libfuse's "passthrough_fh.c", but also against
gocryptfs, which uses go-fuse instead of libfuse. This suggests
that the bug is in kernel-space, which is also the opinion of libfuse
upstream [3].

"ls-count.c" lists the directory with readdir in a loop while sending
itself SIGVTALRM. When the count of directory entries changes between
two passes, it exits:

$ ./ls-count b
ls-count: counts do not match: 2 vs 1
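
For readers who do not want to open [2], the idea can be sketched
roughly as below. This is not the actual "ls-count.c"; it is a minimal
illustration written under a few assumptions: the names count_entries()
and check_dir() are made up for this sketch, the signal is delivered by
a forked helper process rather than by the process itself, and the loop
gives up after a fixed number of rounds instead of running forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: the signal only has to interrupt the process. */
static void on_vtalrm(int sig) { (void)sig; }

/* Count all entries (including "." and ".."); returns -1 on error. */
long count_entries(const char *path)
{
    DIR *d = opendir(path);
    if (d == NULL)
        return -1;
    long n = 0;
    for (;;) {
        errno = 0;                 /* tell end-of-directory from error */
        if (readdir(d) == NULL)
            break;
        n++;
    }
    int err = errno;
    closedir(d);
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}

/*
 * Hypothetical driver: count the directory repeatedly while a helper
 * process floods us with SIGVTALRM. Returns 0 if all counts agree,
 * 1 on a mismatch, -1 on error.
 */
int check_dir(const char *path, long rounds)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_vtalrm;
    sa.sa_flags = SA_RESTART;      /* avoid spurious EINTR failures */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGVTALRM, &sa, NULL) != 0)
        return -1;

    pid_t parent = getpid();
    pid_t sender = fork();
    if (sender < 0)
        return -1;
    if (sender == 0) {
        /* Child: signal the parent every 100 microseconds. */
        struct timespec pause = { 0, 100 * 1000 };
        while (kill(parent, SIGVTALRM) == 0)
            nanosleep(&pause, NULL);
        _exit(0);
    }

    int rc = 0;
    long first = count_entries(path);   /* baseline, also under signals */
    if (first < 0)
        rc = -1;
    for (long i = 0; rc == 0 && i < rounds; i++) {
        long now = count_entries(path);
        if (now < 0) {
            rc = -1;
        } else if (now != first) {
            fprintf(stderr, "counts do not match: %ld vs %ld\n", first, now);
            rc = 1;
        }
    }

    kill(sender, SIGKILL);
    waitpid(sender, NULL, 0);
    return rc;
}
```

Calling check_dir("b", 100000) from a small main() on a directory
inside the FUSE mount would be the equivalent of the "./ls-count b"
invocation above. On a filesystem without the bug, and with nobody
else modifying the directory, it should return 0.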

strace against ls-count shows that we get one entry, when we should get
two ("." and ".."):

getdents(3, /* 1 entries */, 32768) = 24
--- SIGVTALRM ---
rt_sigreturn({mask=[]}) = 24
getdents(3, /* 0 entries */, 32768) = 0

The debug output from go-fuse [4] shows what seems to be happening:

Dispatch 548: READDIRPLUS, NodeId: 1. data: {Fh 3 off 0 sz 4096}
Serialize 548: READDIRPLUS code: OK value: 320 bytes data
Dispatch 549: READDIRPLUS, NodeId: 1. data: {Fh 3 off 2 sz 4096}
Serialize 549: READDIRPLUS code: OK value:

The kernel starts reading the directory from "off 0", where it is
interrupted, and only returns one entry to userspace. Then it continues
reading at "off 2". Offset 1 is skipped.

I can reliably reproduce this within 1 second against kernel 4.12.5.

Best regards,
Jakob

[1] https://github.com/hanwen/go-fuse/issues/191
[2] https://gist.githubusercontent.com/rfjakob/79581292a037ae7cb068067cb6207ef8/raw/f71494a291cfded8a96d02c3f0ee2983457591cc/ls-count.c
[3] https://github.com/libfuse/libfuse/issues/214
[4] gocryptfs -fg -fusedebug -nosyslog