Re: splice(-> FIFO) never wakes up inotify IN_MODIFY?

From: Jan Kara
Date: Mon Jun 26 2023 - 12:52:31 EST


On Mon 26-06-23 17:15:23, Ahelenia Ziemiańska wrote:
> On Mon, Jun 26, 2023 at 05:00:01PM +0200, Jan Kara wrote:
> > On Mon 26-06-23 16:25:41, Ahelenia Ziemiańska wrote:
> > > On Mon, Jun 26, 2023 at 03:51:59PM +0200, Jan Kara wrote:
> > > > On Mon 26-06-23 14:57:55, Ahelenia Ziemiańska wrote:
> > > > > On Mon, Jun 26, 2023 at 02:19:42PM +0200, Ahelenia Ziemiańska wrote:
> > > > > > > splice(2) differentiates three different cases:
> > > > > > > if (ipipe && opipe) {
> > > > > > > ...
> > > > > > > if (ipipe) {
> > > > > > > ...
> > > > > > > if (opipe) {
> > > > > > > ...
> > > > > > >
> > > > > > > IN_ACCESS will only be generated for non-pipe input
> > > > > > > IN_MODIFY will only be generated for non-pipe output
> > > > > > >
> > > > > > > Similarly FAN_ACCESS_PERM fanotify permission events
> > > > > > > will only be generated for non-pipe input.
> > > > > Sorry, I must've misunderstood this as "splicing to a pipe generates
> > > > > *ACCESS". Testing reveals this is not the case. So is it really true
> > > > > that the only way to poll a pipe is a sleep()/read(O_NONBLOCK) loop?
> > > > So why doesn't poll(2) work? AFAIK it should...
> > > poll returns instantly with revents=POLLHUP for pipes that were closed
> > > by the last writer.
> > >
> > > Thus, you're either in a hot loop or you have to explicitly detect this
> > > and fall back to sleeping, which defeats the point of polling:
> > I see. There are two ways around this:
> >
> > a) open the file descriptor with O_RDWR (so there's always at least one
> > writer).
> Not allowed in the general case, since you need to be able to tail -f
> files you can't write to.

Hum, fair point.

> > b) when you get POLLHUP, just close the fd and open it again.
> Not allowed semantically, since tail -f follows the file, not the name.

Well, you can work around that by using /proc/<pid>/fd/ magic links for
reopening.
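
A minimal sketch of that reopen trick, assuming Linux with /proc mounted.
The helper name reopen_same_file is made up for illustration, and the
caller is assumed to pass O_NONBLOCK so that opening a FIFO without a
writer does not block:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: reopen the object behind `fd` through its
 * /proc/self/fd magic link. The link resolves to the open file itself,
 * not to its name, so this still follows the file after a rename or
 * unlink. On success the old descriptor is closed and the new one is
 * returned; on failure -1 is returned and `fd` is left open.
 */
static int reopen_same_file(int fd, int flags)
{
    char path[64];
    int n = snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    assert(n > 0 && (size_t)n < sizeof(path));

    int newfd = open(path, flags);
    if (newfd < 0)
        return -1;

    close(fd);
    return newfd;
}
```

A tail -f style loop would call something like
reopen_same_file(fd, O_RDONLY | O_NONBLOCK) each time poll reports
POLLHUP, then go back to polling the returned descriptor.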

> > In these cases poll(2) will behave as you need (tested)...
> Alas, those are not applicable to the standard use-case.
> If only linux exposed a way to see if a file was written to!

I agree that having to jump through hoops with poll for this relatively
standard usage is annoying. Looking into the code, the kernel actually has
extra code to generate these repeated POLLHUPs because apparently that is
how poll behaved ages ago.
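
For illustration, the repeated POLLHUP can be reproduced with a rough
sketch like the one below. It uses an anonymous pipe rather than a FIFO
to stay self-contained, and the function names are hypothetical:

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Call poll() on `fd` `rounds` times and count how many of those calls
 * report POLLHUP. Returns the count, or -1 if poll() fails.
 */
static int count_pollhup(int fd, int rounds, int timeout_ms)
{
    assert(rounds > 0);
    int hups = 0;

    for (int i = 0; i < rounds; i++) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };

        if (poll(&pfd, 1, timeout_ms) < 0)
            return -1;
        if (pfd.revents & POLLHUP)
            hups++;
    }
    return hups;
}

/* Create a pipe, close its only writer, then poll the read end 3 times. */
static int demo_closed_writer(void)
{
    int p[2];

    if (pipe(p) < 0)
        return -1;
    close(p[1]);                       /* last writer goes away */
    int hups = count_pollhup(p[0], 3, 1000);
    close(p[0]);
    return hups;
}
```

Every round comes back with POLLHUP set without waiting for the timeout,
which is the hot loop described earlier in the thread.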

Hum, researching some more about this, epoll(7) actually doesn't have this
problem. I've tested using epoll_wait(2) (in edge-triggered mode) instead of
poll(2) and that doesn't return repeated POLLHUP events.
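
A rough sketch of the edge-triggered setup, not the exact test program.
The helper names watch_edge_triggered and wait_for_event are made up
here; only the epoll calls themselves are real API:

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an epoll instance watching `fd` for
 * readability in edge-triggered mode (EPOLLET). EPOLLHUP is always
 * reported and need not be requested. Returns the epoll fd or -1.
 */
static int watch_edge_triggered(int fd)
{
    int ep = epoll_create1(EPOLL_CLOEXEC);
    if (ep < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }
    return ep;
}

/*
 * Wait for one event for up to `timeout_ms` milliseconds. Returns the
 * reported event mask, 0 on timeout, or -1 on error.
 */
static int wait_for_event(int ep, int timeout_ms)
{
    assert(ep >= 0);
    struct epoll_event ev;
    int n = epoll_wait(ep, &ev, 1, timeout_ms);

    if (n <= 0)
        return n;
    return (int)ev.events;
}
```

With this, the hangup is delivered once when the last writer closes, and
later waits sleep until something new happens on the pipe.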

> For reference with other implementations,
> this just works and is guaranteed to work under kqueue(2) EVFILT_READ
> (admittedly, kqueue(2) is an epoll(7)-style system and not an
> inotify(7)-style one, but it solves the issue,
> and that's what NetBSD tail -f uses).
>
> Maybe this is short-sighted but I don't actually really see why inotify
> is... expected? To generate file-was-written events only for some
> writes?

Well, inotify, like fanotify, was created as a filesystem monitoring API,
not as a general "file descriptor monitoring" API. So they work well with
regular files and directories, but for other objects such as sockets or
pipes, or even for "looking like files" objects in virtual filesystems
like /proc, the results are pretty much undefined.

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR