Re: Things I wish I'd known about Inotify

From: Michael Kerrisk (man-pages)
Date: Sun Apr 06 2014 - 05:44:56 EST


On 04/04/2014 02:43 PM, Jan Kara wrote:
> On Fri 04-04-14 09:35:50, Michael Kerrisk (man-pages) wrote:
>> On 04/03/2014 10:52 PM, Jan Kara wrote:
>>> On Thu 03-04-14 08:34:44, Michael Kerrisk (man-pages) wrote:

[...]

>>>> Dealing with rename() events
>>>> The IN_MOVED_FROM and IN_MOVED_TO events that are generated by
>>>> rename(2) are usually available as consecutive events when reading
>>>> from the inotify file descriptor. However, this is not guaranteed.
>>>> If multiple processes are triggering events for monitored
>>>> objects, then (on rare occasions) an arbitrary number of
>>>> other events may appear between the IN_MOVED_FROM and IN_MOVED_TO
>>>> events.
>>>>
>>>> Matching up the IN_MOVED_FROM and IN_MOVED_TO event pair generated
>>>> by rename(2) is thus inherently racy. (Don't forget that if
>>>> an object is renamed outside of a monitored directory, there may
>>>> not even be an IN_MOVED_TO event.) Heuristic approaches (e.g.,
>>>> assume the events are always consecutive) can be used to ensure a
>>>> match in most cases, but will inevitably miss some cases, causing
>>>> the application to perceive the IN_MOVED_FROM and IN_MOVED_TO
>>>> events as being unrelated. If watch descriptors are destroyed
>>>> and re-created as a result, then those watch descriptors will be
>>>> inconsistent with the watch descriptors in any pending events.
>>>> (Re-creating the inotify file descriptor and rebuilding the cache
>>>> may be useful to deal with this scenario.)
>>> Well, but there's 'cookie' value meant exactly for matching up
>>> IN_MOVED_FROM and IN_MOVED_TO events. And 'cookie' is guaranteed to be
>>> unique at least within the inotify instance (in fact currently it is unique
>>> within the whole system but I don't think we want to give that promise).
>>
>> Yes, that's already assumed by my discussion above (it's described elsewhere
>> in the page). But your comment makes me think I should add a few words to
>> remind the reader of that fact. I'll do that.
> Yes, that would be good.
>
>> But, the point is that even with the cookie, matching the events is
>> nontrivial, since:
>>
>> * There may not even be an IN_MOVED_FROM event
>> * There may be an arbitrary number of other events in between the
>> IN_MOVED_FROM and the IN_MOVED_TO.
>>
>> Therefore, one has to use heuristic approaches such as "allow at least
>> N milliseconds" or "check the next N events" to see if there is an
>> IN_MOVED_FROM that matches the IN_MOVED_TO. I can't see any way around
>> that being inherently racy. (It's unfortunate that the kernel can't
>> provide a guarantee that the two events are always consecutive, since
>> that would simplify user space's life considerably.)
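
As a rough illustration of the "check the next N events" heuristic quoted
above, here is a minimal sketch in C. The struct ev record and the
find_moved_to() helper are hypothetical names invented for this example and
are not part of the inotify API; only the IN_MOVED_FROM and IN_MOVED_TO
constants come from <sys/inotify.h>. A real reader would walk the
variable-length struct inotify_event records returned by read(2) instead of
a fixed-size array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/inotify.h>

/* Hypothetical simplified event record: only the fields needed here. */
struct ev {
    int      wd;      /* watch descriptor the event was reported for */
    uint32_t mask;    /* event bits, e.g. IN_MOVED_FROM */
    uint32_t cookie;  /* links the two halves of a rename */
};

/*
 * Examine at most `window` events following evs[from] and look for an
 * IN_MOVED_TO carrying the same cookie as the IN_MOVED_FROM at evs[from].
 * Returns the index of the matching event, or -1 if none was found
 * within the window.
 */
long find_moved_to(const struct ev *evs, size_t n, size_t from, size_t window)
{
    assert(from < n && (evs[from].mask & IN_MOVED_FROM));

    /* Clamp the window to the events actually available. */
    size_t remaining = n - from - 1;
    if (window > remaining)
        window = remaining;

    for (size_t i = from + 1; i <= from + window; i++) {
        if ((evs[i].mask & IN_MOVED_TO) && evs[i].cookie == evs[from].cookie)
            return (long)i;
    }
    return -1;
}
```

A result of -1 only means that no match was seen within the window. The
IN_MOVED_TO may still arrive later, which is exactly the race under
discussion.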

> Yeah, it's unpleasant but doing that would be quite costly/complex at the
> kernel side.

Yep, I imagined that was probably the reason.

> And the race would in the worst case lead to the application
> thinking a file has been moved outside of the watched area & a file moved
> somewhere else inside the watched area. So the application will have to
> possibly inspect that file. That doesn't seem too bad.

It's actually very bad. See the text above. The point is that one likely
treatment of an IN_MOVED_FROM event that has no IN_MOVED_TO is to remove
the watches for the moved out subtree. If it turns out that this really
was just a rename(), then on the IN_MOVED_TO, the watches will be recreated
*with different watch descriptors*, thus invalidating the watch descriptors
in any queued but as yet unprocessed inotify events. See what I mean?
That's quite painful for user space.
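
To make the stale-descriptor problem concrete, here is a minimal sketch in C
of the kind of watch-descriptor-to-path cache an application might keep. All
names (struct wd_cache, cache_add(), cache_remove(), cache_lookup(),
MAX_WATCHES) are hypothetical and chosen only for illustration. The sketch
makes no inotify calls; it just shows that once a watch has been dropped and
re-created under a new descriptor, a queued event carrying the old
descriptor no longer resolves to anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WATCHES 64           /* arbitrary capacity for this sketch */

struct wd_entry {
    int  wd;                     /* watch descriptor, or -1 if the slot is free */
    char path[256];
};

struct wd_cache {
    struct wd_entry slots[MAX_WATCHES];
};

void cache_init(struct wd_cache *c)
{
    for (size_t i = 0; i < MAX_WATCHES; i++)
        c->slots[i].wd = -1;
}

/* Record wd -> path. Returns 0 on success, -1 if the cache is full. */
int cache_add(struct wd_cache *c, int wd, const char *path)
{
    assert(wd >= 0);
    for (size_t i = 0; i < MAX_WATCHES; i++) {
        if (c->slots[i].wd == -1) {
            c->slots[i].wd = wd;
            strncpy(c->slots[i].path, path, sizeof c->slots[i].path - 1);
            c->slots[i].path[sizeof c->slots[i].path - 1] = '\0';
            return 0;
        }
    }
    return -1;
}

/* Forget a watch, e.g. after an IN_MOVED_FROM that looked unmatched. */
void cache_remove(struct wd_cache *c, int wd)
{
    for (size_t i = 0; i < MAX_WATCHES; i++) {
        if (c->slots[i].wd == wd)
            c->slots[i].wd = -1;
    }
}

/* Returns the cached path, or NULL if wd is unknown (possibly stale). */
const char *cache_lookup(const struct wd_cache *c, int wd)
{
    for (size_t i = 0; i < MAX_WATCHES; i++) {
        if (c->slots[i].wd == wd)
            return c->slots[i].path;
    }
    return NULL;
}
```

A NULL from cache_lookup() while draining the queue is the symptom described
above: the event refers to a watch the application has already thrown away,
and at that point re-creating the inotify file descriptor and rebuilding the
cache may be the simplest recovery.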

Cheers,

Michael



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/