Re: fanotify as syscalls

From: Alan Cox
Date: Wed Sep 16 2009 - 08:00:50 EST


> You can't rely on the name being non-racy, but you _can_ reliably
> invalidate application-level caches from the sequence of events
> including file writes, creates, renames, links, unlinks, mounts. And
> revalidate such caches by the absence of pending events.

You can't however create the caches reliably because you've no idea if
you are referencing the right object in the first place - which is why
you want a handle in these cases. I see fanotify as a handle producing
addition to inotify, not as a replacement (plus some other bits around
open blocking for HSM etc)

> Clearly, I'm going to have to explain with working code :-)

Always a good demo

> > but it is somewhat inadequate for indexers
>
> For indexers, the real inadequacy is the need to attach inotify
> watches to every directory at system startup, and to stat() everything
> to check it hasn't changed since the indexer was last running. Both

stat doesn't help you - inode numbers are only guaranteed unqiue (and
constant) while a reference to the object is held.

> Descriptors don't tell you which subtree a file is in any better than
> inotify watches. I.e. they do, if you track them and their containing
> directories all individually.

Don't get me wrong - I don't think fanotify is sufficient on its own -
and this is one reason. Some things care about the namespace, some about
getting the exact content.

> > chroot isn't a security model. You can already do this with AF_UNIX
> > sockets (and there are apps that intentionally use fchdir that way)
>
> Ah, no. AF_UNIX works with explicit sender cooperation.
>
> fanotify gives you access to files without sender cooperation, as it
> intercepts every open().

and is currently not general user accessible for this reason.

> > Inside of containers - unlikely.
>
> Why not? Some people run entire distributions in containiners, and
> present them as VMs to the world for other people to admin.

In a word - performance. In two words performance and security. It isn't
a sensible setup because you want to scan the most efficient way possible
and you want to keep your malware scan as far away from attackers, so it
makes sense to keep it outside of the containers and do the job once
- one scan, one database of things to keep current.

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/