Re: [malware-list] [RFC 0/5] [TALPA] Intro to a linux interface for on access scanning

From: Greg KH
Date: Mon Aug 04 2008 - 18:35:30 EST

Next message: Rafael J. Wysocki: "Re: BUG: scheduling while atomic: ip/23212/0x00000102"
Previous message: Stephen Hemminger: "Re: [RFC] netdev: debugging option"
In reply to: Eric Paris: "[RFC 0/5] [TALPA] Intro to a linux interface for on access scanning"
Next in thread: Christoph Hellwig: "Re: [malware-list] [RFC 0/5] [TALPA] Intro to a linux interface for on access scanning"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Aug 04, 2008 at 05:00:16PM -0400, Eric Paris wrote:
> Please contact me privately or (preferably) the list for questions,
> comments, discussions, flames, names, or anything. I'll do complete
> rewrites of the patches if someone tells me how they don't meet their
> needs or how they can be done better. I'm here to try to bridge the
> needs (and wants) of the anti-malware vendors with the technical
> realities of the kernel. So everyone feel free to throw in your two
> cents and I'll try to reconcile it all. These 5 patches are part 1.
> They give us a workable solution.
>
> From my point of view patches forthcoming and mentioned below should
> help with performance for those who actually have userspace scanners but
> could also at present be implemented using this framework.
>
>
> Background
> ++++++++++
> There is a consensus in the security industry that protecting against
> malicious files (viruses, root kits, spyware, ad-ware, ...) by the way
> of so-called on-access scanning is a usable and reasonable approach.
> Currently the Linux kernel does not offer a completely suitable
> interface to implement such security solutions. Present solutions
> involve overwriting function pointers in the LSM, in filesystem
> operations, in the syscall table, and other fragile hacks. The purpose
> of this project is to create a fast, clean interface for userspace
> programs to look for malware when files are accessed. This malware may
> be ultimately intended for this or some other Linux machine or may be
> malware intended to attack a host running a different operating system
> and is merely in transit across the Linux server. Since there are
> almost an infinite number of ways in which information can enter and
> exit a server it is not seen as reasonable to move these checks to all
> the applications at the boundary (MTA, NFS, CIFS, SSH, rsync, et al.) to
> look for such malware at the border.
>
> For this Linux kernel interface speed is of particular interest for
> those who have it compiled into the kernel but have no userspace client.
> There must be no measurable performance hit to just compiling this into
> the kernel.
>
> Security vendors, Linux distributors and other interested parties have
> come together on the malware-list mailing list to discuss this problem
> and see if they can work together to propose a solution. During these
> talks a couple of requirement sets were posted with the aim of fleshing
> out common needs as a prerequisite of creating an interface prototype.

These requirements were posted? Where? I don't recall seeing them.

> Collated requirements
> +++++++++++++++++++++
> 1. Intercept file opens (exec also) for vetting (block until
> decision is made) and allow some userspace black magic to make
> decisions.
> 2. Intercept file closes for scanning post access
> 3. Cache scan results so the same file is not scanned on each and every access
> 4. Ability to flush the cache and cause all files to be re-scanned when accessed
> 5. Define which filesystems are cacheable and which are not
> 6. Scan files directly not relying on path. Avoid races and problems with namespaces, chroot, containers, etc.
> 7. Report other relevant file, process and user information associated with each interception
> 8. Report file pathnames to userspace (relative to process root, current working directory)
> 9. Mark a process as exempt from on access scanning
> 10. Exclude sub-trees from scanning based on filesystem (exclude procfs, sysfs, devfs)
> 11. Exclude sub-trees from scanning based on filesystem path
> 12. Include only certain sub-trees from scanning based on filesystem path
> 13. Register more than one userspace client in which case behavior is restrictive
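Requirements 3 and 4 (cache results, flush the cache) can be illustrated with a generation counter. The following is only a sketch under assumptions: the struct and function names are hypothetical and are not taken from the patch set, and the per-file entry stands in for whatever state a real implementation would keep per inode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-file cache entry (a kernel version would live per inode). */
struct scan_cache_entry {
    uint64_t scanned_gen; /* generation at which the file was last vetted clean */
    bool     cacheable;   /* requirement 5: false for non-cacheable filesystems */
};

/* Global generation. Starts at 1 so a zeroed entry never counts as clean. */
static uint64_t scan_generation = 1;

/* Requirement 4: invalidate every cached verdict in O(1). */
void scan_cache_flush(void)
{
    scan_generation++;
}

/* Requirement 3: remember that this file was scanned and found clean. */
void scan_cache_mark_clean(struct scan_cache_entry *e)
{
    assert(e != NULL);
    if (e->cacheable)
        e->scanned_gen = scan_generation;
}

/* Drop the verdict for one file, e.g. after it has been written to. */
void scan_cache_invalidate(struct scan_cache_entry *e)
{
    assert(e != NULL);
    e->scanned_gen = 0;
}

/* True only if the file was vetted clean in the current generation. */
bool scan_cache_is_clean(const struct scan_cache_entry *e)
{
    assert(e != NULL);
    return e->cacheable && e->scanned_gen == scan_generation;
}
```

With this shape a flush never has to walk the cached files: bumping the counter makes every stored generation stale at once.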

I don't see anything in the list above that makes it a requirement that
the code to do this be placed within the kernel.

What is wrong with doing it in glibc or some other system-wide library
(LD_PRELOAD hooks, etc.)?

> 1., 2. Basic interception
> -------------------------
> Core requirement is to intercept access to files and prevent it if
> malicious content is detected. This is done on open, not on read. It
> may be possible to do read time checking with minimal performance impact
> although not currently implemented. This means that the following race
> is possible
>
> Process1                        Process2
> - open file RD
>                                 - open file WR
>                                 - write virus data (1)
> - read virus data

Wonderful, we are going to implement a solution that is known to not
work, with a trivial way around it?

Sorry, that's not going to fly.

> *note that any open after (1) will get properly vetted. At this time
> the likelihood of this being a problem vs the performance impact of
> scanning on read and the increased complexity of the code means this is
> left out. This should not be a problem for local executables as writes
> to files opened to be run typically return ETXTBSY.

Are you sure about this?

> One of the most important filters in the evaluation chain implements an
> interface through which a userspace process can register and receive
> vetting requests. A userspace process opens a misc character device to
> express its interest and then receives binary structures from that
> device describing basic interception information. After file contents
> have been scanned a vetting response is sent by writing a different
> binary structure back to the device and the intercepted process
> continues its execution. These are not done over network sockets and no
> endian conversions are done. The client and the kernel must have the
> same endian configuration.

How about the same 64/32bit requirement? Your implementation is
incorrect otherwise.

(hint, your current patch is also wrong in this area, you should fix
that up...)
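To illustrate the 32/64-bit concern in general terms, here is a sketch of how a binary message can change layout between a 32-bit and a 64-bit userspace, and one common way to pin it down. Both structures are hypothetical examples and are not the structures from the patch set; the offsets in the comments assume the usual i386 and x86_64 ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message whose layout depends on the ABI of the reader. */
struct fragile_event {
    uint32_t version;
    uint64_t cookie;  /* offset 4 on i386, offset 8 on x86_64 */
    long     timeout; /* 4 bytes in 32-bit userspace, 8 bytes in 64-bit */
};

/* Same information with fixed-width members and explicit padding. */
struct fixed_event {
    uint32_t version;
    uint32_t reserved; /* explicit padding, so cookie is always at offset 8 */
    uint64_t cookie;
    int64_t  timeout;
};

/* Compile-time checks: the build fails if the layout ever drifts. */
_Static_assert(offsetof(struct fixed_event, cookie) == 8, "cookie offset");
_Static_assert(offsetof(struct fixed_event, timeout) == 16, "timeout offset");
_Static_assert(sizeof(struct fixed_event) == 24, "message size");

/* Number of bytes a client should read or write per message. */
size_t fixed_event_size(void)
{
    return sizeof(struct fixed_event);
}
```

A 32-bit scanner running on a 64-bit kernel would misread `struct fragile_event`, while `struct fixed_event` has the same size and member offsets in both cases.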

And a binary structure? Ick, are you trying to make it hard for future
expansions and such?

And why not netlink/network socket? Why a character device? You are
already using securityfs, why not use a file node in there?

> 6. Direct access to file content
> --------------------------------
> When a userspace daemon receives a vetting request, it also receives a
> new RO file descriptor which provides direct access to the inode in
> question. This is to enable access to the file regardless of its
> accessibility from the scanner environment (consider process namespaces,
> chroot's, NFS). The userspace client is responsible for closing this
> file when it is finished scanning.

Is this secondary file handle properly checked for the security issues
involved with such a thing? What happens if the userspace client does
not close the file handle?

> 7. Other reporting
> ------------------
> Along with the fd being installed in the scanning process the process
> gets a binary structure of data including:

What's with the love of binary structures? :)

> + uint32_t version;
> + uint32_t type;
> + int32_t fd;
> + uint32_t operation;
> + uint32_t flags;
> + uint32_t mode;
> + uint32_t uid;
> + uint32_t gid;
> + uint32_t tgid;
> + uint32_t pid;

What happens when the world moves to 128bit or 64bit uids? (yes, I've
seen proposals for such a thing...)

Why would userspace care about these meta-file things, what does it want
with them?

> 8. Path name reporting
> ----------------------
> When malicious content is detected in a file it is important to be
> able to report its location so the user or system administrator can take
> appropriate actions.
>
> This is implemented in an amazingly simple way which will hopefully avoid
> the controversy of some other solutions. Path name is only needed for
> reporting purposes and it is obtained by reading the symlink of the
> given file descriptor in /proc. It's as simple as userspace calling:
>
> snprintf(link, sizeof(link), "/proc/self/fd/%d", details.fd);
> ret = readlink(link, buf, sizeof(buf)-1);

Cute hack. What's to keep it from racing with the fd changing from the
original program?
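For reference, the two quoted lines can be wrapped into a complete helper. This is a minimal sketch: the function name `fd_to_path` is hypothetical, and it assumes /proc is mounted. The one detail the fragment leaves out is that readlink() does not NUL-terminate its result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve an open descriptor to the name the kernel
 * currently reports for it. Returns the length written to buf, or -1.
 */
ssize_t fd_to_path(int fd, char *buf, size_t len)
{
    char link[64];
    ssize_t n;

    assert(buf != NULL && len > 0);
    snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);

    /* Leave room for the terminator; readlink() does not add one. */
    n = readlink(link, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';

    /* If n == len - 1 the name may have been truncated. */
    return n;
}
```

The result is a snapshot taken at the time of the call. The file can be renamed or unlinked afterwards, so the string is only suitable for reporting, not for reopening the file by name.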

> 9. Process exclusion
> --------------------
> Sometimes it is necessary to exclude certain processes from being
> intercepted. For example it might be a userspace root kit scanner which
> would not be able to find root kits if access to them was blocked by the
> on-access scanner.
>
> To facilitate that we have created a special file a process can open and
> register itself as excluded. A flag is then put into its kernel
> structure (task_struct) which makes it excluded from scanning.
>
> This implementation is very simple and provides greatest performance. In
> the proposed implementation access to the exclusion device is controlled
> through permissions on the device node which are not sufficient. An LSM
> call will need to be made for this type of access in a later patch.

Heh, so if you want to write a "virus" for Linux, just implement this
flag. What's to keep a "rogue" program from telling the kernel that all
programs on the system are to be excluded?

> 10. Filesystem exclusions
> -------------------------
> One pretty important optimization is not to scan things like /proc, /sys
> or similar. Basically all filesystems where a user cannot store
> arbitrary, potentially malicious, content could and should be excluded
> from scanning.

Why, does scanning these files take extra time? Just curious.

> 11. Path exclusions
> -------------------
> The need for exclusions can be demonstrated with an example of a MySQL
> server. Its data files are frequently modified which means they would
> need to be constantly rescanned which is very bad for performance. Also,
> it is most often not even possible to reasonably scan them. Therefore
> the best solution is not to scan its database store which can simply be
> implemented by excluding the store subdirectory.
>
> It is a relatively simple implementation which allows run-time
> configuration of a list of sub directories or files to exclude.
> Exclusion paths are relative to each process root. So for example if we
> want to exclude /var/lib/mysql/ and we have a mysql running in a chroot
> where from the outside that directory actually lives
> in /chroot/mysql/var/lib/mysql, /var/lib/mysql should actually be added
> to the exclusion list.
>
> This is also not included in the initial patch set but will be coming
> shortly after.

Again, what's to keep all files from being marked as excluded?
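The path exclusion described in the quoted text amounts to a prefix match on whole path components. Since that part is not in the posted patches, the following is only a sketch of the idea: the function names are hypothetical, and paths are assumed to be absolute and already relative to the process root, as in the chroot example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when path is prefix itself or lies somewhere beneath it. */
bool path_is_under(const char *path, const char *prefix)
{
    size_t n;

    assert(path != NULL && prefix != NULL);
    n = strlen(prefix);
    while (n > 1 && prefix[n - 1] == '/') /* ignore trailing slashes */
        n--;
    if (n == 0)
        return false;                     /* an empty entry matches nothing */
    if (strncmp(path, prefix, n) != 0)
        return false;
    if (n == 1 && prefix[0] == '/')
        return true;                      /* "/" covers every absolute path */

    /* Match whole components only: /var/lib/mysqlx is not under /var/lib/mysql. */
    return path[n] == '\0' || path[n] == '/';
}

/* True when path falls under any entry of the exclusion list. */
bool path_is_excluded(const char *path, const char *const *list, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (path_is_under(path, list[i]))
            return true;
    return false;
}
```

For the MySQL example, a list containing "/var/lib/mysql/" excludes "/var/lib/mysql/ibdata1" but not "/var/lib/mysqlx/a". Nothing in the sketch restricts who may add entries, which is the question raised above.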

> Closing remarks
> ---------------
> Although some may argue some of the filters are not necessary or may
> better be implemented in userspace, we think it is better to have them
> in kernel primarily for performance reasons.

Why? What numbers do you have that say the kernel is faster in
implementing this? This is the first mention of such a requirement, we
need to see real data to back it up please.

> Secondly, it is all simple code not introducing much baggage or risk
> into the kernel itself.

I disagree, see above.

thanks,

greg k-h