Re: [PATCH] tmpfs: implement generic xattr support

From: Hugh Dickins
Date: Thu May 12 2011 - 00:20:47 EST


Hi Miklos,

I understand from the mm-commits discussion that you're probably
preparing another version: so let me make just a few comments
on this old one now, in case you can factor them in.

On Tue, 19 Apr 2011, Miklos Szeredi wrote:

> Michal Suchanek <hramrach@xxxxxxxxxx> writes:
> > Applying this patch is not sufficient. Apparently more xattrs are
> > needed but adding them on top of this patch should be easy.
> >
> > The ones mentioned in the overlayfs doc are
> >
> > trusted.overlay.whiteout
> > trusted.overlay.opaque
> >
> > The patch implements security.*
>
> Here's an updated patch. It changes a number of things:
>
> - it implements shmem specific xattr ops instead of using the generic_*
> functions. Which means that there's no back and forth between the
> VFS and the filesystem. I basically copied the btrfs way of doing
> things.
>
> - adds a new config option: CONFIG_TMPFS_XATTR and makes
> CONFIG_TMPFS_POSIX_ACL depend on this. This way xattr support can be
> turned on without also adding ACL support.
>
> - now supports trusted.* namespace needed by overlayfs in addition to
> security.*. Doesn't yet support user.* since that needs more thought
> wrt. kernel memory use.
>
> - supports xattrs on symlinks, again needed by overlayfs
>
> Hugh, Eric, does this look acceptable?

Pretty much, I think. As ever I'm somewhat bemused by these xattrs,
so don't count too much on my opinion. But if it looks good to xattr
people, then it's welcome in tmpfs - thank you. Eric's ack would be good.

Just a few comments...

>
> Thanks,
> Miklos
>
> ---
> From: Eric Paris <eparis@xxxxxxxxxx>
> Subject: tmpfs: implement generic xattr support
>
> This patch implements generic xattrs for tmpfs filesystems. The feodra
> project, while trying to replace suid apps with file capabilities,
> realized that tmpfs, which is used on the build systems, does not
> support file capabilities and thus cannot be used to build packages
> which use file capabilities. Xattrs are also needed for overlayfs.
>
> The xattr interface is a bit, odd. If a filesystem does not implement any
> {get,set,list}xattr functions the VFS will call into some random LSM hooks and
> the running LSM can then implement some method for handling xattrs. SELinux
> for example provides a method to support security.selinux but no other
> security.* xattrs.
>
> As it stands today when one enables CONFIG_TMPFS_POSIX_ACL tmpfs will have
> xattr handler routines specifically to handle acls. Because of this tmpfs
> would loose the VFS/LSM helpers to support the running LSM. To make up for
> that tmpfs had stub functions that did nothing but call into the LSM hooks
> which implement the helpers.
>
> This new patch does not use the LSM fallback functions and instead
> just implements a native get/set/list xattr feature for the full
> security.* and trusted.* namespace like a normal filesystem. This
> means that tmpfs can now support both security.selinux and
> security.capability, which was not previously possible.
>
> The basic implementation is that I attach a:
>
> struct shmem_xattr {
> struct list_head list; /* anchored by shmem_inode_info->xattr_list */
> char *name;
> size_t size;
> char value[0];
> };
>
> Into the struct shmem_inode_info for each xattr that is set. This
> implementation could easily support the user.* namespace as well,
> except some care needs to be taken to prevent large amounts of
> unswappable memory being allocated for unprivileged users.
>
> Signed-off-by: Eric Paris <eparis@xxxxxxxxxx>
> Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxx>
> ---
> fs/Kconfig | 18 ++
> include/linux/shmem_fs.h | 1
> mm/shmem.c | 302 +++++++++++++++++++++++++++++++++++++++--------
> 3 files changed, 273 insertions(+), 48 deletions(-)
>
> Index: linux-2.6/fs/Kconfig
> ===================================================================
> --- linux-2.6.orig/fs/Kconfig 2011-04-19 21:09:33.000000000 +0200
> +++ linux-2.6/fs/Kconfig 2011-04-19 21:09:35.000000000 +0200
> @@ -121,9 +121,25 @@ config TMPFS
>
> See <file:Documentation/filesystems/tmpfs.txt> for details.
>
> +config TMPFS_XATTR
> + bool "Tmpfs extended attributes"
> + depends on TMPFS
> + default y
> + help
> + Extended attributes are name:value pairs associated with inodes by
> + the kernel or by users (see the attr(5) manual page, or visit
> + <http://acl.bestbits.at/> for details).
> +
> + Currently this enables support for the trusted.* and
> + security.* namespaces.
> +
> + If unsure, say N.
> +
> + You need this for POSIX ACL support on tmpfs.
> +

I disagree with Andrew on this, it should default to off, consistent with
the "If unsure, say N" comment. Partly because that's how Linus prefers
it, for good anti-bloat reasons: if people have got along without a
feature for years, whyever should we suddenly default it on? And
partly because TMPFS_POSIX_ACL already defaults to off (as do
EXT2_FS_XATTR and EXT2_FS_POSIX_ACL).

However... if someone already has CONFIG_TMPFS_POSIX_ACL=y in their
old config, it would be nice to make CONFIG_TMPFS_XATTR=y automatically,
so we're not in danger of removing their old functionality. But I haven't
thought of the right way to achieve that - maybe your helpful POSIX ACL
comment is enough, but automatic would be better.

Few people would choose TMPFS_XATTR on and TMPFS_POSIX_ACL off: perhaps it
would work to make CONFIG_TMPFS_XATTR the new config option to cover both,
and select it from config TMPFS_POSIX_ACL (without a prompt) so the
oldconfig propagates correctly. Perhaps. But I haven't tried myself,
and you may be forgiveably disinclined to make config experiments!

> config TMPFS_POSIX_ACL
> bool "Tmpfs POSIX Access Control Lists"
> - depends on TMPFS
> + depends on TMPFS_XATTR
> select GENERIC_ACL
> help
> POSIX Access Control Lists (ACLs) support permissions for users and
> Index: linux-2.6/include/linux/shmem_fs.h
> ===================================================================
> --- linux-2.6.orig/include/linux/shmem_fs.h 2011-04-19 21:09:25.000000000 +0200
> +++ linux-2.6/include/linux/shmem_fs.h 2011-04-19 21:09:35.000000000 +0200
> @@ -19,6 +19,7 @@ struct shmem_inode_info {
> struct page *i_indirect; /* top indirect blocks page */
> swp_entry_t i_direct[SHMEM_NR_DIRECT]; /* first blocks */
> struct list_head swaplist; /* chain of maybes on swap */
> + struct list_head xattr_list; /* list of shmem_xattr */
> struct inode vfs_inode;
> };
>
> Index: linux-2.6/mm/shmem.c
> ===================================================================
> --- linux-2.6.orig/mm/shmem.c 2011-04-19 21:09:25.000000000 +0200
> +++ linux-2.6/mm/shmem.c 2011-04-19 21:09:35.000000000 +0200
...
> -#ifdef CONFIG_TMPFS_POSIX_ACL
> +#ifdef CONFIG_TMPFS_XATTR
> /*
> - * Superblocks without xattr inode operations will get security.* xattr
> - * support from the VFS "for free". As soon as we have any other xattrs
> + * Superblocks without xattr inode operations may get some security.* xattr
> + * support from the LSM "for free". As soon as we have any other xattrs
> * like ACLs, we also need to implement the security.* handlers at
> * filesystem level, though.
> */
>
> -static size_t shmem_xattr_security_list(struct dentry *dentry, char *list,
> - size_t list_len, const char *name,
> - size_t name_len, int handler_flags)
> +static int shmem_xattr_get(struct dentry *dentry, const char *name,
> + void *buffer, size_t size)
> {
> - return security_inode_listsecurity(dentry->d_inode, list, list_len);
> -}
> + struct shmem_inode_info *info;
> + struct shmem_xattr *xattr;
> + int ret = -ENODATA;
>
> -static int shmem_xattr_security_get(struct dentry *dentry, const char *name,
> - void *buffer, size_t size, int handler_flags)
> -{
> - if (strcmp(name, "") == 0)
> - return -EINVAL;
> - return xattr_getsecurity(dentry->d_inode, name, buffer, size);
> + info = SHMEM_I(dentry->d_inode);
> +
> + spin_lock(&dentry->d_inode->i_lock);

Not important, but I suggest you use info->lock throughout for this,
instead of dentry->d_inode->i_lock: in each case you need "info" for
info->xattr_list (and don't need "inode" at all I think), so info->lock
seems appropriate, and may be in the same cacheline once I make
shmem_inode_info smaller. But don't worry if you'd prefer to leave it.

> -static int shmem_xattr_security_set(struct dentry *dentry, const char *name,
> - const void *value, size_t size, int flags, int handler_flags)
> +static int shmem_xattr_set(struct dentry *dentry, const char *name,
> + const void *value, size_t size, int flags)
> {
> - if (strcmp(name, "") == 0)
> - return -EINVAL;
> - return security_inode_setsecurity(dentry->d_inode, name, value,
> - size, flags);
> + struct inode *inode = dentry->d_inode;
> + struct shmem_inode_info *info = SHMEM_I(inode);
> + struct shmem_xattr *xattr;
> + struct shmem_xattr *new_xattr = NULL;
> + size_t len;
> + int err = 0;
> +
> + /* value == NULL means remove */
> + if (value) {
> + /* wrap around? */
> + len = sizeof(*new_xattr) + size;
> + if (len <= sizeof(*new_xattr))
> + return -ENOMEM;
> +
> + new_xattr = kmalloc(len, GFP_NOFS);

There's no need for GFP_NOFS in this patch, the page reclaim path won't
ever recurse into xattr handling: just use the usual GFP_KERNEL throughout.

> + if (!new_xattr)
> + return -ENOMEM;
> +
> + new_xattr->name = kstrdup(name, GFP_NOFS);
> + if (!new_xattr->name) {
> + kfree(new_xattr);
> + return -ENOMEM;
> + }
> +
> + new_xattr->size = size;
> + memcpy(new_xattr->value, value, size);
> + }
> +
> + spin_lock(&inode->i_lock);
> + list_for_each_entry(xattr, &info->xattr_list, list) {
> + if (!strcmp(name, xattr->name)) {
> + if (flags & XATTR_CREATE) {
> + xattr = new_xattr;
> + err = -EEXIST;
> + } else if (new_xattr) {
> + list_replace(&xattr->list, &new_xattr->list);
> + } else {
> + list_del(&xattr->list);
> + }
> + goto out;
> + }
> + }
> + if (flags & XATTR_REPLACE) {
> + xattr = new_xattr;
> + err = -ENODATA;
> + } else {
> + list_add(&new_xattr->list, &info->xattr_list);
> + xattr = NULL;
> + }
> +out:
> + spin_unlock(&inode->i_lock);
> + if (xattr)
> + kfree(xattr->name);
> + kfree(xattr);
> + return 0;

I think you meant to return err rather than always 0.

...

> +static ssize_t shmem_listxattr(struct dentry *dentry, char *buffer, size_t size)
> +{
> + bool trusted = capable(CAP_SYS_ADMIN);
> + struct shmem_xattr *xattr;
> + struct shmem_inode_info *info;
> + size_t used = 0;
> +
> + info = SHMEM_I(dentry->d_inode);
> +
> + spin_lock(&dentry->d_inode->i_lock);
> + list_for_each_entry(xattr, &info->xattr_list, list) {
> + /* skip "trusted." attributes for unprivileged callers */
> + if (!trusted && xattr_is_trusted(xattr->name))
> + continue;
> +
> + used += strlen(xattr->name) + 1;
> + if (buffer) {
> + if (size < used) {
> + used = -ERANGE;
> + break;
> + }
> + strncpy(buffer, xattr->name, strlen(xattr->name) + 1);
> + buffer += strlen(xattr->name) + 1;
> + }
> + }
> + spin_unlock(&dentry->d_inode->i_lock);
> +
> + return used;
> +}
> #endif

#endif /* CONFIG_TMPFS_XATTR */

would be welcome by the time we get here.

Thanks,
Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/