Re: [PATCH 4/4] FUSE: implement direct mmap

From: Tejun Heo
Date: Mon Jun 29 2009 - 12:59:31 EST


Hello, Andrew.

Andrew Morton wrote:
>> +static int fuse_mmap_commit_prep(struct fuse_conn *fc, struct fuse_req *req)
>> +{
>> + struct fuse_mmap_commit_in *commit_in = (void *)req->in.args[0].value;
>> + struct file *mfile = req->misc.mmap.file;
>> + int fd;
>> +
>> + if (!mfile)
>> + return 0;
>> +
>> + /* new mmap.file has been created, assign a fd to it */
>> + fd = commit_in->fd = get_unused_fd_flags(O_CLOEXEC);
>> + if (fd < 0)
>> + return 0;
>
> Shouldn't this be `return fd;'?

Non-zero return from prep aborts the command immediately and starts
processing the next command. However, when the fd assigning fails,
the failure should be communicated to the userland server so that it
can release any resource it might be holding for the mmap. So, the
failure state is recorded in commit_in->fd which and the prep callback
returns 0 so that the failure status can be transmitted to the
userland server.

>> + fmmap->vm_ops = *vma->vm_ops;
>> + vma->vm_ops = &fmmap->vm_ops;
>> + vma->vm_ops->open = fuse_vm_open;
>> + vma->vm_ops->close = fuse_vm_close;
>> + vma->vm_private_data = fmmap;
>> + err = 0;
>
> Yes, the vm_operations save/overwrite/restore is the stinky part here.
>
> I'm not completely clear what's going on here, and why, and what the
> overall implications and risks are. Some discussion in the changelog
> would help. With all that information we then would be better situated
> to discuss and perhaps solve <whatever problem this solves> by more
> pleasing means.
>
> I mean, if the need for this hack indicates that there is some
> shortcoming in core VM then perhaps we can improve core VM. Dunno.

Sorry about lack of explanation, in a nutshell, fuse direct mmap
implementation wants to use shmem but wants to know when each mmap is
opened and closed so that it can determine when the mmap is finally
released and tell the userland to release related resources. I think
the following two approaches would be more regular for situations like
this.

1. Export the necessary parts of shmem operations and build our fuse's
own vm_ops using necessary operations.

2. If vma can be nested, create wrapper vma around shmem vma and catch
open/close calls from there.

#1 might be the better solution here but with two separate shmem
implementations it was a bit awkward. #2 currently can't be done, so
I used bastardized form of #1. So, yeap, it's ugly.

Anyways, it looks like the whole fd fiddling and shmem vma trickery
might all go away, so please don't invest too much time into it.

>> + /* missing commit can cause resource leak on server side, don't fail */
>> + req = fuse_get_req_nofail(fc, file);
>
> That's an optimistically-named function.
>
> <looks>
>
> hm, it seems to duplicate mempool.c.

It's much more narrowly defined but yeah in a way.

>> + if (err == -ENOSYS) {
>> + /* Can't provide the coherency needed for MAP_SHARED */
>> + if (vma->vm_flags & VM_MAYSHARE)
>> + return -ENODEV;
>> +
>> + invalidate_inode_pages2(file->f_mapping);
>> +
>> + return generic_file_mmap(file, vma);
>
> Now what's happening on this path?
>
> We seem to have decided to fallback to a standard mmap of some form?
> Why? If the user's request failed, shouldn't we just fail the syscall?
> Hasn't the kernel just lied to userspace about what happened?
>
> Also, this code looks like it's duplicating the behaviour of core MM?
> If so then it needs to be kept in sync as core MM changes - that's a
> maintenance problem. Can it be improved by changing core MM?

That's from the original mmap implementation before the direct mmap is
added. It's just code movement. No functional change by this patch
on this path.

>> --- a/fs/fuse/fuse_i.h
>> +++ b/fs/fuse/fuse_i.h
>> @@ -272,6 +272,13 @@ struct fuse_req {
>> struct fuse_write_out out;
>> } write;
>> struct fuse_lk_in lk_in;
>> + struct {
>> + /** to move filp for mmap between client and server */
>
> Comment pretends to be kerneldoc?

Will fix.

>>
>> +/**
>
> kerneldoc? Does this work?

I have no idea. Will update.

>> + * Mmap flags
>> + *
>> + * FUSE_MMAP_DONT_COPY: don't copy the region on fork
>> + * FUSE_MMAP_DONT_EXPAND: can't be expanded with mremap()
>> + */
>> +#define FUSE_MMAP_DONT_COPY (1 << 0)
>> +#define FUSE_MMAP_DONT_EXPAND (1 << 1)
>> +

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/