Re: [PATCH 5/7] FUSE: implement ioctl support

From: Eric W. Biederman
Date: Fri Aug 29 2008 - 15:26:08 EST


Tejun Heo <tj@xxxxxxxxxx> writes:

> Miklos Szeredi wrote:
>> On Fri, 29 Aug 2008, Tejun Heo wrote:
>>> I first used 'server' for userland [FC]USE server but then I noticed
>>> there were places in FUSE where they were referred to as clients, so now I use
>>> 'client' for those and call the app using the FUSE fs the 'caller'.
>>> What are the established terms?
>>
>> Umm
>>
>> - userspace filesystem
>> - filesystem daemon
>> - filesystem process
>> - server
>>
>> Yes it's also a client of the fuse device, but that term is confusing.
>
> Okay, will do s/client/server/g
>
>>> Anyways, doing it directly from the server (or is it client) opens up a
>>> lot of new possibilities to screw up and I'd really much prefer staying
>>> in a similar ballpark with other operations. Maybe we can restrict it to
>>> two stages (query size & transfer) and linear consecutive ranges but
>>> then again adding retry doesn't contribute too much to the complexity.
>>> Oh.. and BTW, the in-ioctl length coding is not used universally, so it
>>> can't be depended upon.
>>
>> I know it's not universal, some horrors I've seen in the old wireless
>> interfaces. The question is: do we want to support such "extended"
>> ioctls? For example, does OSS have non-conformant ioctls?
>
> OSS ioctls are all pretty simple and I think they all use the proper
> encoding. For the question, my answer would be yes (naturally). It
> will suck later when implementing some other device only to find out
> that there's this one ioctl that needs to dereference a pointer but
> there's no supported way to do it but everything else works.
>
> I don't think the performance or the complexity of a specific ioctl
> implementation is the determining factor, as long as it can be made
> to work with minimal impact on the rest of the whole thing; hence
> the current retry implementation.
>
>>>>> Also, what about containers? How would it work then?
>>>> Dunno. Isn't there some transformation of pids going on, so that the
>>>> global namespace can access pids in all containers but under a
>>>> different alias? I do hope something like this works, otherwise it's
>>>> not only fuse that will break.
>>> I'm not sure either. Any idea who we should be asking about it?
>>
>> Serge Hallyn and Eric Biederman.
>
> Okay, cc'd both. Hello, Eric Biederman, Serge Hallyn. For
> implementing ioctl in FUSE, it's been suggested to access the address
> space of the caller directly from the FUSE server using its pid via
> /proc/pid/mem (or task/tid/mem). It's most likely that the calling
> process's tid will be used. As I don't know much about
> containers, I'm not sure how such an approach will play out when combined
> with containers. Can you enlighten us a bit? W/o containers, it will
> look like the following.
>
>
>                 FUSE ----------------
>                 ^                   |
>                 |                   |                   kernel
>        ------ ioctl ----------- /dev/fuse ------------
>                 |                   |                   userland
>                 |                   v
>          ---------------      -------------
>         | caller        |    | FUSE server |---> reads and writes
>         | with tid CTID |    |             |     /proc/PID/task/TID/mem
>          ---------------      -------------
>
> The FUSE server gets task->pid. IIUC, if the FUSE server is not in a
> container, task->pid should work fine whether the caller is in a
> container or not, right? And if the FUSE server is in a container,
> it's a hell of a lot more complex and FUSE may have to map task->pid to
> what the FUSE server would know, if possible?

Implementation-wise it is not too bad.

                FUSE ----------------
                     pid = get_pid(task_tid(current))
                ^                   |
                |                   |                   kernel
                              pid_vnr(pid)
       ------ ioctl ----------- /dev/fuse ------------
                |                   |                   userland
                |                   v
         ---------------      -------------
        | caller        |    | FUSE server |---> reads and writes
        | with tid CTID |    |             |     /proc/PID/task/TID/mem
         ---------------      -------------

However, it is largely an insane idea.
- Write is not implemented for /proc/PID/task/TID/mem.
- It would be better if the kernel handed you back a file descriptor to the
  other process's memory rather than you having to generate one.
- To access /proc/PID/task/TID/mem you need to have CAP_SYS_PTRACE.
- This seems to allow for random ioctls. With the compat_ioctl thing we have
  largely stomped on that idea. So you should only need to deal with
  well-defined ioctls. At which point, why do you need to directly access the
  memory of another process?
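
For reference, the access pattern being objected to looks roughly like the
following from the server side. This is only a sketch of the read half:
read_task_mem() is a hypothetical helper name, not anything from FUSE, and
it assumes a Linux /proc and a caller that is permitted to open the target's
mem file (trivially true when a process reads its own memory).

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read len bytes located at address addr in thread
 * tid of process pid, via /proc/PID/task/TID/mem.
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t read_task_mem(pid_t pid, pid_t tid, const void *addr,
                             void *dst, size_t len)
{
    char path[64];
    ssize_t n;
    int fd;

    snprintf(path, sizeof(path), "/proc/%d/task/%d/mem", (int)pid, (int)tid);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset is the virtual address in the target task. */
    n = pread(fd, dst, len, (off_t)(uintptr_t)addr);
    close(fd);
    return n;
}
```

Note that the server has to build the path from a pid it was handed, which
is exactly where the pid namespace translation and the privilege question
above come in.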

So why not support only well-defined ioctls, serialize them in the kernel,
and let the receiving process deserialize them?
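
A minimal userspace sketch of what that serialization could look like, for
an ioctl whose command number uses the standard direction/size encoding.
Everything named here (struct ioctl_msg_hdr, struct demo_arg,
DEMO_SET_FORMAT, and the two functions) is made up for illustration and is
not an existing FUSE interface; only the _IOC_* macros from <linux/ioctl.h>
are real.

```c
#include <linux/ioctl.h>   /* _IOW, _IOC_DIR, _IOC_SIZE, _IOC_WRITE */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire header placed in front of the argument payload. */
struct ioctl_msg_hdr {
    uint32_t cmd;       /* ioctl command number */
    uint32_t in_size;   /* bytes of argument data that follow */
};

/* Hypothetical well-defined ioctl carrying a fixed-size struct. */
struct demo_arg {
    uint32_t rate;
    uint32_t channels;
};
#define DEMO_SET_FORMAT _IOW('D', 1, struct demo_arg)

/*
 * "Kernel" side: copy the caller's argument into buf, using the size
 * encoded in cmd. Returns the message length, or -1 if buf is too small.
 */
static long serialize_ioctl(unsigned int cmd, const void *arg,
                            unsigned char *buf, size_t buflen)
{
    struct ioctl_msg_hdr hdr;

    hdr.cmd = cmd;
    /* _IOC_WRITE means the caller passes data in to the handler. */
    hdr.in_size = (_IOC_DIR(cmd) & _IOC_WRITE) ? _IOC_SIZE(cmd) : 0;

    if (buflen < sizeof(hdr) + hdr.in_size)
        return -1;

    memcpy(buf, &hdr, sizeof(hdr));
    if (hdr.in_size)
        memcpy(buf + sizeof(hdr), arg, hdr.in_size);
    return (long)(sizeof(hdr) + hdr.in_size);
}

/*
 * Server side: unpack the header and payload from a received message.
 * Returns the payload size, or -1 on a malformed or oversized message.
 */
static int deserialize_ioctl(const unsigned char *buf, size_t len,
                             unsigned int *cmd, void *arg, size_t arglen)
{
    struct ioctl_msg_hdr hdr;

    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));

    if (len - sizeof(hdr) < hdr.in_size || arglen < hdr.in_size)
        return -1;

    memcpy(arg, buf + sizeof(hdr), hdr.in_size);
    *cmd = hdr.cmd;
    return (int)hdr.in_size;
}
```

The server only ever sees a self-contained message, so it never needs to
touch the caller's address space.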

That would allow all of this to happen with a non-privileged server which
makes the functionality much more useful.

Given the pain it is to maintain ioctls I would be very surprised if we wanted
to open up that Pandora's box even wider by allowing arbitrary user space
processes to support random ioctls. How would you do 32/64-bit support
and the like?
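
To make the 32/64-bit problem concrete, here is a small sketch. The structs
and command names are hypothetical; they mimic an argument holding a pointer
and a length as a 32-bit and a 64-bit caller would lay it out. Because the
size is encoded in the command number, the two callers end up issuing
different commands for the "same" ioctl, and something has to translate
between the layouts.

```c
#include <linux/ioctl.h>   /* _IOW, _IOC_SIZE */
#include <stdint.h>
#include <string.h>

/* Hypothetical argument as a 64-bit caller lays it out. */
struct xfer64 {
    uint64_t buf;   /* user pointer, 8 bytes */
    uint64_t len;   /* unsigned long, 8 bytes */
};

/* The same argument as a 32-bit caller lays it out. */
struct xfer32 {
    uint32_t buf;   /* user pointer, 4 bytes */
    uint32_t len;   /* unsigned long, 4 bytes */
};

/* Same type and number, but the encoded size differs. */
#define XFER_CMD64 _IOW('X', 1, struct xfer64)
#define XFER_CMD32 _IOW('X', 1, struct xfer32)

/*
 * Hypothetical translation step: widen whichever layout the caller used
 * into the 64-bit form. Returns 0 on success, -1 for an unknown command.
 */
static int normalize_xfer(unsigned int cmd, const void *arg,
                          struct xfer64 *out)
{
    if (cmd == XFER_CMD64) {
        memcpy(out, arg, sizeof(*out));
        return 0;
    }
    if (cmd == XFER_CMD32) {
        struct xfer32 a;

        memcpy(&a, arg, sizeof(a));
        out->buf = a.buf;
        out->len = a.len;
        return 0;
    }
    return -1;
}
```

Every ioctl with a layout-dependent argument needs a translation like this
written by someone who knows the struct, which is hard to provide for
ioctls the kernel knows nothing about.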

Eric