Inter-process send()/recv() using zero-copy ?

From: Xavier Roche
Date: Wed Sep 23 2009 - 02:09:16 EST


Hi folks,

I was wondering whether there is a way to get zero-copy send()/recv() when the socket is connected to the local machine (to another process on the same machine, for example)?

Such a feature would only be feasible with page-aligned blocks, from one mmap'ed block to another, I guess.

Typical case:

Process #1 (uid A)
buff = mmap(0, size, ..) /* anonymous or not */
...
send(s, buff, size, 0)
munmap(buff, size)

Process #2 (uid B)
buff = mmap(0, size, .. | MAP_ANONYMOUS, ..)
recv(s, buff, size, 0)

In an ideal fantasy world, the first process would use send() to transmit the complete page-aligned memory block to the other side, and the second process would use recv() to receive it into a similar anonymously mmap'ed block. The only thing the kernel would have to do is share the memory block between the two processes with copy-on-write.

In the real world, the same operation requires a first read of the whole memory block (possibly partially on disk) and a complete write (possibly partially on disk, too), leaving two copies of the same memory region at the end.
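To make the copying path concrete, here is a minimal sketch of the typical case above, using a socketpair() between a parent and a forked child in place of two unrelated processes. The helper name copy_via_socket and the 'x' fill pattern are only illustrative assumptions. Every byte of the source mapping is read by send() and written again by recv() into the receiver's own anonymous mapping:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS */
#endif
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: push len bytes (len > 0) from one anonymous mapping
 * to another through a local socket. Returns 0 if the receiver got the
 * expected pattern, -1 otherwise. */
int copy_via_socket(size_t len)
{
    int sv[2];
    int status = 0, rc = -1;
    size_t done = 0;
    char *src;
    pid_t pid;

    /* Process #1's page-aligned source block, filled with a known pattern. */
    src = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (src == MAP_FAILED)
        return -1;
    memset(src, 'x', len);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        munmap(src, len);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        munmap(src, len);
        return -1;
    }

    if (pid == 0) {
        /* Process #2: receive into its own anonymous mapping. */
        char *dst;
        size_t i;

        close(sv[0]);
        dst = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (dst == MAP_FAILED)
            _exit(1);
        while (done < len) {
            ssize_t n = recv(sv[1], dst + done, len - done, 0);
            if (n <= 0)
                _exit(1);
            done += (size_t)n;
        }
        for (i = 0; i < len; i++)
            if (dst[i] != 'x')
                _exit(1);
        _exit(0);
    }

    /* Process #1: send the whole block; the kernel copies it out of src. */
    close(sv[1]);
    while (done < len) {
        ssize_t n = send(sv[0], src + done, len - done, MSG_NOSIGNAL);
        if (n <= 0)
            break;
        done += (size_t)n;
    }
    close(sv[0]);
    munmap(src, len);

    /* Success only if the child exited cleanly after verifying the data. */
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        rc = 0;
    return rc;
}
```

Calling copy_via_socket(1 << 20), for instance, moves 1 MiB this way. The data ends up resident twice, once in each mapping, which is exactly the cost I would like to avoid.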

Two solutions can be used to emulate such a feature:

1. use a temporary mmap'ed file
- but requires a temporary file
- permissions for the file ? (not necessarily from the same UID)
- special case for local network block transmissions vs. machine-to-machine

2. use shared memory explicitly
- handling of permissions ? (ditto)
- special case for local network block transmissions vs. machine-to-machine
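For reference, the first workaround could look roughly like the sketch below. It is only an illustration under some assumptions: the helper name share_via_tmpfile and the /tmp/zc-demo-XXXXXX template are made up, a forked child stands in for the second process, and the memcpy() is there only to give the demo some data (a real producer would fill the mapping in place). Since mkstemp() creates the file with mode 0600, it also shows the permission problem: a process running under another UID could not open it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for mkstemp() and ftruncate() */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: hand len bytes (len > 0) to a child process through
 * a temporary mmap'ed file. Returns 0 if the child saw the same bytes,
 * -1 on any error or mismatch. */
int share_via_tmpfile(const void *data, size_t len)
{
    char path[] = "/tmp/zc-demo-XXXXXX";   /* hypothetical template */
    int status = 0, rc = -1;
    void *src;
    pid_t pid;
    int fd;

    fd = mkstemp(path);                    /* file is created with mode 0600 */
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0)
        goto out_close;

    /* Process #1: shared mapping of the file, filled with the payload. */
    src = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (src == MAP_FAILED)
        goto out_close;
    memcpy(src, data, len);

    pid = fork();
    if (pid == 0) {
        /* Process #2: open the file by name and map it privately, so any
         * later write on this side would be copy-on-write. */
        void *dst;
        int cfd = open(path, O_RDONLY);

        if (cfd < 0)
            _exit(1);
        dst = mmap(NULL, len, PROT_READ, MAP_PRIVATE, cfd, 0);
        if (dst == MAP_FAILED)
            _exit(1);
        _exit(memcmp(dst, data, len) == 0 ? 0 : 1);
    }

    /* Success only if the child exited cleanly after comparing the data. */
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        rc = 0;

    munmap(src, len);
out_close:
    close(fd);
    unlink(path);
    return rc;
}
```

A call such as share_via_tmpfile("hello", 6) returns 0 when the child finds the same bytes in its mapping. Both sides map the same page-cache pages, so no bulk copy happens between them, but the sender and the receiver must agree on the file out of band, which is the special-casing mentioned above.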

splice() and friends do not appear to help in this case, so I was wondering whether there is any chance of doing this?
