Re: [RFC] extending splice for copy offloading

From: Ric Wheeler
Date: Thu Sep 26 2013 - 17:28:02 EST


On 09/26/2013 02:55 PM, Zach Brown wrote:
On Thu, Sep 26, 2013 at 10:58:05AM +0200, Miklos Szeredi wrote:
On Wed, Sep 25, 2013 at 11:07 PM, Zach Brown <zab@xxxxxxxxxx> wrote:
A client-side copy will be slower, but I guess it does have the
advantage that the application can track progress to some degree, and
abort it fairly quickly without leaving the file in a totally undefined
state--and both might be useful if the copy's not a simple constant-time
operation.
I suppose, but can't the app achieve a nice middle ground by copying the
file in smaller syscalls? Avoid bulk data motion back to the client,
but still get notification every, I dunno, few hundred meg?
Yes. And "cp" could just be switched from a read+write syscall
pair to a single splice syscall using the same buffer size. Then
the user would only notice that things got faster in the case of a
server-side copy. No problems with long blocking times (at least not
much worse than they were).
Hmm, yes, that would be a nice outcome.
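
For illustration only, the "smaller syscalls" idea above might look roughly like the sketch below. It is not the interface proposed in this thread: it assumes Python 3.8+ on Linux and uses os.copy_file_range as a stand-in for an offload-capable copy call, falling back to a plain read+write pair with the same chunk size if that call is missing or fails. The name copy_in_chunks and its parameters are hypothetical.

```python
import os


def copy_in_chunks(src_path, dst_path, chunk_size=64 * 1024 * 1024, on_progress=None):
    """Copy src_path to dst_path one bounded chunk per call, reporting progress.

    Hypothetical helper: os.copy_file_range stands in for an offloaded copy.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src_fd, dst_fd = src.fileno(), dst.fileno()
        total = os.fstat(src_fd).st_size
        use_offload = hasattr(os, "copy_file_range")
        while copied < total:
            want = min(chunk_size, total - copied)
            n = 0
            if use_offload:
                try:
                    # In-kernel copy; the filesystem may offload or share blocks.
                    n = os.copy_file_range(src_fd, dst_fd, want, copied, copied)
                except OSError:
                    # Not supported here: use read+write from now on.
                    use_offload = False
            if not use_offload:
                data = os.pread(src_fd, want, copied)
                n = os.pwrite(dst_fd, data, copied) if data else 0
            if n == 0:
                break  # source shrank underneath us; stop instead of spinning
            copied += n
            if on_progress is not None:
                on_progress(copied, total)
    return copied
```

The caller gets control back after every chunk, so it can report progress or abort between calls, and no single call blocks for longer than one chunk takes.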

However "cp" doesn't do reflinking by default, it has a switch for
that. If we just want "cp" and the like to use splice without fearing
side effects then by default we should try to be as close to
read+write behavior as possible. No?
I guess? I don't find requiring --reflink hugely compelling. But there
it is.

That's what I'm really
worrying about when you want to wire up splice to reflink by default.
I do think there should be a flag for that. And if on the block level
some magic happens, so be it. It's not the fs developer's worry any
more ;)
Sure. So we'd have:

- no flag default that forbids knowingly copying with shared references
so that it will be used by default by people who feel strongly about
their assumptions about independent write durability.

- a flag that allows shared references for people who would otherwise
use the file system shared reference ioctls (ocfs2 reflink, btrfs
clone) but would like it to also do server-side read/write copies
over nfs without additional intervention.

- a flag that requires shared references for callers who don't want
giant copies to take forever if they aren't instant. (The qemu guys
asked for this at Plumbers.)

I think I can live with that.

- z

This last flag should not prevent a remote target device (NFS or SCSI array) copy from working, though, since they often do reflink-like operations inside the remote target device.

ric
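
One possible reading of the three modes discussed above, including the point about remote targets, is sketched below. Everything here is hypothetical: ShareMode, choose_strategy and the strategy names are invented for illustration, no such flags exist in any interface, and the precedence between reflink and offload is an assumption.

```python
import errno
from enum import Enum


class ShareMode(Enum):
    """Hypothetical names for the three behaviours discussed in the thread."""
    FORBID = "forbid"    # no flag: never knowingly create shared references
    ALLOW = "allow"      # shared references permitted, any kind of copy will do
    REQUIRE = "require"  # must be cheap: fail rather than copy bytes locally


def choose_strategy(mode, can_reflink, can_offload):
    """Pick a copy strategy for the given mode and (assumed) capabilities.

    can_reflink: the filesystem can share extents (e.g. btrfs clone).
    can_offload: a remote target (NFS server, SCSI array) can do the copy.
    """
    if mode is not ShareMode.FORBID and can_reflink:
        return "reflink"
    if can_offload:
        # The remote side may share blocks internally; the caller cannot tell,
        # so this is allowed in every mode, including REQUIRE.
        return "offload"
    if mode is ShareMode.REQUIRE:
        raise OSError(errno.EOPNOTSUPP, "no shared-reference or offloaded copy available")
    return "byte_copy"
```

Under this reading, choose_strategy(ShareMode.REQUIRE, False, True) still picks the offloaded copy, while the same call with both capabilities false fails instead of falling back to a slow byte copy.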

