Re: [PATCH] [iov_iter] use memmove() when copying to/from user page

From: Al Viro
Date: Tue May 16 2017 - 18:49:06 EST


On Tue, May 16, 2017 at 03:15:16PM -0700, Dmitry Vyukov wrote:
> > Because it's not going to be *one* call of memcpy() or memmove(). It's
> > one per page.
>
>
> I missed that.
>
> I assumed that in the case of sendfile from memfd to memfd data will
> be copied directly. But it goes through a pipe with multiple buffers.
> Does not look easily fixable.

Which leaves us only with "will nasal demons really fly there?".

Note, BTW, that memmove() guarantees in libc (and in kernel) are somewhat
weaker than what C99 promises - if the same page is mmapped at two
addresses, memmove() between the pointers in those areas does not guarantee
what 7.21.2.2 says ("Copying takes place as if the n characters from the
object pointed to by s2 are first copied into a temporary array of n
characters that does not overlap the objects pointed to by s1 and s2,
and then the n characters from the temporary array are copied into the
object pointed to by s1"). It's out of scope for C99, but SUS has memmove()
definition not only copying that from C99, but explicitly deferring to it
and saying that any differences are unintentional. And mmap() *is* within
the scope of SUS. There's no practical way to get C99-compliant behaviour
when such aliases are possible, of course - nothing short of bounce buffers
will do. The thing is, we can't assume their absence - the copying requested
in copy_to_iter/copy_from_iter can bloody well be between different virtual
addresses of the same page. Including the case when one of the aliases is
in kernel space and another - in userland. IOW, copy_from_user() and its
ilk really can have overlaps between source and destination. When called
by read() and write().
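
A minimal userland sketch of such an alias, for illustration only. It maps
one page of a scratch file at two addresses and checks that a store through
one pointer is visible through the other. It then does the copy through a
bounce buffer, which is the only variant whose result is pinned down here.
A plain memmove() between the aliases carries no such guarantee. The names
alias_demo() and bounce_copy() and the /tmp path are made up for the
example; nothing here is kernel code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copy via a temporary array, i.e. literally what
 * C99 7.21.2.2 describes.  Returns 0 on success, -1 on allocation failure. */
static int bounce_copy(void *dst, const void *src, size_t n)
{
	void *tmp = malloc(n);

	if (!tmp)
		return -1;
	memcpy(tmp, src, n);	/* snapshot the source first */
	memcpy(dst, tmp, n);	/* then store; aliasing can't hurt now */
	free(tmp);
	return 0;
}

/* Returns 1 if two distinct virtual addresses alias the same page and the
 * bounce-buffer copy gives the C99 result, 0 if not, -1 on setup failure. */
int alias_demo(void)
{
	char path[] = "/tmp/alias-demo-XXXXXX";	/* scratch file, assumption */
	long pagesz = sysconf(_SC_PAGESIZE);
	char *a, *b;
	int fd, ok;

	assert(pagesz >= 8);
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, pagesz) < 0) {
		close(fd);
		return -1;
	}
	/* Same file offset, two mappings: two addresses, one page. */
	a = mmap(NULL, pagesz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	b = mmap(NULL, pagesz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	if (a == MAP_FAILED || b == MAP_FAILED)
		return -1;

	memcpy(a, "abcdefgh", 8);
	ok = (a != b) && memcmp(b, "abcdefgh", 8) == 0;

	/* [a, a+7) and [b+1, b+8) do not overlap as address ranges, yet they
	 * share bytes 1..6 of the page.  A pointer comparison can't see that. */
	if (bounce_copy(b + 1, a, 7) < 0)
		ok = 0;
	else
		ok = ok && memcmp(a, "aabcdefg", 8) == 0;

	munmap(a, pagesz);
	munmap(b, pagesz);
	return ok;
}
```

Replacing bounce_copy() with memmove() in the above may or may not leave
"aabcdefg" in the page, depending on which direction the implementation
happens to copy in.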

Consider the case of write() from an mmapped piece of file to overlapping
piece of the same file. It is possible and not hard to trigger; all we
can guarantee is the lack of infoleaks, filesystem corruption or memory
corruption. File *contents* in the affected area can't be sanely relied
upon.
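
Something along these lines triggers it from userland. This is a sketch
under assumptions: the function name overlapping_write_demo(), the scratch
path and the offsets are all invented for the example. It maps a two-page
file and then pwrite()s from the start of the mapping to offset 1 of the
same file, so source and destination overlap in the page cache. It checks
only that the call succeeds and the file size is unchanged; it makes no
claim about the resulting contents.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 if the overlapping write went through and the file size is
 * unchanged, -1 otherwise.  File contents are deliberately not checked. */
int overlapping_write_demo(void)
{
	char path[] = "/tmp/overlap-demo-XXXXXX";	/* scratch file, assumption */
	long pagesz = sysconf(_SC_PAGESIZE);
	off_t size = 2 * (off_t)pagesz;
	struct stat st;
	ssize_t n;
	char *map;
	long i;
	int ret = -1;
	int fd;

	assert(pagesz > 1);
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, size) < 0)
		goto out;
	map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	/* Fill the file through the mapping with a recognizable pattern. */
	for (i = 0; i < size; i++)
		map[i] = 'A' + i % 26;

	/* Source: file bytes [0, pagesz) seen via the mapping.
	 * Destination: file bytes [1, pagesz + 1).  They overlap. */
	n = pwrite(fd, map, pagesz, 1);

	if (n > 0 && fstat(fd, &st) == 0 && st.st_size == size)
		ret = 0;
	munmap(map, size);
out:
	close(fd);
	return ret;
}
```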

This case is no different. BTW, neither SUS, nor our manpages for
write(2) mention these issues with mmap()-created aliases between the
source and destination.