Re: [RFC][PATCH] fix short copy handling in copy_mc_pipe_to_iter()

From: Al Viro
Date: Mon Jun 13 2022 - 18:47:51 EST


On Mon, Jun 13, 2022 at 10:54:36AM -0700, Linus Torvalds wrote:
> On Sun, Jun 12, 2022 at 5:10 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Unlike other copying operations on ITER_PIPE, copy_mc_to_iter() can
> > result in a short copy. In that case we need to trim the unused
> > buffers, as well as the length of partially filled one - it's not
> > enough to set ->head, ->iov_offset and ->count to reflect how
> > much had we copied. Not hard to fix, fortunately...
> >
> > I'd put a helper (pipe_discard_from(pipe, head)) into pipe_fs_i.h,
> > rather than iov_iter.c -
>
> Actually, since this "copy_mc_xyz()" stuff is going to be entirely
> impossible to debug and replicate for any normal situation, I would
> suggest we take the approach that we (long ago) used to take with
> copy_from_user(): zero out the destination buffer, so that developers
> that can't test the faulting behavior don't have to worry about it.
>
> And then the existing code is fine: it will break out of the loop, but
> it won't do the odd revert games and the "randomnoise.len -= rem"
> thing that I can't wrap my head around.
>
> Hmm?

Not really - we would need to zero the rest of those pages somehow.
They are already allocated and linked into the pipe; leaving them
there as they are is a non-starter, since the subsequent ones haven't
seen any stores whatsoever - they are fresh out of alloc_page(GFP_USER).

We could do allocation as we go, but that's a much more intrusive
change...

BTW, speaking of pipes:
static inline unsigned int pipe_space_for_user(unsigned int head, unsigned int tail,
					       struct pipe_inode_info *pipe)
{
	unsigned int p_occupancy, p_space;

	p_occupancy = pipe_occupancy(head, tail);
	if (p_occupancy >= pipe->max_usage)
		return 0;
	p_space = pipe->ring_size - p_occupancy;
	if (p_space > pipe->max_usage)
		p_space = pipe->max_usage;
	return p_space;
}

OK, if head - tail >= max_usage, we get 0. Fair enough, since
pipe_full() callers will get "it's full, sod off" in that situation.
But... what the hell is the rest doing? p_space is the number of
slots not in use. So we return the lesser of it and max_usage?

Suppose we have 128 slots in the ring, with max_usage being below
that (e.g. 64). 63 slots are in use; you can add at most one.
And p_space is 65, so this sucker will return 64.
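
To make the arithmetic concrete, here is a userspace sketch of the same
computation. struct pipe_model, model_occupancy(), model_space_for_user()
and check_example() are made-up names for illustration only, not kernel
code; the model keeps just the two fields the helper reads and assumes
occupancy is simply head - tail.

```c
#include <assert.h>

/* Hypothetical stand-in for the two fields of struct pipe_inode_info used here. */
struct pipe_model {
	unsigned int ring_size;	/* number of slots in the ring */
	unsigned int max_usage;	/* number of slots we are allowed to fill */
};

/* Assumed to mirror pipe_occupancy(): head and tail are free-running counters. */
static unsigned int model_occupancy(unsigned int head, unsigned int tail)
{
	return head - tail;
}

/* Same arithmetic as the pipe_space_for_user() quoted above. */
static unsigned int model_space_for_user(unsigned int head, unsigned int tail,
					 const struct pipe_model *pipe)
{
	unsigned int p_occupancy, p_space;

	p_occupancy = model_occupancy(head, tail);
	if (p_occupancy >= pipe->max_usage)
		return 0;
	p_space = pipe->ring_size - p_occupancy;
	if (p_space > pipe->max_usage)
		p_space = pipe->max_usage;
	return p_space;
}

/* The example from the text: 128-slot ring, max_usage 64, 63 slots in use. */
static void check_example(void)
{
	struct pipe_model pipe = { .ring_size = 128, .max_usage = 64 };

	/* 128 - 63 = 65, clamped to 64, although only one more slot may be used */
	assert(model_space_for_user(63, 0, &pipe) == 64);
}
```

Calling check_example() from a test main() passes, i.e. the model reports
64 free slots where only one can actually be added.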

Dave, could you explain what's going on there? Note that pipe_write()
does *not* use that thing at all; it's only splice (i.e. ITER_PIPE
stuff) that is using it.

What's wrong with
	p_occupancy = pipe_occupancy(head, tail);
	if (p_occupancy >= pipe->max_usage)
		return 0;
	else
		return pipe->max_usage - p_occupancy;

which would match the way you are using ->max_usage in pipe_write()
et al.? Including the use in copy_page_to_iter_pipe(), BTW...
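
For comparison, a userspace sketch of that simpler variant, run on the
same numbers as the example (max_usage 64, 63 slots in use).
proposed_space_for_user() and check_proposed() are hypothetical names;
max_usage is passed in directly rather than read from struct
pipe_inode_info, and occupancy is assumed to be head - tail.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the variant suggested above.
 * ring_size does not appear at all; the only limit is max_usage.
 */
static unsigned int proposed_space_for_user(unsigned int head, unsigned int tail,
					    unsigned int max_usage)
{
	unsigned int p_occupancy = head - tail;	/* assumed pipe_occupancy() */

	if (p_occupancy >= max_usage)
		return 0;
	return max_usage - p_occupancy;
}

static void check_proposed(void)
{
	/* 63 of 64 allowed slots in use: exactly one more can be added */
	assert(proposed_space_for_user(63, 0, 64) == 1);
	/* at the limit: nothing can be added */
	assert(proposed_space_for_user(64, 0, 64) == 0);
	/* empty pipe: everything up to max_usage is available */
	assert(proposed_space_for_user(0, 0, 64) == 64);
}
```

With this version the result plus the current occupancy never exceeds
max_usage, which is the property the pipe_full() check relies on.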