Re: [PATCH 2/3] riscv: optimized memmove

From: Jisheng Zhang
Date: Wed Jan 31 2024 - 00:38:29 EST


On Tue, Jan 30, 2024 at 06:52:24PM +0200, Nick Kossifidis wrote:
> On 1/30/24 15:12, Jisheng Zhang wrote:
> > On Tue, Jan 30, 2024 at 01:39:10PM +0200, Nick Kossifidis wrote:
> > > On 1/28/24 13:10, Jisheng Zhang wrote:
> > > > From: Matteo Croce <mcroce@xxxxxxxxxxxxx>
> > > >
> > > > When the destination buffer is before the source one, or when the
> > > > buffers don't overlap, it's safe to use memcpy() instead, which is
> > > > optimized to use the largest data size possible.
> > > >
> > > > Signed-off-by: Matteo Croce <mcroce@xxxxxxxxxxxxx>
> > > > Reported-by: kernel test robot <lkp@xxxxxxxxx>
> > > > Signed-off-by: Jisheng Zhang <jszhang@xxxxxxxxxx>
> > >
> > > I'd expect to have memmove handle both fw/bw copying and then memcpy being
> > > an alias to memmove, to also take care when regions overlap and avoid
> > > undefined behavior.
> >
> > Hi Nick,
> >
> > Here is something from man memcpy:
> >
> > "void *memcpy(void dest[restrict .n], const void src[restrict .n],
> > size_t n);
> >
> > The memcpy() function copies n bytes from memory area src to memory area dest.
> > The memory areas must not overlap. Use memmove(3) if the memory areas do over‐
> > lap."
> >
> > IMHO, the "restrict" implies that there's no overlap. If overlap
> > happens, the manual doesn't say what will happen.
> >
> > On the other hand, I have a concern: currently, other archs don't have
> > this alias behavior, IIUC (at least, per my understanding of the arm and arm64
> > memcpy implementations); they just copy forward. I want to keep similar behavior
> > for riscv.
> >
> > So I want to hear more before going in the alias-memcpy-to-memmove direction.
> >
> > Thanks
>

Hi Nick,

> If you read Matteo's original post that was also his suggestion, and Linus

I did read all of the discussion on Matteo's v1 through v5 before reviving
this series. As I understand it, Matteo was also concerned that no other
arch's implementation has this memcpy-aliased-to-memmove behavior.

> has also commented on that. In general it's better to handle the case where

Linus commented on https://bugzilla.redhat.com/show_bug.cgi?id=638477#c132
about glibc aliasing memcpy to memmove, not about this patch series.

> the regions provided to memcpy() overlap than to resort to "undefined
> behavior", I provided a backwards copy example that you can use so that we
> can have both fw and bw copying for memmove(), and use memmove() in any
> case. The [restrict .n] in the prototype is just there to say that the size
> of src is restricted by n (the next argument). If someone uses memcpy() with

I don't have the C99 spec at hand, but I found GCC's explanation of the
restrict keyword in [1]:

"the restrict declaration promises that the code will not access that
object in any other way--only through p."

So if the regions passed to memcpy overlap, that contradicts what
restrict implies.

[1] https://www.gnu.org/software/c-intro-and-ref/manual/html_node/restrict-Pointers.html

And per the manual, memcpy users must ensure that "The memory areas
must not overlap." So I think all of the Linux kernel's memcpy implementations
(which only copy forward and don't take overlap into consideration) are right.
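
To make the forward-only behavior concrete, here is a small userspace sketch.
It is not code from this series: copy_fw() and overlap_demo() are made-up
names, and the byte loop only stands in for a memcpy that copies forward
without checking for overlap. It shows what such a copy does when dest lies
inside the source region, next to what memmove() does for the same call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: forward-only byte copy that ignores overlap. */
void *copy_fw(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dest;
}

/* Shift the first 4 bytes of "abcdefgh" right by 2 inside the same buffer. */
void overlap_demo(char fw[9], char mv[9])
{
	strcpy(fw, "abcdefgh");
	strcpy(mv, "abcdefgh");

	/* dest is inside [src, src + n): source bytes get overwritten before they are read */
	copy_fw(fw + 2, fw, 4);

	/* memmove() is specified to handle overlapping regions */
	memmove(mv + 2, mv, 4);
}
```

With this sketch the forward copy leaves "abababgh" in the buffer, while
memmove() leaves "ababcdgh".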

I have seen memcpy aliased to memmove in some libc implementations, but
that is not the style of the kernel's current implementations.
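
For reference, the direction-selecting memmove() that the alias approach
depends on can be sketched roughly as below. This is only an illustration
under my own assumptions, not the code from the series or Nick's example:
sketch_memmove() is a made-up name, and plain byte loops stand in for the
optimized forward copy (memcpy() in the patch description) and for a
backward copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a byte-wise illustration only. */
void *sketch_memmove(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	uintptr_t da = (uintptr_t)dest, sa = (uintptr_t)src;

	/*
	 * dest is at or below src, or the regions don't overlap at all:
	 * a forward copy never overwrites a source byte that is still
	 * to be read.
	 */
	if (da <= sa || da - sa >= n) {
		while (n--)
			*d++ = *s++;
		return dest;
	}

	/* dest lies inside [src, src + n): copy from the end backwards. */
	while (n--)
		d[n] = s[n];
	return dest;
}
```

Aliasing memcpy to a function like this would make overlapping calls behave
like memmove(), at the cost of the extra direction check on every call.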

Given that the current riscv asm implementation also doesn't do the alias and
only copies forward, and that this series improves performance and doesn't introduce the
Is it better to split this into two steps? First, merge this series
if there's no obvious bug; second, do the alias as you suggested.
Since you have a basic implementation, you could even submit your patch
;) What do you think about this two-step solution?

Thanks
> overlapping regions, which is always a possibility, in your case it'll
> result corrupted data, we won't even get a warning (still counts as
> undefined behavior) about it.
>
> Regards,
> Nick
>