Re: [PATCH] x86: only use ERMS for user copies for larger sizes

From: Linus Torvalds
Date: Fri Jan 04 2019 - 21:38:59 EST


Coming back to this old thread, because I've spent most of the day
resurrecting some of my old core x86 patches, and one of them was for
the issue David Laight complained about: horrible memcpy_toio()
performance.

Yes, I should have done this before the merge window instead of at the
end of it, but I didn't want to delay things for yet another release
just because it fell through some cracks.

Anyway, it would be lovely to hear whether memcpy_toio() now works
reasonably. I just picked our very old legacy function for this, so it
will do things in 32-bit chunks (even on x86-64), and I'm certainly
open to somebody doing something smarter, but considering that nobody
else seemed to show any interest in this at all, I just went
"whatever, good enough".

I tried to make it easy to improve on things if people want to.
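
For readers who want to picture "32-bit chunks", here is a minimal
user-space sketch of the general idea. It is not the code from the patch:
the function name copy_to_io32() is hypothetical, and a plain volatile
pointer stands in for a real MMIO mapping, which kernel code would access
through the proper I/O accessors instead. It only shows the shape of the
loop: byte stores until the destination is aligned, then one 32-bit store
per step, then a byte tail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical illustration only, NOT the kernel's memcpy_toio().
 * Copies n bytes to a volatile destination, using aligned 32-bit
 * stores where possible and byte stores for the head and tail.
 */
static void copy_to_io32(volatile void *dst, const void *src, size_t n)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	/* Head: byte stores until the destination is 4-byte aligned. */
	while (n && ((uintptr_t)d & 3)) {
		*d++ = *s++;
		n--;
	}

	/* Body: exactly one 32-bit store per iteration. */
	while (n >= 4) {
		uint32_t w;

		memcpy(&w, s, sizeof(w));	/* the source may be unaligned */
		*(volatile uint32_t *)d = w;
		d += 4;
		s += 4;
		n -= 4;
	}

	/* Tail: the remaining 0 to 3 bytes, one at a time. */
	while (n) {
		*d++ = *s++;
		n--;
	}
}
```

The point of the volatile stores is that the compiler has to emit each
access at the width written here, so the body never degrades into a run of
single-byte transfers. A wider variant would presumably only need a
different body loop.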

The other ancient patch I resurrected was the old "use asm goto for
put_user" which has been sitting in a private branch for almost three
years.

I've tried to test this (it turns out I had completely screwed up the
32-bit case for put_user, for example), but I only have 64-bit user
space, so the i386 build ended up being just about building and
looking superficially at the code generation in a couple of places.

More testing and comments appreciated.

Now I have no ancient patches in any branches, or any known pending
issue. Except for all the pull requests that are piling up because I
didn't do them today since I was spending time on my own patches.

Oh well. There's always tomorrow.

Linus

On Mon, Nov 26, 2018 at 2:26 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Linus Torvalds
> > Sent: 23 November 2018 16:36
> ...
> > End result: we *used* to do this right. For the last eight years our
> > "memcpy_{to,from}io()" has been entirely broken, and apparently even
> > the people who noticed oddities, like David, never reported it as
> > breakage but instead just worked around it in drivers.
>
> I've mentioned it several times...
>
> Probably no one else noticed lots of single-byte transfers while
> testing a TLP monitor he was writing for an FPGA :-)
> They are far too expensive to buy, and would never be connected
> to the right system at the right time - so we (I) wrote one.
>
> Unfortunately we don't really get to see what happens when the
> link comes up (or rather doesn't come up). We only get the
> LTSSM state transitions.
>
> David