RE: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()

From: David Laight
Date: Thu Aug 17 2023 - 11:17:29 EST


From: Linus Torvalds
> Sent: Thursday, August 17, 2023 3:38 PM
>
> On Thu, 17 Aug 2023 at 10:42, David Laight <David.Laight@xxxxxxxxxx> wrote:
> >
> > Although I'm not sure the bit-fields really help.
> > There are 8 bytes at the start of the structure, might as well
> > use them :-)
>
> Actually, I wrote the patch that way because it seems to improve code
> generation.
>
> The bitfields are generally all set together as just plain one-time
> constants at initialization time, and gcc sees that it's a full byte
> write.

I've just spent too long on godbolt (again) :-)
Fiddling with:

#define t1 unsigned char
struct b {
	t1 b1:7;
	t1 b2:1;
};

void ff(struct b *, int);

void ff1(void)
{
	struct b b = {.b1 = 3, .b2 = 1};
	ff(&b, sizeof b);
}

gcc for x86-64 makes a pig's breakfast of it when the bitfields are 'char'
and loads the constant from memory using pc-relative access.
Otherwise pretty much all variants (with or without the bitfield)
get initialised in a single write.
(Although gcc seems to insist on loading a 32bit constant into %eax.)
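
For anyone who wants to check the 'one byte' part without reading
assembler, here is a minimal sketch. The names 'struct b_demo' and
'init_byte()' are made up for illustration, and it assumes a compiler
(such as gcc) that packs both 'unsigned char' bitfields into a single
byte. The bit order inside that byte is implementation-defined, so the
sketch only exposes the raw byte and does not claim a particular value.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical copy of the test structure: two bitfields in one byte. */
struct b_demo {
	unsigned char b1:7;
	unsigned char b2:1;
};

/* Assumption: the compiler packs both fields into a single byte. */
static_assert(sizeof(struct b_demo) == 1,
	      "both bitfields should share one byte");

/* Return the raw byte the initialiser produces (bit order is ABI-defined). */
static unsigned char init_byte(void)
{
	struct b_demo b = { .b1 = 3, .b2 = 1 };
	unsigned char raw;

	memcpy(&raw, &b, sizeof raw);
	return raw;
}
```

Since the whole structure is one byte with constant contents, the
initialisation can in principle be a single byte store.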

I can well imagine that keeping the constant below 32768 will help
on architectures that have to construct large constants.

> And the reason 'data_source' is not a bitfield is that it's not
> a constant at iov_iter init time (it's an argument to all the init
> functions), so having that one as a separate byte at init time is good
> for code generation when you don't need to mask bits or anything like
> that.
>
> And once initialized, having things be dense and doing all the
> compares with a bitwise 'and' instead of doing them as some value
> compare again tends to generate good code.
>
> Then being able to test multiple bits at the same time is just gravy
> on top of that (ie that whole "remove user_backed, because it's easier
> to just test the bit combination").

Indeed, they used to be bits but never got tested together.
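
A sketch of what testing the combination looks like. The flag names
and values below are hypothetical and are NOT the real iov_iter layout;
the only point is that one 'and' against a combined mask answers
"is any of these set?" in a single test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, NOT the real iov_iter definitions. */
#define DEMO_UBUF	0x01u
#define DEMO_IOVEC	0x02u
#define DEMO_KVEC	0x04u

/* The combined-mask test below relies on the flags being distinct bits. */
static_assert((DEMO_UBUF & DEMO_IOVEC) == 0, "flags must not overlap");

/* One bitwise 'and' tests both "user backed" kinds at once. */
static bool demo_user_backed(uint8_t flags)
{
	return (flags & (DEMO_UBUF | DEMO_IOVEC)) != 0;
}
```

With the kinds stored as an enumerated value instead, the same check
would typically need two compares.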

> > OTOH the 'nofault' and 'copy_mc' flags could be put into much
> > higher bits of a 32bit value.
>
> Once you start doing that, you often get bigger constants in the code stream.

I wasn't thinking of using 'really big' values :-)
Even 32768 can be an issue because some cpus sign-extend all constants.
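
To make the 32768 boundary concrete, a small sketch assuming a
hypothetical instruction format with a 16-bit sign-extended immediate
field. 'fits_simm16()' is a name invented here, not a kernel or
compiler helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A signed 16-bit field tops out one below 32768. */
static_assert(INT16_MAX == 32767, "int16_t is expected to be 16 bits");

/* True if v can be encoded in a sign-extended 16-bit immediate. */
static bool fits_simm16(int32_t v)
{
	return v >= INT16_MIN && v <= INT16_MAX;
}
```

So 32767 fits in such a field, but 32768 does not and has to be built
some other way (a different instruction or an extra load).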

David
