Re: [PATCH] usercopy: use unsigned long instead of uintptr_t

From: Linus Torvalds
Date: Thu Jun 16 2022 - 12:00:16 EST


On Thu, Jun 16, 2022 at 8:21 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> I don't know why people call uintptr_t a "userspace type". It's a type
> invented by C99 that is an integer type large enough to hold a pointer.
> Which is exactly what we want here.

On the other hand, "unsigned long" has existed since the first version
of C, and is an integer type large enough to hold a pointer.

Which is exactly what we want here, and what we use everywhere else too.

The whole "uintptr_t handles the historical odd cases with pointers
that are smaller than a 'long'" is entirely irrelevant, since those
odd cases are just not a factor.

And the "pointers are _larger_ than a 'long'" case is similarly
irrelevant, since we very much want to use arithmetic on these things,
and they are 'long' later. They aren't used as pointers, they are used
as integer indexes into the virtual address space that we do odd
operations on.

Honestly, even if you believe in the 128-bit pointer thing, changing
one cast in one random place to be different from everything else is
*not* productive. We're never going to do some "let's slowly migrate
from one to the other".

And honestly, we're never going to do that using "uintptr_t" anyway,
since it would be based on a kernel config variable and be a very
specific type, and would need to be type-safe for any sane conversion.

IOW, in a hypothetical word where the address size is larger than the
word-size, it would have to be something like out "pte_t", which is
basically wrapped in a struct so that C implicit type conversion
doesn't bite you in the arse.

So no. There is ABSOLUTELY ZERO reason to ever use 'uintptr_t' in the
kernel. It's wrong. It's wrong *even* for actual user space interfaces
where user space might use 'uintptr_t', because those need to be
specific kernel types so that we control them (think for compat
reasons etc).

We use the user-space types in a few places, and they have caused
problems, but at least they are really traditional and the compiler
actually enforces them for some really standard functions. I'm looking
at 'size_t' in particular, which caused problems exactly because it's
a type that is strictly speaking not under our control.

'size_t' is actually a great example of why 'uintptr_t' is a horrid
thing. It's effectively a integer type that is large enough to hold a
pointer difference. On unsegmented architectures, that ends up being a
type large enough to hold a pointer.

Sound familiar?

And does it sound familiar how on some architectures it's "unsigned
int", and on others it is "unsigned long"? It's very annoying, and
it's been annoying over the years. The latest annoyance was literally
less than a week ago in 1c27f1fc1549 ("iov_iter: fix build issue due
to possible type mis-match").

Again, "unsigned long" is superior.

And the only reason to migrate away from it is because you want
something *type-safe*, which uintptr_t very much is not. As
exemplified by size_t, it's the opposite of type-safe. It's actively
likely to be type-confused.

Linus