Re: [PATCH v2] creds: Convert cred.usage to refcount_t

From: Kees Cook
Date: Fri Aug 18 2023 - 14:47:28 EST


On Fri, Aug 18, 2023 at 10:55:42AM -0700, Andrew Morton wrote:
> On Thu, 17 Aug 2023 21:17:41 -0700 Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> > From: Elena Reshetova <elena.reshetova@xxxxxxxxx>
> >
> > atomic_t variables are currently used to implement reference counters
> > with the following properties:
> > - counter is initialized to 1 using atomic_set()
> > - a resource is freed upon counter reaching zero
> > - once counter reaches zero, its further
> > increments aren't allowed
> > - counter schema uses basic atomic operations
> > (set, inc, inc_not_zero, dec_and_test, etc.)
> >
> > Such atomic variables should be converted to a newly provided
> > refcount_t type and API that prevents accidental counter overflows and
> > underflows. This is important since overflows and underflows can lead
> > to use-after-free situation and be exploitable.
>
> ie, if we have bugs which we have no reason to believe presently exist,
> let's bloat and slow down the kernel just in case we add some in the
> future? Or something like that. dangnabbit, that refcount_t.
>
> x86_64 defconfig:
>
> before:
> text data bss dec hex filename
> 3869 552 8 4429 114d kernel/cred.o
> 6140 724 16 6880 1ae0 net/sunrpc/auth.o
>
> after:
> text data bss dec hex filename
> 4573 552 8 5133 140d kernel/cred.o
> 6236 724 16 6976 1b40 net/sunrpc/auth.o
>
>
> Please explain, in a non handwavy and non cargoculty fashion why this
> speed and space cost is justified.

Since it's introduction, refcount_t has found a lot of bugs. This is easy
to see even with a simplistic review of commits:

$ git log --date=short --pretty='format:%ad %C(auto)%h ("%s")' \
--grep 'refcount_t:' | \
cut -d- -f1 | sort | uniq -c
1 2016
15 2017
9 2018
23 2019
24 2020
18 2021
24 2022
10 2023

It's not really tapering off, either. All of these would have been silent
memory corruptions, etc. In the face of _what_ is being protected,
"cred" is pretty important for enforcing security boundaries, etc,
so having it still not protected is a weird choice we've implicitly
made. Given cred code is still seeing changes and novel uses (e.g.
io_uring), it's not unreasonable to protect it from detectable (and
_exploitable_) problems.

While the size differences look large in cred.o, it's basically limited
only to cred.o:

text data bss dec hex filename
30515570 12255806 17190916 59962292 392f3b4 vmlinux.before
30517500 12255838 17190916 59964254 392fb5e vmlinux.after

And we've even seen performance _improvements_ in some conditions:
https://lore.kernel.org/lkml/20200615005732.GV12456@shao2-debian/

Looking at confirmed security flaws, exploitable reference counting
related bugs have dropped significantly. (The CVE database is irritating
to search, but most recent refcount-related CVEs are due to counts that
are _not_ using refcount_t.)

I'd rather ask the question, "Why should we _not_ protect cred lifetime
management?"

-Kees

--
Kees Cook