Re: [PATCH] mm/secretmem: use refcount_t instead of atomic_t

From: Dmitry Vyukov
Date: Thu Oct 21 2021 - 05:00:24 EST


On Tue, 24 Aug 2021 at 16:06, Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
> On Thu, Aug 19, 2021 at 10:33:49PM -0700, Kees Cook wrote:
> > On Fri, Aug 20, 2021 at 06:33:38AM +0200, Jordy Zomer wrote:
> > > When a secret memory region is active, memfd_secret disables
> > > hibernation. One of the goals is to keep the secret data from being
> > > written to persistent-storage.
> > >
> > > It accomplishes this by maintaining a reference count to
> > > `secretmem_users`. Once this reference is held your system can not be
> > > hibernated due to the check in `hibernation_available()`. However,
> > > because `secretmem_users` is of type `atomic_t`, reference counter
> > > overflows are possible.
> >
> > It's an unlikely condition to hit given max-open-fds, etc, but there's
> > no reason to leave this weakness. Changing this to refcount_t is easy
> > and better than using atomic_t.
> >
> > Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
> >
> > > As you can see there's an `atomic_inc` for each `memfd` that is opened
> > > in the `memfd_secret` syscall. If a local attacker succeeds in opening 2^32
> > > memfds, the counter will wrap around to 0. This implies that you may
> > > hibernate again, even though there are still regions of this secret
> > > memory, thereby bypassing the security check.
> >
> > IMO, this hibernation check is also buggy, since it looks to be
> > vulnerable to ToCToU: processes aren't frozen when
> > hibernation_available() checks secretmem_users(), so a process could add
> > one and fill it before the process freezer stops it.
> >
> > And of course, there's still the ptrace hole[1], which I think is quite
> > serious as it renders the entire defense moot.
>
> I thought about what can be done here and could not come up with anything
> better than preventing PTRACE on a process with secretmem, but this seems to
> me too much from usability vs security POV.
>
> Protecting against root is always hard and secretmem anyway does not
> provide 100% guarantee by itself but rather makes an accidental data leak
> or non-target attack much harder.
>
> To be effective it also presumes that other hardening features are turned
> on by the system administrator on production systems, so it's not
> unrealistic to rely on ptrace being disabled.

Hi,

The issue existed before this change, but I think refcount_inc needs
to be done before fd_install. After fd_install finishes, the fd can be
used by userspace and we can have secret data in memory before the
refcount_inc.
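
The ordering concern can be modelled in user space. This is only a
minimal sketch, not the kernel code: `users`, `model_fd_install()` and
`model_memfd_secret()` are hypothetical stand-ins, and the counter is a
plain C11 atomic rather than a real refcount_t. The model records the
count at the instant the fd becomes visible, which is where another
thread could start using it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for secretmem_users; not the kernel's refcount_t. */
static atomic_int users;

/* Count observed at the moment the fd became visible to userspace. */
static int users_seen_at_install;

/* Hypothetical model of fd_install(): from here on userspace can use the fd. */
static void model_fd_install(void)
{
    users_seen_at_install = atomic_load(&users);
}

/* Resets the model so each ordering starts with no users. */
static void model_reset(void)
{
    atomic_store(&users, 0);
    users_seen_at_install = -1;
}

/*
 * Models the tail of the syscall. inc_first selects the ordering:
 * true  = increment, then install (the suggested order)
 * false = install, then increment (the order discussed above)
 * Returns the count that was visible when the fd was installed.
 */
static int model_memfd_secret(bool inc_first)
{
    if (inc_first)
        atomic_fetch_add(&users, 1);
    model_fd_install();
    if (!inc_first)
        atomic_fetch_add(&users, 1);

    /* Either way the count ends at one once the call has finished. */
    assert(atomic_load(&users) == 1);
    return users_seen_at_install;
}
```

In this model, installing first leaves a window where the fd is usable
while the count is still 0; incrementing first closes that window.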

A straightforward misuse, where another thread predicts the returned fd
and uses it to store secret data before the syscall returns, is somewhat
dubious because such a user just shoots themselves in the foot.

But a more interesting misuse would be to close the predicted fd and
decrement the refcount before the corresponding refcount_inc. This way
one can briefly drop the refcount to zero while there are other users
of secretmem.
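
The close race can be replayed step by step in a small user-space
model. This is a sketch under stated assumptions, not kernel code:
`users`, `model_hibernation_allowed()` and `model_close_race()` are
hypothetical names, the counter is a plain C11 atomic, and the
interleaving is replayed sequentially in one thread rather than
produced by real concurrency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for secretmem_users; not the kernel's refcount_t. */
static atomic_int users;

/* Models the hibernation check: allowed only when there are no users. */
static bool model_hibernation_allowed(void)
{
    return atomic_load(&users) == 0;
}

/*
 * Replays the interleaving with one pre-existing secretmem user.
 * inc_first = true  models "increment, then install".
 * inc_first = false models "install, then increment".
 * Returns true if hibernation looked allowed while that other
 * user still existed.
 */
static bool model_close_race(bool inc_first)
{
    bool window;

    atomic_store(&users, 1);            /* someone else already holds secretmem */

    if (inc_first)
        atomic_fetch_add(&users, 1);    /* creator counts itself before publishing */

    /* The fd is installed here; a racing thread closes the predicted fd. */
    atomic_fetch_sub(&users, 1);
    window = model_hibernation_allowed();

    if (!inc_first)
        atomic_fetch_add(&users, 1);    /* late increment in the racy ordering */

    /* Both orderings end with only the pre-existing user counted. */
    assert(atomic_load(&users) == 1);
    return window;
}
```

With the late increment the count passes through zero (1, 0, 1) even
though the pre-existing user never went away; with the early increment
it goes 1, 2, 1 and the check never sees zero.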