Re: [RFC PATCH] f*xattr: allow O_PATH descriptors

From: Aleksa Sarai
Date: Tue Jun 21 2022 - 22:57:42 EST


On 2022-06-20, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > > > The goal would be that the semantics of fooat(<fd>, AT_EMPTY_PATH) and
> > > > foo(/proc/self/fd/<fd>) should always be identical, and the current
> > > > semantics of /proc/self/fd/<fd> are too leaky so we shouldn't always
> > > > assume that keeping them makes sense (the most obvious example is being
> > > > able to do tricks to open /proc/$pid/exe as O_RDWR).
> > >
> > > Please make a note that I have applications relying on current magic symlink
> > > semantics w.r.t setxattr() and other metadata operations, and the libselinux
> > > commit linked from the patch commit message proves that magic symlink
> > > semantics are used in the wild, so it is not likely that those semantics could
> > > be changed, unless userspace breakage could be justified by fixing a serious
> > > security issue (i.e. open /proc/$pid/exe as O_RDWR).
> >
> > Agreed. We also use magiclinks for similar TOCTOU-protection purposes in
> > runc (as does lxc) as well as in libpathrs so I'm aware we need to be
> > careful about changing existing behaviours. I would prefer to have the
> > default be as restrictive as possible, but naturally back-compat is
> > more important.
> >
> > > > I suspect that the long-term solution would be to have more upgrade
> > > > masks so that userspace can opt-in to not allowing any kind of
> > > > (metadata) write access through a particular file descriptor. You're
> > > > quite right that we have several metadata write AT_EMPTY_PATH APIs, and
> > > > so we can't retroactively block /everything/ but we should try to come
> > > > up with less leaky rules by default if it won't break userspace.
> > >
> > > Ok, let me try to say this in my own words using an example to see that
> > > we are all on the same page:
> > >
> > > - lsetxattr(PATH_TO_FILE,..) has inherent TOCTOU races
> > > - fsetxattr(fd,...) is not applicable for symbolic links
> >
> > While I agree with Christian's concerns about making O_PATH descriptors
> > more leaky, if userspace already relies on this through /proc/self/fd/$x
> > then there's not much we can do about it other than having an opt-out
> > available in openat2(2). Having the option to disable this stuff to
> > avoid making O_PATH descriptors less safe as a mechanism for passing
> > around "capability-less" file handles should make most people happy
> > (with the note that ideally we would not be *adding* capabilities to
> > O_PATH we don't need to).
> >
> > > - setxattr("/proc/self/fd/<fd>",...) is the current API to avoid TOCTOU races
> > > when setting xattr on symbolic links
> > > - setxattrat(o_path_fd, "", ..., AT_EMPTY_PATH) is proposed as a the
> > > "new API" for setting xattr on symlinks (and special files)
> >
> > If this is a usecase we need to support then we may as well just re-use
> > fsetxattr() since it's basically an *at(2) syscall already (and I don't
> > see why we'd want to split up the capabilities between two similar
> > *at(2)-like syscalls). Though this does come with the above caveats that
> > we need to have the opt-outs available if we're going to enshrine this
> > as intentional part of the ABI.
>
>
> Christian preferred that new functionality be added with a new API
> and I agree that this is nicer and more explicit.

Fair enough -- I misread the man page, setxattrat(2) makes more sense.

> The bigger question IMO is, whether fsomething() should stay identical
> to somethingat(,,,AT_EMPTY_PATH). I don't think that it should.
>
> To me, open(path,O_PATH)+somethingat(,,,AT_EMPTY_PATH) is identical
> to something(path) - it just breaks the path resolution and operation to two
> distinguished steps.
>
> fsomething() was traditionally used for "really" open fds, so if we don't need
> to, we better not relax it further by allowing O_PATH, but that's just one
> opinion.

Yeah, you're right -- it would be better to not muddle the two (even
though they are conceptually very similar).

> > > - The new API is going to be more strict than the old magic symlink API
> > > - *If* it turns out to not break user applications, old API can also become
> > > more strict to align with new API (unlikely the case for setxattr())
> > > - This will allow sandboxed containers to opt-out of the "old API", by
> > > restricting access to /proc/self/fd and to implement more fine grained
> > > control over which metadata operations are allowed on an O_PATH fd
> > >
> > > Did I understand the plan correctly?
> >
> > Yup, except I don't think we need setxattrat(2).
> >
> > > Do you agree with me that the plan to keep AT_EMPTY_PATH and magic
> > > symlink semantics may not be realistic?
> >
> > To clarify -- my view is that if any current /proc/self/fd/$n semantic
> > needs to be maintained then I would prefer that the proc-less method of
> > doing it (such as through AT_EMPTY_PATH et al) would have the same
> > capability and semantics. There are some cases where the current
> > /proc/self/fd/$n semantics need to be fixed (such as the /proc/$pid/exe
> > example) and in that case the proc-less semantics also need to be made
> > safe.
> >
> > While I would like us to restrict O_PATH as much as possible, if
> > userspace already depends on certain behaviour then we may not be able
> > to do much about it. Having an opt-out would be very important since
> > enshrining these leaky behaviours (which seem to have been overlooked)
> > means we need to consider how userspace can opt out of them.
> >
> > Unfortunately, it should be noted that due to the "magical" nature of
> > nd_jump_link(), I'm not sure how happy Al Viro will be with the kinds of
> > restrictions necessary. Even my current (quite limited) upgrade-mask
> > patchset has to do a fair bit of work to unify the semantics of
> > magic-links and openat(O_EMPTYPATH) -- expanding this to all *at(2)
> > syscalls might be quite painful. (There are also several handfuls of
> > semantic questions which need to be answered about magic-link modes and
> > whether for other *at(2) operations we may need even more complicated
> > rules or even a re-thinking of my current approach.)
>
> The question remains, regarding the $SUBJECT patch,
> is it fair to block it and deprive libselinux of a non buggy API
> until such time that all the details around masking O_PATH fds
> will be made clear and the new API implemented?
>
> There is no guarantee this will ever happen, so it does not seem
> reasonable to me.
>
> To be a reasonable reaction to the currently broken API is
> to either accept the patch as is or request that setxattrat()
> will be added to provide the new functionality.

Since the current functionality cannot be retroactively disabled as it
is being used already through /proc/self/fd/$n, adding
*xattrat(AT_EMPTY_PATH) doesn't really change what is currently possible
by userspace.

I would say we should add *xattrat(2) and then we can add an upgrade
mask blocking it (and other operations) later.

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

Attachment: signature.asc
Description: PGP signature