Re: selftests/capabilities: test FAIL on linux mainline and linux-next and PASS on linux-4.4.70+

From: Andy Lutomirski
Date: Thu Jun 29 2017 - 11:40:31 EST


On Thu, Jun 29, 2017 at 7:23 AM, Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
> ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:
>
>> Andy Lutomirski <luto@xxxxxxxxxx> writes:
>>>
>>> Hi Eric-
>>>
>>> This is rather odd. The selftest
>>> (tools/testing/selftests/capabilities/test_execve), run as root, fails
>>> on current kernels. The failure is worked around by this:
>>>
>>> diff --git a/tools/testing/selftests/capabilities/test_execve.c
>>> b/tools/testing/selftests/capabilities/test_execve.c
>>> index 10a21a958aaf..6db60889b211 100644
>>> --- a/tools/testing/selftests/capabilities/test_execve.c
>>> +++ b/tools/testing/selftests/capabilities/test_execve.c
>>> @@ -139,8 +139,8 @@ static void chdir_to_tmpfs(void)
>>> if (chdir(cwd) != 0)
>>> err(1, "chdir to private tmpfs");
>>>
>>> - if (umount2(".", MNT_DETACH) != 0)
>>> - err(1, "detach private tmpfs");
>>> +// if (umount2(".", MNT_DETACH) != 0)
>>> +// err(1, "detach private tmpfs");
>>> }
>>>
>>> static void copy_fromat_to(int fromfd, const char *fromname, const
>>> char *toname)
>>>
>>> I think this is due to the line:
>>>
>>> p->mnt_ns = NULL;
>>>
>>> in umount_tree(). The test is putting us into a situation in which
>>> our cwd has ->mnt_ns = NULL, which is making it act as if it's nosuid.
>>> I can imagine this breaking some weird user code (like my test!). Is
>>> it a real problem, though?
>
> I just wanted to follow up and say this the mnt_may_suid test appears
> to be doing exactly what it was designed to do.
>
> It's goal is not to allow a suid exec from another mount namespace and
> in this test the umount2(".", MNT_DETACH) creates a poor man's mount
> namespace.
>
> So assuming that we want to not allow execing executables from other
> mount namespaces the behavior appears to be exactly correct in this
> case.

Fair enough. Given that the only known failure is my really wonky
test case, I'll just fix my test.

That being said, I do know of production code that uses MNT_DETACH:
Sandstorm. It mounts a tmpfs, opens an fd to it, and MNT_DETACHes it.
That gives it a directory that can't be seen by its children. Using
mount namespaces for this would be awkward. Admittedly, MNT_DETACH is
a bit of an awful way to do this -- what it really wants is the
ability to set up a mount tree that logically belongs to its mount
namespace but isn't bound anywhere. A couple years ago, we talked
about adding an API for more or less that: first create a filesystem
(i.e. superblock) and then bind it in if you want it bound.

But Sandstorm still works, so this isn't a big deal.

--Andy