Re: selftests/capabilities: test FAIL on linux mainline and linux-next and PASS on linux-4.4.70+

From: Eric W. Biederman
Date: Thu Jun 29 2017 - 12:34:13 EST


Andy Lutomirski <luto@xxxxxxxxxx> writes:

> On Thu, Jun 29, 2017 at 7:23 AM, Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>> ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:
>>
>>> Andy Lutomirski <luto@xxxxxxxxxx> writes:
>>>>
>>>> Hi Eric-
>>>>
>>>> This is rather odd. The selftest
>>>> (tools/testing/selftests/capabilities/test_execve), run as root, fails
>>>> on current kernels. The failure is worked around by this:
>>>>
>>>> diff --git a/tools/testing/selftests/capabilities/test_execve.c
>>>> b/tools/testing/selftests/capabilities/test_execve.c
>>>> index 10a21a958aaf..6db60889b211 100644
>>>> --- a/tools/testing/selftests/capabilities/test_execve.c
>>>> +++ b/tools/testing/selftests/capabilities/test_execve.c
>>>> @@ -139,8 +139,8 @@ static void chdir_to_tmpfs(void)
>>>> if (chdir(cwd) != 0)
>>>> err(1, "chdir to private tmpfs");
>>>>
>>>> - if (umount2(".", MNT_DETACH) != 0)
>>>> - err(1, "detach private tmpfs");
>>>> +// if (umount2(".", MNT_DETACH) != 0)
>>>> +// err(1, "detach private tmpfs");
>>>> }
>>>>
>>>> static void copy_fromat_to(int fromfd, const char *fromname, const
>>>> char *toname)
>>>>
>>>> I think this is due to the line:
>>>>
>>>> p->mnt_ns = NULL;
>>>>
>>>> in umount_tree(). The test is putting us into a situation in which
>>>> our cwd has ->mnt_ns = NULL, which is making it act as if it's nosuid.
>>>> I can imagine this breaking some weird user code (like my test!). Is
>>>> it a real problem, though?
>>
>> I just wanted to follow up and say this the mnt_may_suid test appears
>> to be doing exactly what it was designed to do.
>>
>> It's goal is not to allow a suid exec from another mount namespace and
>> in this test the umount2(".", MNT_DETACH) creates a poor man's mount
>> namespace.
>>
>> So assuming that we want to not allow execing executables from other
>> mount namespaces the behavior appears to be exactly correct in this
>> case.
>
> Fair enough. Given that the only known failure is my really wonky
> test case, I'll just fix my test.
>
> That being said, I do know of production code that uses MNT_DETACH:
> Sandstorm. It mounts a tmpfs, opens an fd to it, and MNT_DETACHes it.
> That gives it a directory that can't be seen by its children. Using
> mount namespaces for this would be awkward. Admittedly, MNT_DETACH is
> a bit of an awful way to do this -- what it really wants is the
> ability to set up a mount tree that logically belongs to its mount
> namespace but isn't bound anywhere. A couple years ago, we talked
> about adding an API for more or less that: first create a filesystem
> (i.e. superblock) and then bind it in if you want it bound.
>
> But Sandstorm still works, so this isn't a big deal.

If it proved desirable I think we could remove the check_mnt test in
mnt_may_suid. I believe the current_in_userns check actually handles
the namespace confusion case.

At the same time using check_mnt does make it easier to think about
these things. Which if it doesn't limit us in the real world is a plus.

Eric