Re: [PATCH] prctl: remove one-shot limitation for changing exe link

From: Eric W. Biederman
Date: Sat Jul 30 2016 - 13:45:03 EST


Cyrill Gorcunov <gorcunov@xxxxxxxxx> writes:

> On Mon, Jul 25, 2016 at 02:56:43PM -0500, Eric W. Biederman wrote:
> ...
>> >>
>> >> Also there is a big fat bug in prctl_set_mm_exe_file. It doesn't
>> >> validate that the new file is a actually mmaped executable. We would
>> >> definitely need that to be fixed before even considering removing the
>> >> limit.
>> >
>> > Could you please elaborate? We check for inode being executable,
>> > what else needed?
>>
>> That the inode is mmaped into the process with executable mappings.
>>
>> Effectively what we check the old mapping for and refuse to remove the old
>> mm_exe_file if it exists.
>>
>> I think a reasonable argument can be made that if the file is
>> executable, and it is mmaped with executable pages that exe_file is not
>> a complete lie.
>
> I might be missing something obvious, so sorry for the question --
> when criu setups old exe link the inode we obtain from file open
> is not mapped into memory, the old exe not read by anyone because
> it's not even executed anyhow. So I don't really understand which
> mapping we should check here. Mind to point me?

That sounds like an out and out bug that should not be preserved.
Of course we should mmap the executable and set it up so that it can be
executed (at least as much as the executable was previously mapped).
Anything else is a buggy restart, and lying to userspace.

>> Which is the important part. At the end of the day how much can
>> userspace trust /proc/pid/exe? If we are too lax it is just a random
>> file descriptor we can not trust at all. At which point there is
>> exactly no point in preserving it in checkpoint/restart, because nothing
>> will trust or look at it.
>
> You know, I think we should not trust exe link much, and in real we
> never could: this link is rather a hint pointing which executable a
> process has been using on execve call, once the process start working
> one can't be sure if the code currently running is exactly from the
> file pointed by exe link. It just a hint suitable for debuggin and
> obtain clean view of which processes are running on noncompromised
> system. Monitoring exe link change won't help much if there are
> malicious software running on the system.

But it is not just a hint. It is a record of which executable we called
execve on. Knowing which file was executed doesn't guarantee what is
running now but it provides a very strong hint.

At then end of a restart the state of a process should be (by
definition) exactly the state the process was before a checkpoint
and thus a state the original executable could have gotten into.

I admit it is possible for an application to unmap itself. I honestly
have not met that application (except perhaps criu).

>> If the only user is checkpoint/restart perhaps it should be only ptrace
>> that can set this and not the process itself with a prctl. I don't
>> know. All I know is that we should work on making it a very trustable
>> value even though in some specific instances we can set it.
>
> Since as I said I suppose nobody except us using this feature, we can
> setup some sysctl trigger for it (I personally think this is an
> overkill, but OTOH if people rely on the exe link and not going
> to use criu at all, this trigger will help).

Some clarity of thought came to me, and I apologise for not replying
sooner with it sooner.

My problem with the original patch submission is that it was
justifying changing prctl_set_mm_exe_file based on what
prctl_set_mm_exe_file does today. As prctl_set_mm_exe_file was added
for the checkpoint/restart case that is justifying changing code based
on a buggy implementation.

It is necessary to look at the ordinary situation. Without
prctl_set_mm_exe /proc/[pid]/exe can be counted on as a record
of which executable was last passed to execve. Furthermore the state of
a process can be counted on to be a state reachable from calling
execve on /proc/[pid]/exe.

Which means to preserve those expectations prctl_set_mm_exe_file should
in practice just be a nicer less cumbersome interface to things you can
already achieve with execve.

Justifying removale of the one-short nature for prctl_set_mm_exe_file
is as straight forward as noting that a process can call execve on
any executable file.

However when I compare the invariants that execve has on a file (such as
the executable being mmaped) I see some noticable disparities between
what prctl_set_mm_exe_file allows and what execve allows. With
prctl_set_mm_exe being less strict.

So what I am requesting is very simple. That the checks in
prctl_set_mm_exe_file be tightened up to more closely approach what
execve requires. Thus preserving the value of the /proc/[pid]/exe for
the applications that want to use the exe link.

Once the checks in prctl_set_mm_exe_file are tightened up please feel
free to remove the one shot test.

Eric