Re: Compat syscall instrumentation and return from execve issue

From: Steven Rostedt
Date: Mon Nov 09 2015 - 16:12:24 EST


On Mon, 9 Nov 2015 12:57:06 -0800
Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

> > The solution I suggested wouldn't touch any asm code. The only change
> > would be to reserve the TS_EXECVE flag. Actually, come to think of it,
> > we could have Mathieu's TS_ORIG_COMPAT flag, and still only have the
> > tracepoint syscall set it, such that the matching tracepoint syscall
> > exit would know that the initial call was COMPAT or not.
>
> Someone needs to clear TS_EXECVE, though.

Well, it gets set and cleared at syscall enter (same for
TS_ORIG_COMPAT), and at exit for that matter.

It's trivial to have a tracepoint hook added when either system call
enter or exit tracepoints are enabled. Thus, the setting and clearing of
the flag can be done by another callback at those tracepoints.

>
> >
> > The goal is only to make sure that the system call exit tracepoint
> > matches the system call enter tracepoint.
> >
> > The system call enter would set or clear the TS_ORIG_COMPAT if the
> > TS_COMPAT is set when entering the system call, and it would check that
> > flag when exiting the system call.
>
> This seems a bit odd, though, since we aren't very good about
> preserving the syscall nr or the args through syscall processing. In
> any event, in the new improved x86 syscall code, we know what arch we
> are just by following the control flow, so no flags should be needed.
> Hence my suggestion of just adding an "unsigned int arch" to the
> return slowpath.

I guess I don't understand this "unsigned int arch".

When the execve system call is called, the task is running in x86_64
mode, and then the execve changes the state to ia32 mode. Then on
return, the system call exit tracepoint has the x86_64 system call
number, but if it checks to see what state the task is in, it will see
ia32 state, and then interpret the number as an ia32 one instead.

For example, in x86_64, execve is 59, and that number is passed to the
system call enter tracepoint. Now on return of the system call, the
system call exit tracepoint gets called with 59 as the system call as
well, but if that tracepoint checks the state, it will think it's
returning from the "olduname" system call (that's 59 for ia32).
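
The mismatch can be reproduced in a small user-space model. This is only
a sketch: the struct and every function name are hypothetical, the name
table covers just the one number discussed here, and the ia32 name is
the one used in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>	/* strcmp() for callers comparing the names */

/* Hypothetical model of the task state that matters for this example. */
struct exec_model {
	bool compat_now;	/* mode the task is in right now */
	bool compat_at_enter;	/* mode recorded by the enter tracepoint */
};

/* Name lookup for syscall number 59 only, per the text above. */
static const char *model_syscall_name(bool compat, int nr)
{
	if (nr != 59)
		return "unknown";
	return compat ? "olduname" : "execve";
}

/* Enter tracepoint: record the mode the system call was made in. */
static void model_trace_enter(struct exec_model *t)
{
	t->compat_at_enter = t->compat_now;
}

/* execve of an ia32 binary from a 64-bit task flips the mode. */
static void model_execve_to_ia32(struct exec_model *t)
{
	assert(!t->compat_now);
	t->compat_now = true;
}

/* Exit tracepoint that trusts the current state: decodes 59 wrongly. */
static const char *model_exit_naive(const struct exec_model *t, int nr)
{
	return model_syscall_name(t->compat_now, nr);
}

/* Exit tracepoint that uses the recorded state: matches the enter. */
static const char *model_exit_matched(const struct exec_model *t, int nr)
{
	return model_syscall_name(t->compat_at_enter, nr);
}
```

Calling model_trace_enter() and then model_execve_to_ia32() on a 64-bit
task makes model_exit_naive() decode 59 as "olduname", while
model_exit_matched() still decodes it as "execve".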

What change are you making to solve this?

-- Steve
