Re: Going up the call stack on a system call

Andi Kleen (ak@muc.de)
Sun, 21 Nov 1999 23:40:01 +0100


On Sun, Nov 21, 1999 at 11:04:37PM +0100, Karim Yaghmour wrote:
> Andi Kleen wrote:
> >
> > It'll cause a kernel oops when the address is not mapped.
> > You need the special exception handlers the *_user functions set up
> > to catch it.
> > Also accessing the user space directly this way is not portable (e.g.
> > it won't work on the m68k or sparc port, it only works by accident in the
> > current x86 port because the user space is 1:1 mapped)
> >
> > The rule is simple: for all user space accesses use *_user(). I don't know
> > why you don't want it.
>
> It's not a question of not wanting it. It's a question of
> understanding how the kernel works. That's all. I just
> wanted to make sure I understood all the implications.

Documentation/exception.txt is a good doc in how *_user works.

>
> You do make a point, though, this is highly unportable.

It is also unsafe and racy.

First *_user is an abstraction. It hides the specifics on how the user
space is mapped and accessed etc. from most of the kernel code. You should
not break that abstraction with a (very) good reason.

The other point of *_user on x86 is that it guards you against races
and bad addressed passed from the user. As soon as you access user mapped
memory the kernel may sleep and schedule in a page fault. While your
thread is sleeping another kernel thread that shares the same address space
could call munmap on any of the addresses you're trying to access. The
kernel would oops/crash then. *_user handles this case properly.

> I hadn't given it much thought since the code I showed
> you comes from a modified arch/i386/kernel/traps.c.
> Therefore, I didn't give the other architectures much
> thought, for now that is. Moreover, the function from
> which this code comes from is called by a modified
> arch/i386/kernel/entry.S. The bottom reason is that I
> don't have access to machines other than i386, for now.

Even on x86 it may not be safe, e.g. on a machine with SGI's big memory
patch applied it would not work (because they do mapping tricks in
*_user)

>
> I'll use the following code instead :
>
> /* Keep on going until we reach the end of the process' stack limit (wherever it may be) */
> while( (((unsigned long)stack > current->mm->start_data)
> &&((unsigned long)stack < current->mm->end_data))
> ||(((unsigned long)stack > current->mm->start_brk)
> &&((unsigned long)stack < current->mm->brk))
> ||(((unsigned long)stack > current->mm->start_stack - current->rlim[RLIMIT_STACK].rlim_cur)
> &&((unsigned long)stack < current->mm->start_stack)))
> {
> /* Get the current "address" off the stack, if it is addressable */
> if(get_user(addr, stack)) goto prepare_end;
>
> Is this OK?

Looks better, although I'm not sure why you're checking for the data segment.

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/