Re: [patch 05/11] syslets: core code

From: Ingo Molnar
Date: Tue Feb 13 2007 - 17:44:30 EST



* Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:

> > > > + if (!access_ok(VERIFY_WRITE, arg_ptr, sizeof(*arg_ptr)))
> > > > + return -EFAULT;
> > >
> > > It's a little unclear why you do that many individual access_ok()s.
> > > And why is the target constant sized anyways?
> >
> > each indirect pointer has to be checked separately, before dereferencing
> > it. (Andrew pointed out that they should be VERIFY_READ, i fixed that in
> > my tree)
>
> But why only constant sized? It could be a variable length object,
> couldn't it?

i think what you might be missing is that it's only the 6 syscall
arguments that are fetched via indirect pointers - security checks are
then done by the system calls themselves. It's a bit awkward to think
about, but it is surprisingly clean in the assembly, and it simplified
syslet programming too.

> > get_user_pages() would have to be limited in some way - and i didnt
> > want
>
> If you only use it for a small ring buffer it is naturally limited.

yeah, but 'small' is a dangerous word when it comes to adding IO
interfaces ;-)

> > a single page is enough for 1024 completion pointers - that's more
> > than enough for most purposes - and the default mlock limit is 40K.
>
> Then limit it to a single page and use gup

1024 (512 on 64-bit) is alot but not ALOT. It is also certainly not
ALOOOOT :-) Really, people will want to have more than 512
disks/spindles in the same box. I have used such a beast myself. For Tux
workloads and benchmarks we had parallelism levels of millions of
pending requests (!) on a single system - networking, socket limits,
disk IO combined with thousands of clients do create such scenarios. I
really think that such 'pinned pages' are a pretty natural fit for
sys_mlock() and RLIMIT_MEMLOCK, and since the kernel side is careful to
use the _inatomic() uaccess methods, it's safe (and fast) as well.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/