Re: DRM lock ordering fix series

From: Eric Anholt
Date: Fri Mar 27 2009 - 16:12:59 EST


On Fri, 2009-03-27 at 19:10 +0100, Andi Kleen wrote:
> On Fri, Mar 27, 2009 at 09:36:45AM -0700, Eric Anholt wrote:
> > > > You are aware that there is a fast path now (get_user_pages_fast) which
> > > > is significantly faster? (but has some limitations)
> > >
> > > In the code I have, get_user_pages_fast is just a wrapper that calls the
> > > get_user_pages in the way that I'm calling it from the DRM.
> >
> > Ah, I see: that's a weak stub, and there is a real implementation. I
> > didn't know we could do weak stubs.
>
> The main limitation is that it only works for your current process,
> not another one. For more details you can check the git changelog
> that added it (8174c430e445a93016ef18f717fe570214fa38bf)
>
> And yes, it's only faster on architectures that support it; currently
> that's x86 and ppc.
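
For reference, the weak stub we were talking about is roughly the
following (paraphrasing the generic fallback in mm/util.c from memory,
so don't hold me to the exact signature): it just takes mmap_sem and
calls get_user_pages() on the current process, which is why it only
works for current.

#include <linux/mm.h>
#include <linux/sched.h>

int __attribute__((weak)) get_user_pages_fast(unsigned long start,
					      int nr_pages, int write,
					      struct page **pages)
{
	/* Generic fallback: no arch-specific fast walk, just the
	 * ordinary get_user_pages() path on the current process. */
	struct mm_struct *mm = current->mm;
	int ret;

	down_read(&mm->mmap_sem);
	ret = get_user_pages(current, mm, start, nr_pages,
			     write, 0, pages, NULL);
	up_read(&mm->mmap_sem);

	return ret;
}

Architectures that implement the fast version (x86 and ppc today)
override this with a lockless page-table walk, which is where the
speedup comes from.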

OK. I'm not too excited here -- saving 10% of the 2% of CPU time spent
in get_user_pages doesn't get me anywhere near the 10% loss that the
slow path added up to. Most of the cost is in k{un,}map_atomic of the
returned pages. If the gup somehow filled in the user's PTEs, I'd be
happy and always use that (since then I'd already have the mapping in
place and could just use it). But I think I can see why that can't be
done.
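
To put the kmap cost in context, the per-page pattern I mean is
basically this (a hand-waved sketch with a made-up helper name, not the
actual GEM code):

#include <linux/highmem.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/string.h>

/* Hypothetical sketch: copy out of pages pinned by get_user_pages().
 * The kmap_atomic/kunmap_atomic pair per page is where most of the
 * time goes, not the pinning itself. */
static void copy_from_pinned_pages(char *dst, struct page **user_pages,
				   loff_t user_offset, size_t len)
{
	while (len) {
		struct page *page = user_pages[user_offset >> PAGE_SHIFT];
		unsigned int off = offset_in_page(user_offset);
		size_t bytes = min_t(size_t, len, PAGE_SIZE - off);
		char *vaddr;

		vaddr = kmap_atomic(page, KM_USER0);
		memcpy(dst, vaddr + off, bytes);
		kunmap_atomic(vaddr, KM_USER0);

		dst += bytes;
		user_offset += bytes;
		len -= bytes;
	}
}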

I suppose I could rework this so that we call get_user_pages_fast
outside the lock, then walk the data doing copy_from_user_inatomic, and
fall back to kmap_atomic of the page list if we fault on the user's
address. It's still going to be a cost in our hot path, though, so I'd
rather not.
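
The shape of that rework would be something like this (hypothetical
helper names again; the real kernel helper is __copy_from_user_inatomic,
and copy_from_pinned_pages() is the sketch above), assuming the pages
were already pinned with get_user_pages_fast() before taking the lock:

#include <linux/mm.h>
#include <linux/uaccess.h>

/* Hypothetical: called with the lock held, user pages already pinned. */
static void copy_user_data(char *dst, const char __user *user_data,
			   struct page **user_pages, loff_t user_offset,
			   size_t len)
{
	unsigned long unwritten;

	/* Fast path: if the user PTEs are present, copy straight from
	 * the user mapping without faulting (and without any kmap of
	 * the pinned pages). */
	pagefault_disable();
	unwritten = __copy_from_user_inatomic(dst, user_data, len);
	pagefault_enable();
	if (!unwritten)
		return;

	/* Slow path: the in-atomic copy faulted, so fall back to the
	 * pages we pinned earlier and do the per-page kmap_atomic walk. */
	copy_from_pinned_pages(dst, user_pages, user_offset, len);
}

Even in the common case that's an extra gup_fast() per call, which is
exactly the hot-path cost I'd rather avoid.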

I'm working on a set of tests and microbenchmarks for GEM, so other
people will be able to play with this easily soon.

--
Eric Anholt
eric@xxxxxxxxxx eric.anholt@xxxxxxxxx

