Re: [PATCH v4 1/2] vfio/type1: Remove locked page accounting workqueue

From: Peter Xu
Date: Mon Apr 17 2017 - 22:55:06 EST


On Mon, Apr 17, 2017 at 03:32:20PM -0600, Alex Williamson wrote:
> On Tue, 18 Apr 2017 01:02:12 +0530
> Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
>
> > On 4/18/2017 12:49 AM, Alex Williamson wrote:
> > > On Tue, 18 Apr 2017 00:35:06 +0530
> > > Kirti Wankhede <kwankhede@xxxxxxxxxx> wrote:
> > >
> > >> On 4/17/2017 8:02 PM, Alex Williamson wrote:
> > >>> On Mon, 17 Apr 2017 14:47:54 +0800
> > >>> Peter Xu <peterx@xxxxxxxxxx> wrote:
> > >>>
> > >>>> On Sun, Apr 16, 2017 at 07:42:27PM -0600, Alex Williamson wrote:
> > >>>>
> > >>>> [...]
> > >>>>
> > >>>>> -static void vfio_lock_acct(struct task_struct *task, long npage)
> > >>>>> +static int vfio_lock_acct(struct task_struct *task, long npage, bool lock_cap)
> > >>>>>  {
> > >>>>> -	struct vwork *vwork;
> > >>>>>  	struct mm_struct *mm;
> > >>>>>  	bool is_current;
> > >>>>> +	int ret;
> > >>>>>
> > >>>>>  	if (!npage)
> > >>>>> -		return;
> > >>>>> +		return 0;
> > >>>>>
> > >>>>>  	is_current = (task->mm == current->mm);
> > >>>>>
> > >>>>>  	mm = is_current ? task->mm : get_task_mm(task);
> > >>>>>  	if (!mm)
> > >>>>> -		return; /* process exited */
> > >>>>> +		return -ESRCH; /* process exited */
> > >>>>>
> > >>>>> -	if (down_write_trylock(&mm->mmap_sem)) {
> > >>>>> -		mm->locked_vm += npage;
> > >>>>> -		up_write(&mm->mmap_sem);
> > >>>>> -		if (!is_current)
> > >>>>> -			mmput(mm);
> > >>>>> -		return;
> > >>>>> -	}
> > >>>>> +	ret = down_write_killable(&mm->mmap_sem);
> > >>>>> +	if (!ret) {
> > >>>>> +		if (npage < 0 || lock_cap) {
> > >>>>
> > >>>> Nit: maybe we can avoid passing in lock_cap in all the callers of
> > >>>> vfio_lock_acct() and fetch it via has_capability() only if npage < 0?
> > >>>> IMHO that'll keep the vfio_lock_acct() interface cleaner, and we won't
> > >>>> need to pass in "false" any time when doing unpins.
> > >>>
> > >>> Unfortunately vfio_pin_pages_remote() needs to know about lock_cap
> > >>> since it tests whether the user is exceeding their locked memory
> > >>> limit. The other callers could certainly get away with
> > >>> vfio_lock_acct() testing the capability itself but that would add a
> > >>> redundant call for the most common user. I'm not a big fan of passing
> > >>> a lock_cap bool either, but it seemed the best fix for now. The
> > >>> cleanest alternative I can come up with is this (untested):
> > >>>
> > >>
> > >> In my opinion, passing 'bool lock_cap' looks much cleaner and simpler.
> > >>
> > >> Reviewed-by: Kirti Wankhede <kwankhede@xxxxxxxxxx>
> > >
> > > Well shoot, I was just starting to warm up to the bool*. I like that
> > > we're not presuming the polarity for the callers we expect to be
> > > removing pages and I generally just dislike passing fixed bool
> > > parameters to change the function behavior. I've cleaned it up a bit
> > > further and was starting to do some testing on this which I'd propose
> > > for v5. Does it change your opinion?
> >
> > If passing fixed bool parameter is the concern then I would lean towards
> > Peter's suggestion. vfio_pin_pages_remote() will check lock capability
> > outside vfio_lock_acct() and again in vfio_lock_acct(). At other places,
> > it will be taken care of within vfio_lock_acct().
>
> Sorry, I don't see that as a viable option. Testing for CAP_IPC_LOCK in
> both vfio_pin_pages_remote() and vfio_lock_acct() results in over a
> 10% performance hit on the mapping path with a custom micro-benchmark.
> In fact, it suggests we should probably pass that from even higher in
> the call stack. Thanks,

Sorry, I wasn't aware that such a change would cause that much of a
performance degradation. In that case I'm perfectly fine with either the
current patch or the new one you proposed (with bool *). Thanks,
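
For reference, the trade-off discussed above can be modelled roughly in
userspace. This is only a sketch, not kernel code: struct fake_mm,
fake_has_capability(), cap_checks, and the lock_acct_*/pin_remote_*
helpers are all hypothetical names, and the limit check is my assumption
about what the part of the patch not quoted here does. It counts
capability lookups per mapping call; it does not measure the actual cost
reported by the micro-benchmark.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of mm_struct that matter here. */
struct fake_mm {
	long locked_vm;	/* pages currently accounted as locked */
	long limit;	/* models the locked memory limit, in pages */
};

static int cap_checks;		/* number of simulated capability lookups */
static bool task_has_ipc_lock;	/* models the task holding CAP_IPC_LOCK */

/* Hypothetical stand-in for the capability test; counts every call. */
static bool fake_has_capability(void)
{
	cap_checks++;
	return task_has_ipc_lock;
}

/* Variant A: the caller passes in the capability it already looked up. */
static int lock_acct_passed(struct fake_mm *mm, long npage, bool lock_cap)
{
	assert(mm != NULL);
	if (!npage)
		return 0;
	/* Unpins and capable tasks skip the limit check (assumed). */
	if (npage > 0 && !lock_cap && mm->locked_vm + npage > mm->limit)
		return -ENOMEM;
	mm->locked_vm += npage;
	return 0;
}

/* Variant B: the accounting helper looks the capability up itself. */
static int lock_acct_self(struct fake_mm *mm, long npage)
{
	/* Only pins (npage > 0) need the capability. */
	bool lock_cap = npage > 0 && fake_has_capability();

	return lock_acct_passed(mm, npage, lock_cap);
}

/* Pin path, variant A: one lookup, result handed down. */
static int pin_remote_passed(struct fake_mm *mm, long npage)
{
	bool lock_cap = fake_has_capability();

	if (!lock_cap && mm->locked_vm + npage > mm->limit)
		return -ENOMEM;
	return lock_acct_passed(mm, npage, lock_cap);
}

/* Pin path, variant B: one lookup here, a second one in the helper. */
static int pin_remote_double(struct fake_mm *mm, long npage)
{
	bool lock_cap = fake_has_capability();

	if (!lock_cap && mm->locked_vm + npage > mm->limit)
		return -ENOMEM;
	return lock_acct_self(mm, npage);
}
```

In this model the variant that passes the capability down performs one
lookup per successful pin and the self-checking variant performs two,
while unpins through lock_acct_self() perform none.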

--
Peter Xu