Re: anon-vma (and now filebacked-mappings too) mprotect vma merging [Re: 2.6.5-rc2-aa vma merging]

From: Andrea Arcangeli
Date: Sat Apr 03 2004 - 21:00:51 EST


On Sat, Apr 03, 2004 at 05:43:12PM -0800, Andrew Morton wrote:
> Andrea Arcangeli <andrea@xxxxxxx> wrote:
> >
> > On Sat, Apr 03, 2004 at 05:13:58PM -0800, Andrew Morton wrote:
> > > That change was made for scheduling latency reasons, mainly due to huge
> > > pagetable walks in zap_page_range():
> > >
> > > i_shared_lock is held for a very long time during vmtruncate() and
> > > causes high scheduling latencies when truncating a file which is
> > > mmapped. I've seen 100 milliseconds.
> > >
> > > So turn it into a semaphore. It nests inside mmap_sem. This
> > > change is also needed by the shared pagetable patch, which needs to
> > > unshare pte's on the vmtruncate path - lots of pagetable pages need to
> > > be allocated and they are using __GFP_WAIT.
> > >
> > > If we no longer hold that lock during pte takedown then sure, it would
> > > be better if we had a spinlock in there.
> >
> > agreed, vmtruncate may still be very costly due the pte scan, so
> > I was probably wrong about not needing the semaphore anymore with
> > prio-tree (I was too objrmap centric, and after all objrmap with the
> > trylocks threats it like a spinlock anyways), though the semaphore
> > cannot help the latency of the non-contention case (I mean with preempt
> > disabled), that seem to have room for improvements with a cond_sched()
> > before returning from invalidate_mmap_range_list not present yet.
>
> <looks>

ok so it's implemented already, actually w/o prio-tree you'd waste some
cpu w/o schedule points if all vmas are out of range, with prio-tree
that's fixed.

>
> Actually I was a bit harsh to the !CONFIG_PREEMPT case in unmap_vmas():
>
> /* No preempt: go for the best straight-line efficiency */
> #if !defined(CONFIG_PREEMPT)
> #define ZAP_BLOCK_SIZE (~(0UL))
> #endif
>
> We should probably make that 1024*PAGE_SIZE or something like that.

agreed, more specifically we shouldn't differentiate the preempt from
non-preempt case, I'd suggest removing those CONFIG_PREEMPT there, it's
a misconception to think people disabling preempt cares about worst case
latencies less than people enabling preempt, infact the people disabling
preempt _only_ cares about the worst case latency ;)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/