Re: [PATCH 0/6] mlock: do not hold mmap_sem for extended periods of time

From: Michel Lespinasse
Date: Fri Dec 03 2010 - 18:02:45 EST


Hi David,

I forgot to add you on the original submission, but I think you'd be
best qualified to look at the two patches implementing
rwsem_is_contended()...

On Thu, Dec 2, 2010 at 4:16 PM, Michel Lespinasse <walken@xxxxxxxxxx> wrote:
> Currently mlock() holds mmap_sem in exclusive mode while the pages get
> faulted in. In the case of a large mlock, this can potentially take a
> very long time, during which various commands such as 'ps auxw' will
> block. This makes sysadmins unhappy:
>
> real    14m36.232s
> user    0m0.003s
> sys     0m0.015s
> (output from 'time ps auxw' while a 20GB file was being mlocked without
> being previously preloaded into page cache)
>
> I propose that mlock() could release mmap_sem after the VM_LOCKED bits
> have been set in all appropriate VMAs. Then a second pass could be done
> to actually mlock the pages, in small batches, releasing mmap_sem when
> we block on disk access or when we detect some contention.
>
> Patches are against v2.6.37-rc4 plus my patches to avoid mlock dirtying
> (presently queued in -mm).
>
> Michel Lespinasse (6):
>  mlock: only hold mmap_sem in shared mode when faulting in pages
>  mm: add FOLL_MLOCK follow_page flag.
>  mm: move VM_LOCKED check to __mlock_vma_pages_range()
>  rwsem: implement rwsem_is_contended()
>  mlock: do not hold mmap_sem for extended periods of time
>  x86 rwsem: more precise rwsem_is_contended() implementation

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.