[RFC][PATCH 0/6] IO pinning(get_user_pages()) vs fork race fix
From: KOSAKI Motohiro
Date: Tue Apr 14 2009 - 02:15:59 EST
Linux Device Drivers, Third Edition, Chapter 15: Memory Mapping and DMA says:
get_user_pages is a low-level memory management function, with a suitably complex
interface. It also requires that the mmap reader/writer semaphore for the address
space be obtained in read mode before the call. As a result, calls to get_user_pages
usually look something like:
down_read(&current->mm->mmap_sem);
result = get_user_pages(current, current->mm, ...);
up_read(&current->mm->mmap_sem);
The return value is the number of pages actually mapped, which could be fewer than
the number requested (but greater than zero).
But this isn't true. mmap_sem is not only used for vma traversal; it also prevents the race against fork.
up_read(mmap_sem) marks the end of the critical section. IOW, code that runs after up_read() is fork unsafe.
(access_process_vm() shows proper get_user_pages() usage.)
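To make the ordering problem concrete, here is a minimal user-space sketch. It is not kernel code: struct toy_rwsem, toy_fork_can_run() and the two io_*() helpers are hypothetical names invented for illustration, and the model is single-threaded with no real blocking. It assumes that fork needs mmap_sem in write mode, and only shows whether a fork could slip in while the I/O on the pinned page is still in flight.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy stand-in for mm->mmap_sem (single-threaded model). */
struct toy_rwsem {
	int readers;	/* number of read-mode holders */
};

static void toy_down_read(struct toy_rwsem *sem)
{
	sem->readers++;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/*
 * Models fork, assumed to need the lock in write mode.
 * Returns 1 if a fork could run right now, 0 if a reader holds it off.
 */
static int toy_fork_can_run(const struct toy_rwsem *sem)
{
	return sem->readers == 0;
}

/*
 * Pattern from the quoted text: the lock is dropped before the I/O.
 * Returns whether a fork could run while the I/O is in flight.
 */
static int io_after_up_read(struct toy_rwsem *sem, char *dst,
			    const char *page, size_t len)
{
	int fork_possible;

	toy_down_read(sem);
	/* get_user_pages() would pin the page here */
	toy_up_read(sem);

	fork_possible = toy_fork_can_run(sem);	/* window is unprotected */
	memcpy(dst, page, len);			/* stand-in for the I/O */
	return fork_possible;
}

/*
 * Pattern in the style of access_process_vm(): the lock stays held
 * until the I/O on the pinned page has finished.
 */
static int io_before_up_read(struct toy_rwsem *sem, char *dst,
			     const char *page, size_t len)
{
	int fork_possible;

	toy_down_read(sem);
	/* get_user_pages() would pin the page here */
	fork_possible = toy_fork_can_run(sem);	/* fork is held off */
	memcpy(dst, page, len);			/* stand-in for the I/O */
	toy_up_read(sem);
	return fork_possible;
}
```

In this model io_after_up_read() reports that a fork can run during the I/O, while io_before_up_read() reports that it cannot. The cost of the second pattern is that the lock is held for the whole duration of the I/O.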
Oh well, we have many wrong callers now. What is the best way to fix them?
Nick Piggin and Andrea Arcangeli proposed changing the get_user_pages() semantics to what callers expect;
see the "[PATCH] fork vs gup(-fast) fix" thread on linux-mm.
But Linus NACKed it.
Thus I made a patch series that takes the caller-change approach instead. It is meant for discussion and for comparison with Nick's approach.
I don't intend to submit it yet.
Nick, this version fixes the vmsplice and aio issues you pointed out. I hope to hear your opinion ;)
ChangeLog:
V2 -> V3
o remove early decow logic
o introduce prevent unmap logic
o fix nfs-directio
o fix aio
o fix bio (band-aid fix only)
V1 -> V2
o fix aio+dio case
TODO
o implement down_write_killable()
o fix kvm (needed?)
o fix get_arg_page() (why doesn't this function use mmap_sem?)