Re: ll_rw_blk.c BUG

From: Manfred Spraul
Date: Sat Jun 03 2000 - 14:12:51 EST

Andrea Arcangeli wrote:
> On Sat, 3 Jun 2000, Manfred Spraul wrote:
> >Note we have another race during file unmap, Kanoj Sarcar also tried to
> >fix that race.
> I'm not sure if we should define it a feature. The other race can't cause
> kernel instability or security issues.

Unfortunately it can cause kernel instability:
zap_page_range() frees pages before the TLB entries have been flushed on
all other CPUs.

I wrote:
> Because the second cpu can use a stale tlb entry. We must free the page
> _after_ the flush, not before the flush.
> Example:
> cpu1, cpu2: execute 2 threads from one process.
> cpu3: unrelated thread that allocates a new page.
> cpu1:
> 1) writes to one page in a tight loop. The tlb entry won't be discarded
> by the cpu without an explicit flush.
> cpu2:
> 2) sys_munmap()
> * zap_page_range(): calls free_page() for each page in the area.
> do_munmap() for a 500 MB block will take a few milliseconds.
> cpu3:
> 3) somewhere: get_free_page(). Now it gets a pointer to a page that cpu1
> still writes to.
> cpu2:
> 4) zap_page_range() returns, now the tlb is flushed.
> cpu1:
> 5) received the ipi, the local tlb is flushed, page fault.
> * but: cpu1 stomped on the page that was allocated by cpu3.
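
The five steps above can be replayed in a small single-threaded model. This is only a sketch, not kernel code: struct sim, alloc_page_sim(), free_page_sim(), cpu1_write() and simulate_unmap() are hypothetical names invented for illustration, the "TLB" is a single cached page number, and the allocator is assumed to hand out the lowest-numbered free page (so a page that was just freed is reused immediately).

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES  4
#define NO_PAGE (-1)

enum owner { OWNER_NONE, OWNER_PROCESS, OWNER_CPU3 };

/* Toy machine state; every field is an assumption of this model. */
struct sim {
    enum owner owner[NPAGES]; /* who currently owns each physical page */
    int pte;                  /* page table entry of the 2-thread process */
    int tlb_cpu1;             /* translation cached by cpu1 */
};

struct race_result {
    bool corrupted; /* cpu1 wrote into a page it no longer owns */
    bool faulted;   /* cpu1 eventually took a page fault */
};

/* Hand out the lowest-numbered free page, like a trivial allocator. */
static int alloc_page_sim(struct sim *s, enum owner who)
{
    for (int i = 0; i < NPAGES; i++) {
        if (s->owner[i] == OWNER_NONE) {
            s->owner[i] = who;
            return i;
        }
    }
    return NO_PAGE;
}

static void free_page_sim(struct sim *s, int page)
{
    s->owner[page] = OWNER_NONE;
}

/* cpu1 stores through its TLB; on a miss it walks the page table. */
static void cpu1_write(struct sim *s, struct race_result *r)
{
    int page = s->tlb_cpu1;

    if (page == NO_PAGE)
        page = s->pte;
    if (page == NO_PAGE) {
        r->faulted = true;
        return;
    }
    if (s->owner[page] != OWNER_PROCESS)
        r->corrupted = true;
}

struct race_result simulate_unmap(bool free_before_flush)
{
    struct sim s;
    struct race_result r = { false, false };

    for (int i = 0; i < NPAGES; i++)
        s.owner[i] = OWNER_NONE;

    /* 1) cpu1 has been writing to the page, so the entry is cached. */
    s.pte = alloc_page_sim(&s, OWNER_PROCESS);
    s.tlb_cpu1 = s.pte;

    /* 2) cpu2: munmap clears the pte, and (buggy order) frees at once. */
    int page = s.pte;
    s.pte = NO_PAGE;
    if (free_before_flush)
        free_page_sim(&s, page);

    /* 3) cpu3 allocates a page inside the window before the flush. */
    int p3 = alloc_page_sim(&s, OWNER_CPU3);
    assert(p3 != NO_PAGE);

    /* cpu1 is still looping and uses its stale TLB entry. */
    cpu1_write(&s, &r);

    /* 4) + 5) the flush reaches cpu1; the safe order frees only now. */
    s.tlb_cpu1 = NO_PAGE;
    if (!free_before_flush)
        free_page_sim(&s, page);

    /* The next write misses the TLB, finds no pte, and faults. */
    cpu1_write(&s, &r);
    return r;
}
```

In this model simulate_unmap(true) reports both corruption and a fault, while simulate_unmap(false) reports only the fault: with the page freed after the flush, the write inside the window still lands in a page the process owns.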

> I guess with sane userspace design
> the other race isn't an issue. Has anybody ever had fs corruption due to
> such a race in the ->unmap of a dirty shared mapping?

In the "zap_page_range() ..." thread, Jamie wrote:
> No, you can't ignore it. A variation called mprotect+access is used by
> garbage collection systems that expect to receive SEGVs when access is
> to a protected region.
> At very least, you'd have to document the race very clearly, and provide
> a workaround.
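
For reference, the mprotect+access pattern Jamie mentions looks roughly like the sketch below. It assumes Linux with anonymous mmap(); on_segv() and count_faults_on_protected_write() are hypothetical names, and calling mprotect() from a signal handler is common practice on Linux rather than something POSIX promises to be async-signal-safe. The sketch is single-threaded, so it only shows what such code expects (exactly one SEGV on the first write), not the race itself, which would need a second thread holding a stale TLB entry.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static char *guarded;                     /* start of the protected page */
static size_t guarded_len;                /* its length in bytes */
static volatile sig_atomic_t fault_count; /* SEGVs seen on that page */

/* Write barrier: count the fault, then let the faulting store retry. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;

    (void)sig;
    (void)ctx;
    if (addr < guarded || addr >= guarded + guarded_len)
        _exit(1); /* unrelated fault: give up */
    fault_count++;
    if (mprotect(guarded, guarded_len, PROT_READ | PROT_WRITE) != 0)
        _exit(2);
}

/* Returns the number of SEGVs taken, or -1 on setup failure. */
int count_faults_on_protected_write(void)
{
    struct sigaction sa, old;
    volatile char *p;
    int ok;

    guarded_len = (size_t)sysconf(_SC_PAGESIZE);
    guarded = mmap(NULL, guarded_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if ((void *)guarded == MAP_FAILED)
        return -1;
    guarded[0] = 1; /* touch the page so it is really mapped */

    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    fault_count = 0;
    if (mprotect(guarded, guarded_len, PROT_READ) != 0)
        return -1;

    p = guarded;
    p[0] = 2; /* expected to fault once; the handler unprotects */
    p[1] = 3; /* page is writable again, no second fault */

    sigaction(SIGSEGV, &old, NULL);
    ok = (p[0] == 2 && p[1] == 3);
    munmap(guarded, guarded_len);
    return ok ? (int)fault_count : -1;
}
```

A collector built this way depends on the first write after mprotect() always trapping. If another cpu could keep writing through a stale writable entry, the write would go through without the SEGV and the barrier would miss it.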


