Re: Linux 1.3.36: 'bforget' is flawed

Michael Elizabeth Chastain (mec@duracef.shout.net)
Mon, 6 Nov 1995 08:34:55 -0600


Linus Torvalds writes:

> _However_, the mmap exit code actually does things the wrong way. It
> _first_ releases the inode, and only _then_ it unmaps the (potentially
> shared) pages.

I was wondering about this! I will try re-ordering this today and see
if it helps. If it does, I will send a patch. Of course you may beat
me to it and your patches have more authority than mine. :)

> Actually, "bforget()" assumes that the buffers _can't_ be shared with
> memory maps, and that is enforced by "truncate()" etc with the
> "vmtruncate()" code. Similarly, when removing the file with "unlink()",
> the file is actually removed only after it is no longer used, so there
> cannot in theory be any shared pages.

Check me on this, please:

In ext2, I see the following call stack:

ext2_unlink
iput
ext2_put_inode
ext2_truncate
trunc_direct and trunc_idirect
bforget

'vmtruncate' is not called in this path.

But in this call stack, 'iput' does not call 'ext2_put_inode' until
'inode->i_count' goes to zero. So it's ok for 'ext2_put_inode' to call
'ext2_truncate' without calling 'vmtruncate'. And it's really bad for
'unmap_fixup' and 'exit_mmap' to be calling 'iput' while they still have
pages which could be shared with that inode in the buffer cache.

Michael Chastain
mec@duracef.shout.net
ftp://tsx-11.mit.edu/pub/linux/sources/usr.bin/mec-0.2.tar.gz