Re: [PATCH, RFC] check for frozen filesystems in the mmap path

From: Eric Sandeen
Date: Tue Apr 21 2009 - 11:16:04 EST


KOSAKI Motohiro wrote:
>
>> Index: linux-2.6/mm/memory.c
>> ===================================================================
>> --- linux-2.6.orig/mm/memory.c
>> +++ linux-2.6/mm/memory.c
>> @@ -1944,6 +1944,7 @@ static int do_wp_page(struct mm_struct *
>>  		 * read-only shared pages can get COWed by
>>  		 * get_user_pages(.write=1, .force=1).
>>  		 */
>> +		vfs_check_frozen(old_page->mapping->host->i_sb, SB_FREEZE_WRITE);
>>  		if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
>>  			struct vm_fault vmf;
>>  			int tmp;
>
> it seems strange.
>
> 1. it seems to have a race
>
> CPU0                          CPU1
> ----------------------------------------------------
> do_wp_page
>   vfs_check_frozen
>                               ioctl_fsfreeze
>                                 freeze_bdev
>                                   __fsync_super
> process touch mem
>
> vfs_check_frozen only waits for an unfreeze, but does not prevent a
> new freeze request from starting.

Well, I think that is ok. I don't *think* that any IO can actually
happen to the filesystem even if it gets dirtied via mmap, so if a bit
of mmap-dirtied memory sneaks in before it's actually frozen, I'm not
sure that's really a problem. The goal was to prevent massive amounts
of memory, backed by the frozen filesystem, from getting dirtied. That
could potentially lead to a situation where the unfreezing thread is
stuck waiting for memory to free up, the memory can only be freed by
writeout, writeout is waiting for the filesystem to unfreeze, and so we
can never unfreeze.
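
To make the window you describe concrete, here is a toy userspace
replay of that interleaving. It is only a sketch and not kernel code:
toy_sb, toy_check_passes, toy_freeze, toy_dirty_page and
toy_replay_race are all made-up names, and the check is modeled as a
plain comparison where the real vfs_check_frozen would sleep. The only
point is that the check holds nothing afterwards, so a freeze that
starts after it still sees one page dirtied behind it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the kernel definitions. */
enum toy_freeze_level { TOY_UNFROZEN = 0, TOY_FREEZE_WRITE = 1 };

struct toy_sb {
	enum toy_freeze_level frozen;	/* current freeze level */
	int dirtied_while_frozen;	/* pages dirtied after the freeze began */
};

/*
 * Models the check only: true means the caller would not have to sleep.
 * Nothing is held once this returns, so a later freeze is not excluded.
 */
static bool toy_check_passes(const struct toy_sb *sb,
			     enum toy_freeze_level level)
{
	assert(sb != NULL);
	return sb->frozen < level;
}

/* Stands in for the freeze request arriving on the other CPU. */
static void toy_freeze(struct toy_sb *sb)
{
	sb->frozen = TOY_FREEZE_WRITE;
}

/* Stands in for the process touching the mapped page. */
static void toy_dirty_page(struct toy_sb *sb)
{
	if (sb->frozen >= TOY_FREEZE_WRITE)
		sb->dirtied_while_frozen++;
}

/* Replays the interleaving; returns pages dirtied behind the freeze. */
static int toy_replay_race(void)
{
	struct toy_sb sb = { TOY_UNFROZEN, 0 };

	if (!toy_check_passes(&sb, TOY_FREEZE_WRITE))	/* CPU0: check */
		return -1;	/* would have slept instead */
	toy_freeze(&sb);				/* CPU1: freeze starts */
	toy_dirty_page(&sb);				/* CPU0: touch mem */
	return sb.dirtied_while_frozen;
}
```

In this model toy_replay_race() reports a single page slipping in,
which is the "bit of mmap-dirtied memory" I mean above: a bounded
leak per racing fault, not the unbounded dirtying the patch is trying
to stop.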

> 2. this logic kills multi-threaded applications.
>
> this logic means mmap_sem stays held until the unfreeze.
> it means other threads in the same process can't page-fault, even
> though they don't touch the frozen sb.
> it seems strange.

Hm, I hadn't thought about this ... On the one hand, ->page_mkwrite can
already sleep; on the other, a userspace freeze/unfreeze could
potentially take much, much longer. freeze/unfreeze *should* happen
very quickly, but nothing enforces that.

Do you have any suggestions?

Thanks,
-Eric