Re: page fault scalability (ext3, ext4, xfs)

From: Andy Lutomirski
Date: Mon Aug 19 2013 - 19:31:56 EST


On Mon, Aug 19, 2013 at 4:23 PM, David Lang <david@xxxxxxx> wrote:
> On Fri, 16 Aug 2013, Dave Chinner wrote:
>
>> The problem with "not exported, don't update" is that files can be
>> modified on server startup (e.g. after a crash) or in short
>> maintenance periods when the NFS service is down. When the server is
>> started back up, the change number needs to indicate the file has
>> been modified so that clients reconnecting to the server see the
>> change.
>>
>> IOWs, even if the NFS server is not up or the filesystem not
>> exported we still need to update change counts whenever a file
>> changes if we are going to tell the NFS server that we keep them...
>
>
> This sounds like you need something more like relctime rather than noctime,
> something that updates the time in RAM but doesn't insist on flushing it to
> disk immediately, updating when convenient or when the file is closed.
>
> David Lang

I guess my patches could be extended to do this. In their current
form, when a pte dirty bit is transferred to a page (via page_mkclean
or unmap), the address_space is marked as needing a cmtime update. I
could add a mode in which even the normal write syscall path sets that
bit instead of immediately updating the timestamp. This could be a
nice speedup for non-mmap writers.

To avoid breaking things, operations like fsync would need to force a
cmtime flush. I doubt it would be okay for a write; fsync; write;
fsync sequence to leave the timestamp matching the first write.
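
The deferred-update idea and the fsync requirement can be sketched
as a small userspace model. This is only an illustration, not code
from the patches: struct toy_inode, toy_write, toy_flush_cmtime,
toy_fsync and the fake clock are all hypothetical names, and the
single flag stands in for the "needs cmtime update" mark on the
address_space.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode plus its address_space flag. */
struct toy_inode {
    long mtime;           /* last timestamp made visible to readers */
    bool cmtime_pending;  /* a write happened; the timestamp is stale */
};

/* Fake clock that advances by one on every call (assumption for the demo). */
static long toy_clock;

static long toy_now(void)
{
    return ++toy_clock;
}

/* Deferred mode: a write only records that a timestamp update is owed. */
static void toy_write(struct toy_inode *inode)
{
    toy_now();                    /* time passes while data is written */
    inode->cmtime_pending = true; /* defer the timestamp update */
}

/* Apply the owed update, if any. Returns true if the timestamp changed. */
static bool toy_flush_cmtime(struct toy_inode *inode)
{
    if (!inode->cmtime_pending)
        return false;
    inode->cmtime_pending = false;
    inode->mtime = toy_now();
    return true;
}

/* fsync must force the flush so each write; fsync pair is observable. */
static void toy_fsync(struct toy_inode *inode)
{
    toy_flush_cmtime(inode);
    assert(!inode->cmtime_pending);
    /* a real fsync would also write back data and metadata here */
}
```

In this model, calling toy_write; toy_fsync; toy_write; toy_fsync on
the same toy_inode leaves mtime later after the second fsync than
after the first, which is the property the deferred mode would have
to preserve.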

I'd rather get comments on the current form of my patches and maybe
get them merged before looking at even more far-reaching extensions,
though.

--Andy

--
Andy Lutomirski
AMA Capital Management, LLC