Re: fsync on large files

Andrea Arcangeli (andrea@e-mind.com)
Mon, 8 Feb 1999 00:14:19 +0100 (CET)


On Sun, 7 Feb 1999, Andrew Morgan wrote:

>We saw it here too. The problem was that the hash table for finding
>entries in the file buffer cache was not large enough to avoid a
>significant number of collisions on large memory machines.
>
>IIRC, fsync() worked by checking to see if any of the blocks associated
>with a file were both in memory and dirty - writing them to disk where
>necessary. The detail of the problem was that _every_ hash bucket was
>populated so we had to walk down a linked list looking for each block in
>the file. Most of the time we walked down the list to the end only to
>discover that the block was, after all, not in memory. Increasing the
>size of the buffer cache's hash table dramatically shortened the length
>of the linked lists and helped a great deal.
>
>Based on our experience with 2.2.1, this problem is no longer a
>mis-feature of the kernel. From memory (I've not got any notes) I think
>the place to look is in linux/fs/buffer.c, the HASH_PAGES value needs
>to be increased (128 might be a good thing to try). You might double
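The code path Andrew describes boils down to something like this toy
sketch (not the real fs/buffer.c, names and types simplified): hash on
(dev, block), then walk the collision chain. With too few buckets every
chain is long, and fsync() pays the full walk even for blocks that are
not cached at all.

#include <stddef.h>

#define NR_HASH 2048                    /* tiny table -> long chains */

struct buffer_head {
        unsigned short b_dev;
        unsigned long  b_blocknr;
        int            b_dirty;
        struct buffer_head *b_next;     /* hash collision chain */
};

static struct buffer_head *hash_table[NR_HASH];

static unsigned hashfn(unsigned short dev, unsigned long block)
{
        return (dev ^ block) % NR_HASH;
}

static struct buffer_head *find_buffer(unsigned short dev, unsigned long block)
{
        struct buffer_head *bh;

        for (bh = hash_table[hashfn(dev, block)]; bh; bh = bh->b_next)
                if (bh->b_dev == dev && bh->b_blocknr == block)
                        return bh;
        return NULL;    /* walked the whole chain only to miss */
}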

My ftp://e-mind.com/pub/linux/arca-tree/2.2.2-pre2_arca-1.gz allocates a
dynamic number of pages for the buffer hash table as a function of the
machine's total memory.

On 128 Mbytes of RAM it uses 32 pages.
On 256 Mbytes of RAM it uses 64 pages.

At bootup you'll see a printk that tells you how many pages are used for
the buffer hash.
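The sizing logic is roughly like this (only a sketch of the idea, not
the exact patch; the shift of 10 is what matches the numbers above,
i.e. one hash page per 1024 physical pages):

/* Sketch: scale the buffer hash with physical memory instead of using
 * a fixed HASH_PAGES.  num_physpages is the number of physical pages
 * (e.g. 32768 on a 128M box with 4k pages -> 32 hash pages). */
static unsigned long buffer_hash_pages(unsigned long num_physpages)
{
        unsigned long pages = num_physpages >> 10;

        if (pages < 1)
                pages = 1;
        return pages;
}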

I also just enlarged the page cache hash table from 11 bits to 13 bits.

I think that we should try to use RB-trees for both the buffer cache and
the page cache, because such things are very dynamic in size.
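Something along these lines (just a sketch of the lookup keyed on
(dev, block); the balancing and insertion code is omitted, and the real
thing would hang a tree off each device or inode):

#include <stddef.h>

struct bh_node {
        unsigned short b_dev;
        unsigned long  b_blocknr;
        struct bh_node *left, *right;   /* tree children */
};

static int bh_cmp(unsigned short dev, unsigned long block,
                  const struct bh_node *n)
{
        if (dev != n->b_dev)
                return dev < n->b_dev ? -1 : 1;
        if (block != n->b_blocknr)
                return block < n->b_blocknr ? -1 : 1;
        return 0;
}

/* O(log n) lookup no matter how much memory the box has, instead of a
 * fixed-size hash whose chains grow with the amount of cached data. */
static struct bh_node *bh_tree_find(struct bh_node *root,
                                    unsigned short dev, unsigned long block)
{
        while (root) {
                int c = bh_cmp(dev, block, root);
                if (!c)
                        return root;
                root = c < 0 ? root->left : root->right;
        }
        return NULL;
}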

Andrea Arcangeli
