Re: LBD/filesystems over 2TB: is it safe?

From: Peter Chubb
Date: Mon Mar 21 2005 - 19:38:18 EST


>>>>> "jniehof" == jniehof <jniehof@xxxxxx> writes:

jniehof> Someone posted to the LBD list last December regarding some
jniehof> supposedly horrible bugs in large filesystems:
jniehof> https://www.gelato.unsw.edu.au/archives/lbd/2004-December/000075.html
jniehof> https://www.gelato.unsw.edu.au/archives/lbd/2004-December/000074.html

The changes in those emails are irrelevant: they fail to take into
account the properties of the filesystems they modify, which guarantee
that the 32-bit quantities being shifted will not overflow.

They're typically of the form:
- iblock = index << (PAGE_CACHE_SHIFT - inode->i_blkbits);
+ iblock = (sector_t) index << (PAGE_CACHE_SHIFT - inode->i_blkbits);

Now, on a 32-bit processor with 4k pages, PAGE_CACHE_SHIFT is 12, and
i_blkbits is also 12 if you're using 4k blocks (which you need in order
to get a large filesystem). So the shift count is zero: the cast changes
nothing, and the existing code is safe. The on-disk format for ext[23]
uses 32-bit block numbers, so your maximum filesystem size is 16TB, and
your maximum value of iblock is 2^32-1.
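
To make the arithmetic concrete, here is a rough userspace sketch, not
kernel code. The function names are made up for illustration, uint32_t
stands in for the 32-bit page index on a 32-bit machine, uint64_t stands
in for a 64-bit sector_t, and PAGE_CACHE_SHIFT is hard-coded to the 4k
page case discussed above.

```c
#include <stdint.h>

/* Assumption: 4k pages, as on 32-bit x86. */
#define PAGE_CACHE_SHIFT 12

/* Unpatched form: shift in 32 bits, then widen the result. */
uint64_t iblock_uncast(uint32_t index, unsigned blkbits)
{
    uint32_t shifted = index << (PAGE_CACHE_SHIFT - blkbits);
    return shifted;
}

/* Patched form: widen to 64 bits first, then shift. */
uint64_t iblock_cast(uint32_t index, unsigned blkbits)
{
    return (uint64_t)index << (PAGE_CACHE_SHIFT - blkbits);
}

/* Bytes addressable with 32-bit block numbers of size 2^blkbits. */
uint64_t max_fs_bytes(unsigned blkbits)
{
    return (UINT64_C(1) << 32) << blkbits;
}
```

With blkbits = 12 the shift count is 0, so both forms return index
unchanged for every 32-bit index, and max_fs_bytes(12) is 2^44 bytes,
i.e. 16TB. The two forms only diverge with smaller blocks and an index
large enough that the shifted value no longer fits in 32 bits (for
example blkbits = 10 and index = 0x40000000), which is already beyond
what a 32-bit block number can express.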

Please do benchmark XFS and ext3 on your system before choosing. Our
tests (to be published at Linux.Conf.Au next month) show that XFS is
significantly faster for some workloads. Also, its scalability to very
large filesystems is much more mature than ext3's.

--
Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au
The technical we do immediately, the political takes *forever*