Re: BIG files & file systems

From: Andrew Morton (
Date: Thu Aug 01 2002 - 15:33:44 EST

"Peter J. Braam" wrote:
> Hi,
> I've just been told that some "limitations" of the following kind will
> remain:
> page index = unsigned long
> ino_t = unsigned long
> Lustre has definitely been asked to support much larger files than
> 16TB. Also file systems with a trillion files have been requested by
> one of our supporters (you don't want to know who, besides I've no
> idea how many bits go in a trillion, but it's more than 32).
> I understand why people don't want to sprinkle the kernel with u64's,
> and arguably we can wait a year or two and use 64 bit architectures,
> so I'm probably not going to kick up a fuss about it.
> However, I thought I'd let you know that there are organizations that
> _really_ want to have such big files and file systems and get quite
> dismayed about "small integers". And we will fail to deliver on a
> requirement to write a 50TB file because of this.

I don't know about the ino_t thing, but as far as the pagecache
indices goes it's simply a matter of

- s/unsigned long/pgoff_t/ in a zillion places
- modify the radix tree code a bit
- make it all work
- convince Linus

Linus's objections are threefold: it expands struct page, 64 bit
arith is slow and gcc tends to get it wrong. And I would add "most
developers won't test 64-bit pgoff_t, and it'll get broken regularly".

The expansion of struct page and the performance impact is just a
cost which you'll have to balance against the benefits. For a few
people, 32-bit pagecache index is a showstopper and they'll accept that

Sprinkling `pgoff_t' everywhere is, IMO, not a bad thing - it aids code
readability because it tells you what the variable is used for.

As for broken gcc, well, the proponents of 64-bit pgoff_t would have
to work to identify the correct gcc version and generally get gcc
doing the right thing.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Wed Aug 07 2002 - 22:00:16 EST