Re: [RFC] Heads up on sys_fallocate()

From: JÃrn Engel
Date: Mon Mar 05 2007 - 08:04:00 EST


On Mon, 5 March 2007 01:36:36 +0100, Arnd Bergmann wrote:
>
> Using the current glibc implementation on a compressed file system ideally
> should be a very expensive no-op because you won't actually allocate much
> space for a file when writing zeroes to it. You also don't benefit of a
> contiguous allocation in logfs, since flash has uniform seek times over
> all the medium.
>
> I'd suggest you implement posix_fallocate as an real nop and just return
> success without doing anything. You could also return ENOSPC in case
> the blocks requested by posix_fallocate don't fit on the medium without
> compression, but that is more or less just guesswork (like statfs is).

Quoting POSIX_FALLOCATE(3):
The function posix_fallocate() ensures that disk space is allocated for
the file referred to by the descriptor fd for the bytes in the range
starting at offset and continuing for len bytes. After a successful
call to posix_fallocate(), subsequent writes to bytes in the specified
range are guaranteed not to fail because of lack of disk space.

If the size of the file is less than offset+len, then the file is
increased to this size; otherwise the file size is left unchanged.

Afaics, the (main) purpose of this function is not to decrease
fragmentation but to ensure mmap() won't cause any problems because the
medium fills up. That problem exists for LogFS as well, once rw mmap()
is supported.

Simply returning success without doing anything would be a bug. -ENOSPC
is a better choice, but still a lame implementation. And falling back
on libc to write zeroes in a loop is an exercise in futility.

Does the allocation have to be persistent beyond lifetime of the file
descriptor? It would be fairly simple to support the write guarantee
while the file is open (or rather the inode remains cached) and drop it
afterwards.

JÃrn

--
"[One] doesn't need to know [...] how to cause a headache in order
to take an aspirin."
-- Scott Culp, Manager of the Microsoft Security Response Center, 2001
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/