Re: Could MADV_REMOVE work with hugetlbfs?

From: Hugh Dickins
Date: Wed Nov 16 2011 - 21:18:50 EST


On Wed, 9 Nov 2011, Adam M. Costello wrote:

> madvise(MADV_REMOVE) punches a hole in a file (via truncate_range()).
> It's the only way (as far as I know) to return an arbitrary page of
> shared memory to the system. It's supported by tmpfs, but not by
> hugetlbfs.
>
> So a regular page of shared memory (from shmget() or from mmap() on
> a tmpfs file) can be returned, but a huge page of shared memory (from
> shmget(SHM_HUGETLB) or from mmap() on a hugetlbfs file) cannot be
> returned.
>
> Has anyone thought about adding support for truncate_range and
> MADV_REMOVE to hugetlbfs? How hard would that be?

My guess is that it would be awkward.  It should be straightforward
if hugetlbfs were like other filesystems, but it involves peculiar
hugepage reservation code (needed for reliable operation, since the
units it deals in are huge and few), which might well defeat your
purpose completely - even if you free the punched pages, it may still
need to reserve them for your future use, and hence no advantage;
or else make it difficult to implement in a consistent unbuggy way.
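
For reference, the tmpfs-side behaviour described in the quoted message
can be sketched roughly as below. This is only an illustration, not
kernel code: the helper name punch_shared_page_demo() and its return
convention are made up for this example, and it assumes a shared
anonymous mapping (which is shmem-backed) rather than a hugetlbfs one.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original mail).
 * Returns 0 if the middle page was released and reads back as zero,
 *         1 if madvise(MADV_REMOVE) was refused by this kernel/environment,
 *        -1 on setup failure or unexpected contents.
 */
int punch_shared_page_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    size_t len = 4 * (size_t)page;

    /* Shared anonymous memory is backed by shmem, like a tmpfs file. */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Touch every page so there is something to give back. */
    memset(p, 0xAB, len);

    /* Punch out the second page only; its neighbours must survive. */
    if (madvise(p + page, (size_t)page, MADV_REMOVE) != 0) {
        munmap(p, len);
        return 1;
    }

    /* The hole reads back as zeroes; the other pages keep their data. */
    int ok = p[page] == 0 && p[0] == 0xAB && p[2 * page] == 0xAB;

    munmap(p, len);
    return ok ? 0 : -1;
}
```

The same call on a hugetlbfs mapping is what the question is about, and
is expected to fail rather than free the huge page.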

(And regarding truncate_range: now that xfs and ext4 and others are
punching holes with fallocate(FALLOC_FL_PUNCH_HOLE), I do intend to
remove the truncate_range method in due course, wiring up
madvise(MADV_REMOVE) and tmpfs to use FALLOC_FL_PUNCH_HOLE instead.
The same functionality, but extended to other filesystems - so long
as it's simply done, not to complicate everybody's test matrix.)
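
From userspace, the fallocate() form of hole punching looks roughly
like the sketch below. The helper name punch_file_hole_demo(), the
/tmp location and the 4096-byte sizes are assumptions picked for the
example; whether the call succeeds depends on the filesystem behind
/tmp supporting FALLOC_FL_PUNCH_HOLE.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original mail).
 * Returns 0 if the hole was punched and reads back as zero,
 *         1 if the filesystem refused FALLOC_FL_PUNCH_HOLE,
 *        -1 on setup failure or unexpected contents.
 */
int punch_file_hole_demo(void)
{
    char path[] = "/tmp/punch_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* file disappears once fd is closed */

    /* Write four 4096-byte chunks of non-zero data. */
    char buf[4096];
    memset(buf, 0xAB, sizeof buf);
    for (int i = 0; i < 4; i++) {
        if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
            close(fd);
            return -1;
        }
    }

    /* PUNCH_HOLE must be combined with KEEP_SIZE: i_size is unchanged. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  4096, 4096) != 0) {
        close(fd);
        return 1;
    }

    /* The punched range reads as zero and the file length is preserved. */
    unsigned char byte = 0xFF;
    off_t size = lseek(fd, 0, SEEK_END);
    int ok = pread(fd, &byte, 1, 4096) == 1 && byte == 0 &&
             size == (off_t)(4 * 4096);

    close(fd);
    return ok ? 0 : -1;
}
```

This operates on a file descriptor and byte range, where MADV_REMOVE
operates on a mapped address range.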

Hugh