Re: [PATCH v4 1/2] hugetlb: use f_mode & FMODE_HUGETLBFS to identify hugetlbfs files

From: Amir Goldstein
Date: Sat Jun 13 2020 - 02:53:50 EST


> > Incidentally, can a hugetlbfs be a lower layer, while the upper one
> > is a normal filesystem? What should happen on copyup?
>
> Yes, that seems to work as expected. When accessed for write the hugetlb
> file is copied to the normal filesystem.
>
> The BUG found by syzbot actually has a single hugetlbfs as both lower and
> upper. With the BUG 'fixed', I am not exactly sure what the expected
> behavior is in this case. I may be wrong, but I would expect any operations
> that can be performed on a stand alone hugetlbfs to also be performed on
> the overlay. However, mmap() still fails. I will look into it.
>
> I also looked at normal filesystem lower and hugetlbfs upper. Yes, overlayfs
> allows this. This is somewhat 'interesting' as write() is not supported in
> hugetlbfs. Writing to files in the overlay actually ended up writing to
> files in the lower filesystem. That seems wrong, but overlayfs is new to me.
>

I am not sure how that happened, but I think that ovl_open_realfile()
needs to fix up the f_mode flags FMODE_CAN_WRITE | FMODE_CAN_READ
after open_with_fake_path().
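
To illustrate one possible reading of that fixup, here is a small userspace
model, not kernel code. It assumes the intent is that the capability bits
should not claim more than the real file's operations can deliver. The
MODEL_* flag values, struct model_fops and model_fixup_f_mode() are all
hypothetical names made up for this sketch; the real FMODE_* values and the
real ovl_open_realfile() logic live in the kernel tree and may differ.

```c
/* Userspace model only; flag values are illustrative, not the kernel's. */
#define MODEL_FMODE_CAN_READ	0x1u
#define MODEL_FMODE_CAN_WRITE	0x2u

/* Hypothetical stand-in for "does the real file's f_op have read/write?" */
struct model_fops {
	int has_read;
	int has_write;
};

/*
 * Clear the capability bits that the real (underlying) file cannot back.
 * This models "fix up f_mode after opening the real file".
 */
static unsigned int model_fixup_f_mode(unsigned int f_mode,
				       const struct model_fops *real_ops)
{
	if (!real_ops->has_read)
		f_mode &= ~MODEL_FMODE_CAN_READ;
	if (!real_ops->has_write)
		f_mode &= ~MODEL_FMODE_CAN_WRITE;
	return f_mode;
}
```

With a real file that has no write operation (as described above for
hugetlbfs), the model drops the CAN_WRITE bit and keeps CAN_READ.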

> Earlier in the discussion of these issues, Colin Walters asked "Is there any
> actual valid use case for mounting an overlayfs on top of hugetlbfs?" I can
> not think of one. Perhaps we should consider limiting the ways in which
> hugetlbfs can be used in overlayfs? Preventing it from being an upper
> filesystem might be a good start? Or, do people think making hugetlbfs and
> overlayfs play nice together is useful?

If people think that making hugetlbfs and overlayfs play nice together is
useful, maybe they should work on this problem. It doesn't look like either
hugetlbfs developers or overlayfs developers care much about the combination.
Your concern, I assume, is fixing the syzbot issue.

I agree with Colin's remark about adding limitations, but it would be a shame
if overlayfs had to special-case hugetlbfs. It would be better if we could
find a property of hugetlbfs that makes it inapplicable as an overlayfs
upper/lower layer, or for stacking filesystems in general.

The simplest thing for you to do in order to shush syzbot is what procfs does:

	/*
	 * procfs isn't actually a stacking filesystem; however, there is
	 * too much magic going on inside it to permit stacking things on
	 * top of it
	 */
	s->s_stack_depth = FILESYSTEM_MAX_STACK_DEPTH;

Currently, the only in-tree stacking fs are overlayfs and ecryptfs, but there
are some out of tree implementations as well (shiftfs).
So you may only take that option if you do not care about the combination
of hugetlbfs with any of the above.
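
To show why pinning s_stack_depth at the maximum keeps stacking filesystems
off, here is a small userspace model, not kernel code. It assumes the
stacking fs does roughly what overlayfs does at mount time: take the deepest
underlying layer, add one, and refuse the mount if the result exceeds
FILESYSTEM_MAX_STACK_DEPTH. struct model_sb and model_stack_on() are
hypothetical names for this sketch only.

```c
/* Userspace model only; not kernel code. */
#include <stddef.h>

#define FILESYSTEM_MAX_STACK_DEPTH 2	/* value used in include/linux/fs.h */

/* Hypothetical stand-in for the one super_block field that matters here. */
struct model_sb {
	int s_stack_depth;
};

/*
 * Model of the mount-time depth check of a stacking fs.
 * Returns the depth of the new stacked fs, or -1 if the mount is refused.
 */
static int model_stack_on(const struct model_sb *layers, size_t n)
{
	int depth = 0;
	size_t i;

	/* The new fs sits one level above its deepest layer. */
	for (i = 0; i < n; i++)
		if (layers[i].s_stack_depth > depth)
			depth = layers[i].s_stack_depth;

	depth++;
	if (depth > FILESYSTEM_MAX_STACK_DEPTH)
		return -1;	/* "maximum fs stacking depth exceeded" */
	return depth;
}
```

In this model two ordinary layers (depth 0) stack to depth 1, while any layer
whose depth is already FILESYSTEM_MAX_STACK_DEPTH makes the mount fail, no
matter what the other layers are.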

overlayfs support of mmap is not as good as one might hope.
overlayfs.rst says:
"If a file residing on a lower layer is opened for read-only and then
memory mapped with MAP_SHARED, then subsequent changes to
the file are not reflected in the memory mapping."

So if I were you, I wouldn't go trying to fix overlayfs-hugetlbfs interop...

Thanks,
Amir.