[inline_data] ext4: Stale flags before sync when converting to non-inline

From: Daniel Dawson
Date: Wed Nov 29 2023 - 01:15:22 EST


When a file is converted from inline to non-inline, it has stale flags until sync.

If a file with inline data is written to such that it must become non-inline, it temporarily appears to have the inline data flag and not (if applicable) the extent flag. This is corrected on sync, but can cause problems in certain situations.

Details:

All that is needed to show this behavior is the following command:

$ rm -f test-file; dd if=/dev/zero of=test-file bs=64 count=3 status=none; lsattr test-file

Assuming extents are in use, this should show

--------------e------- test-file

but instead shows

------------------N--- test-file

until test-file is synced. Despite this, the file is already non-inline and is treated as such for most purposes.
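
To check the two flags without parsing lsattr output, something like the following Python sketch could be used. It is only an illustration, not part of the original reproduction. It reads the flag word with the FS_IOC_GETFLAGS ioctl (the interface lsattr uses) and reports the extent and inline-data bits. The helper names describe_flags and read_flags are made up here, the ioctl number is computed assuming the generic Linux ioctl encoding (as on x86_64), and the bit values are FS_EXTENT_FL and FS_INLINE_DATA_FL from linux/fs.h.

```python
import fcntl
import os
import struct

# Bit values from <linux/fs.h>; lsattr prints them as 'e' and 'N'.
FS_EXTENT_FL = 0x00080000
FS_INLINE_DATA_FL = 0x10000000

# _IOR('f', 1, long). ASSUMPTION: generic Linux ioctl encoding (e.g. x86_64);
# a few architectures encode the direction bits differently.
FS_IOC_GETFLAGS = (2 << 30) | (struct.calcsize("l") << 16) | (ord("f") << 8) | 1


def describe_flags(flags):
    """Return the names of the two flags of interest that are set in `flags`."""
    names = []
    if flags & FS_EXTENT_FL:
        names.append("extents")
    if flags & FS_INLINE_DATA_FL:
        names.append("inline_data")
    return names


def read_flags(path):
    """Fetch the inode flag word for `path` via FS_IOC_GETFLAGS."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # With an immutable bytes argument, ioctl() returns the filled buffer.
        buf = fcntl.ioctl(fd, FS_IOC_GETFLAGS, struct.pack("l", 0))
    finally:
        os.close(fd)
    # The kernel stores a 32-bit value at the start of the buffer.
    return struct.unpack_from("I", buf)[0]
```

Calling describe_flags(read_flags("test-file")) right after the dd command, and again after a sync, should show the change described above if the filesystem behaves as reported. On filesystems that do not implement this ioctl, read_flags raises OSError.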

Why is this a problem? Because some code fails under this condition. For example, lseek(..., SEEK_HOLE) fails with ENOENT. I ran into this with Gentoo's Portage, which uses that call to handle sparse files when copying. Sometimes an ebuild creates a temporary file that is quickly copied, and apparently the temporary file is written in small increments, which triggers this.

Here is a small program that reproduces the SEEK_HOLE problem (pass it the pathname of a file to create):
https://gist.github.com/ddawson/22cfd4cac32916f6f1dcc86f90eed21a
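
For readers who cannot reach the gist, here is a rough Python sketch of the same idea. It is not the program from the gist, only an illustration under stated assumptions: it mirrors the dd parameters above (three 64-byte writes), then calls lseek() with SEEK_HOLE before any sync. The function name and its sync_first parameter are hypothetical. Whether fsync() on the file counts as the "sync" that corrects the flags is also an assumption, so that option is only there for comparison.

```python
import errno
import os
import sys


def seek_hole_after_small_writes(path, chunk=64, count=3, sync_first=False):
    """Write `count` chunks of `chunk` zero bytes, then try lseek(SEEK_HOLE).

    Returns ("ok", offset) on success or ("error", errno_name) on failure.
    """
    # Start from a fresh file, like the rm in the shell command.
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass

    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        for _ in range(count):
            os.write(fd, b"\0" * chunk)
        if sync_first:
            # ASSUMPTION: fsync() may or may not be enough to refresh the flags.
            os.fsync(fd)
    finally:
        os.close(fd)

    # Reopen read-only and look for the first hole at or after offset 0.
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            return ("ok", os.lseek(fd, 0, os.SEEK_HOLE))
        except OSError as exc:
            return ("error", errno.errorcode.get(exc.errno, str(exc.errno)))
    finally:
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) == 2:
    print(seek_hole_after_small_writes(sys.argv[1]))
```

Going by the report above, an affected ext4 filesystem with inline_data would be expected to return ("error", "ENOENT") here, while an unaffected one should return an "ok" result.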

Tested with kernel: 6.7.0-rc3 (also 6.6 series)
/proc/version: Linux version 6.7.0-rc3 (ddawson@ddawson.local) (gcc (Gentoo 13.2.1_p20231014 p8) 13.2.1 20231014, GNU ld (Gentoo 2.41 p2) 2.41.0) #4 SMP PREEMPT_DYNAMIC Tue Nov 28 20:09:05 PST 2023
Operating System: Gentoo Linux
uname -mi: x86_64 GenuineIntel
.config: https://gist.github.com/ddawson/2f2e60c6e44a62047d7b7d99c7ce5632
dmesg output: https://gist.github.com/ddawson/026ea63f099ee3e0c301f522dff00764

--
PGP fingerprint: 5BBD5080FEB0EF7F142F8173D572B791F7B4422A