Re: [PATCH v2 1/2] erofs: update on-disk format for xattr name filter

From: Gao Xiang
Date: Wed Jul 05 2023 - 04:14:10 EST




On 2023/7/5 16:12, Alexander Larsson wrote:
On Wed, Jul 5, 2023 at 9:51 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:



On 2023/7/5 15:43, Alexander Larsson wrote:
On Wed, Jul 5, 2023 at 9:25 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:



On 2023/7/5 15:04, Jingbo Xu wrote:
The xattr name bloom filter feature is going to be introduced to speed
up the negative xattr lookup, e.g. system.posix_acl_[access|default]
lookup when running "ls -lR" workload.

The number of common used extended attributes (n) is approximately 30.

There are some commonly used extended attributes (n) and the total number
of these is 31:


trusted.overlay.opaque
trusted.overlay.redirect
trusted.overlay.origin
trusted.overlay.impure
trusted.overlay.nlink
trusted.overlay.upper
trusted.overlay.metacopy
trusted.overlay.protattr
user.overlay.opaque
user.overlay.redirect
user.overlay.origin
user.overlay.impure
user.overlay.nlink
user.overlay.upper
user.overlay.metacopy
user.overlay.protattr
security.evm
security.ima
security.selinux
security.SMACK64
security.SMACK64IPIN
security.SMACK64IPOUT
security.SMACK64EXEC
security.SMACK64TRANSMUTE
security.SMACK64MMAP
security.apparmor
security.capability
system.posix_acl_access
system.posix_acl_default
user.mime_type

Given the number of bits of the bloom filter (m) is 32, the optimal
value for the number of the hash functions (k) is 1 (ln2 * m/n = 0.74).

The single hash function is implemented as:

xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)

where index represents the index of corresponding predefined short name

where `index`...



prefix, while name represents the name string after stripping the above
predefined name prefix.

The constant magic number EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is
used to give a better spread when mapping these 30 extended attributes
into 32-bit bloom filter as:

bit 0: security.ima
bit 1:
bit 2: trusted.overlay.nlink
bit 3:
bit 4: user.overlay.nlink
bit 5: trusted.overlay.upper
bit 6: user.overlay.origin
bit 7: trusted.overlay.protattr
bit 8: security.apparmor
bit 9: user.overlay.protattr
bit 10: user.overlay.opaque
bit 11: security.selinux
bit 12: security.SMACK64TRANSMUTE
bit 13: security.SMACK64
bit 14: security.SMACK64MMAP
bit 15: user.overlay.impure
bit 16: security.SMACK64IPIN
bit 17: trusted.overlay.redirect
bit 18: trusted.overlay.origin
bit 19: security.SMACK64IPOUT
bit 20: trusted.overlay.opaque
bit 21: system.posix_acl_default
bit 22:
bit 23: user.mime_type
bit 24: trusted.overlay.impure
bit 25: security.SMACK64EXEC
bit 26: user.overlay.redirect
bit 27: user.overlay.upper
bit 28: security.evm
bit 29: security.capability
bit 30: system.posix_acl_access
bit 31: trusted.overlay.metacopy, user.overlay.metacopy

The h_name_filter field is introduced to the on-disk per-inode xattr
header to place the corresponding xattr name filter, where bit value 1
indicates non-existence for compatibility.

This feature is indicated by EROFS_FEATURE_COMPAT_XATTR_FILTER
compatible feature bit.

Suggested-by: Alexander Larsson <alexl@xxxxxxxxxx>
Signed-off-by: Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx>
---
fs/erofs/erofs_fs.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 2c7b16e340fe..b4b6235fd720 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -13,6 +13,7 @@

#define EROFS_FEATURE_COMPAT_SB_CHKSUM 0x00000001
#define EROFS_FEATURE_COMPAT_MTIME 0x00000002
+#define EROFS_FEATURE_COMPAT_XATTR_FILTER 0x00000004

I'd suggest that if we could leave one reserved byte in the
superblock for now (and checking if it's 0) since
1) xattr filter feature is a compatible feature;
2) I'm not sure if the implementation could be changed.

so that later implementation changes won't bother compat bits
again.

I would very much like to generate these bloom filters in composefs
right now, before the composefs v1 format is completely locked down,
and this should be fully possible given that this is a backwards
compat change. But this is only possible if it doesn't require a
feature flag like this that makes old erofs versions not mount the
image.

EROFS has two types of feature bits:

1) compat flags, which doesn't block mounting on old kernels;
2) incompat flags, which will block mounting on old kernels.

here bloom filter use a new compat flag, so old kernels will just
ignore this and mount. compat flags just indicates that "an image
with a feature, and you could use it or not".

Here I just meant the bloom filter internals are fixed for now,
so that we might reserve a byte in the on-disk super block for
later potential changes (if any). And don't need to bother another
new compat flag.

Cool. Then we're all good!

:)