Re: [PATCH v2 1/2] erofs: update on-disk format for xattr name filter

From: Alexander Larsson
Date: Wed Jul 05 2023 - 03:44:12 EST


On Wed, Jul 5, 2023 at 9:25 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2023/7/5 15:04, Jingbo Xu wrote:
> > The xattr name bloom filter feature is going to be introduced to speed
> > up the negative xattr lookup, e.g. system.posix_acl_[access|default]
> > lookup when running "ls -lR" workload.
> >
> > The number of common used extended attributes (n) is approximately 30.
>
> There are some commonly used extended attributes (n) and the total number
> of these is 31:
>
> >
> > trusted.overlay.opaque
> > trusted.overlay.redirect
> > trusted.overlay.origin
> > trusted.overlay.impure
> > trusted.overlay.nlink
> > trusted.overlay.upper
> > trusted.overlay.metacopy
> > trusted.overlay.protattr
> > user.overlay.opaque
> > user.overlay.redirect
> > user.overlay.origin
> > user.overlay.impure
> > user.overlay.nlink
> > user.overlay.upper
> > user.overlay.metacopy
> > user.overlay.protattr
> > security.evm
> > security.ima
> > security.selinux
> > security.SMACK64
> > security.SMACK64IPIN
> > security.SMACK64IPOUT
> > security.SMACK64EXEC
> > security.SMACK64TRANSMUTE
> > security.SMACK64MMAP
> > security.apparmor
> > security.capability
> > system.posix_acl_access
> > system.posix_acl_default
> > user.mime_type
> >
> > Given the number of bits of the bloom filter (m) is 32, the optimal
> > value for the number of the hash functions (k) is 1 (ln2 * m/n = 0.74).
> >
> > The single hash function is implemented as:
> >
> > xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)
> >
> > where index represents the index of corresponding predefined short name
>
> where `index`...
>
>
>
> > prefix, while name represents the name string after stripping the above
> > predefined name prefix.
> >
> > The constant magic number EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is
> > used to give a better spread when mapping these 30 extended attributes
> > into 32-bit bloom filter as:
> >
> > bit 0: security.ima
> > bit 1:
> > bit 2: trusted.overlay.nlink
> > bit 3:
> > bit 4: user.overlay.nlink
> > bit 5: trusted.overlay.upper
> > bit 6: user.overlay.origin
> > bit 7: trusted.overlay.protattr
> > bit 8: security.apparmor
> > bit 9: user.overlay.protattr
> > bit 10: user.overlay.opaque
> > bit 11: security.selinux
> > bit 12: security.SMACK64TRANSMUTE
> > bit 13: security.SMACK64
> > bit 14: security.SMACK64MMAP
> > bit 15: user.overlay.impure
> > bit 16: security.SMACK64IPIN
> > bit 17: trusted.overlay.redirect
> > bit 18: trusted.overlay.origin
> > bit 19: security.SMACK64IPOUT
> > bit 20: trusted.overlay.opaque
> > bit 21: system.posix_acl_default
> > bit 22:
> > bit 23: user.mime_type
> > bit 24: trusted.overlay.impure
> > bit 25: security.SMACK64EXEC
> > bit 26: user.overlay.redirect
> > bit 27: user.overlay.upper
> > bit 28: security.evm
> > bit 29: security.capability
> > bit 30: system.posix_acl_access
> > bit 31: trusted.overlay.metacopy, user.overlay.metacopy
> >
> > The h_name_filter field is introduced to the on-disk per-inode xattr
> > header to place the corresponding xattr name filter, where bit value 1
> > indicates non-existence for compatibility.
> >
> > This feature is indicated by EROFS_FEATURE_COMPAT_XATTR_FILTER
> > compatible feature bit.
> >
> > Suggested-by: Alexander Larsson <alexl@xxxxxxxxxx>
> > Signed-off-by: Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx>
> > ---
> > fs/erofs/erofs_fs.h | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> > index 2c7b16e340fe..b4b6235fd720 100644
> > --- a/fs/erofs/erofs_fs.h
> > +++ b/fs/erofs/erofs_fs.h
> > @@ -13,6 +13,7 @@
> >
> > #define EROFS_FEATURE_COMPAT_SB_CHKSUM 0x00000001
> > #define EROFS_FEATURE_COMPAT_MTIME 0x00000002
> > +#define EROFS_FEATURE_COMPAT_XATTR_FILTER 0x00000004
>
> I'd suggest that we leave one reserved byte in the superblock
> for now (and check that it's 0), since
> 1) the xattr filter feature is a compatible feature;
> 2) I'm not sure whether the implementation might still change.
>
> That way, later implementation changes won't need to touch the
> compat bits again.

I would very much like to generate these bloom filters in composefs
right now, before the composefs v1 format is completely locked down.
That should be entirely possible, given that this is a
backwards-compatible change. However, it only works if the filter
doesn't require a feature flag like this one, which would make old
erofs versions refuse to mount the image.
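
To make it concrete, here is a rough sketch of what generating the
filter on the image-writing side could look like, going only by the
description quoted above. It is not the actual erofs or composefs code.
The helper names (xattr_filter_bit, xattr_filter_add,
xattr_filter_may_contain), the XATTR_FILTER_* macros, taking the low 5
bits of the hash as the bit number, and the prefix index value used in
the example are all assumptions of mine; only the seed and the
xxh32(name, strlen(name), seed + index) call come from the patch
description. The xxh32 here is a plain userspace reimplementation so
that the snippet stands alone.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EROFS_XATTR_FILTER_SEED 0x25BBE08Fu /* from the patch description */
#define XATTR_FILTER_BITS       32u         /* hypothetical name */
#define XATTR_FILTER_DEFAULT    0xFFFFFFFFu /* hypothetical: all bits 1 = nothing present */

static uint32_t rotl32(uint32_t x, int r)
{
	return (x << r) | (x >> (32 - r));
}

static uint32_t read32le(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Standalone XXH32, written out here instead of linking against libxxhash. */
static uint32_t xxh32(const void *data, size_t len, uint32_t seed)
{
	const uint32_t P1 = 0x9E3779B1u, P2 = 0x85EBCA77u, P3 = 0xC2B2AE3Du,
		       P4 = 0x27D4EB2Fu, P5 = 0x165667B1u;
	const uint8_t *p = data, *end = p + len;
	uint32_t h;

	if (len >= 16) {
		uint32_t v[4] = { seed + P1 + P2, seed + P2, seed, seed - P1 };

		do {	/* consume one 16-byte stripe, 4 bytes per lane */
			for (int i = 0; i < 4; i++, p += 4)
				v[i] = rotl32(v[i] + read32le(p) * P2, 13) * P1;
		} while ((size_t)(end - p) >= 16);
		h = rotl32(v[0], 1) + rotl32(v[1], 7) +
		    rotl32(v[2], 12) + rotl32(v[3], 18);
	} else {
		h = seed + P5;
	}
	h += (uint32_t)len;
	for (; (size_t)(end - p) >= 4; p += 4)
		h = rotl32(h + read32le(p) * P3, 17) * P4;
	for (; p < end; p++)
		h = rotl32(h + (uint32_t)*p * P5, 11) * P1;
	/* final avalanche */
	h ^= h >> 15;
	h *= P2;
	h ^= h >> 13;
	h *= P3;
	h ^= h >> 16;
	return h;
}

/*
 * Mask with the single filter bit for one xattr. prefix_index is the
 * predefined short name prefix index, suffix is the name with that
 * prefix stripped. Using the low 5 bits of the hash is an assumption.
 */
static uint32_t xattr_filter_bit(uint32_t prefix_index, const char *suffix)
{
	uint32_t h = xxh32(suffix, strlen(suffix),
			   EROFS_XATTR_FILTER_SEED + prefix_index);

	return 1u << (h & (XATTR_FILTER_BITS - 1));
}

/* Bit value 1 means "not present", so adding an xattr clears its bit. */
static uint32_t xattr_filter_add(uint32_t filter, uint32_t prefix_index,
				 const char *suffix)
{
	return filter & ~xattr_filter_bit(prefix_index, suffix);
}

/* Returns 0 if the xattr is definitely absent, 1 if it may be present. */
static int xattr_filter_may_contain(uint32_t filter, uint32_t prefix_index,
				    const char *suffix)
{
	return !(filter & xattr_filter_bit(prefix_index, suffix));
}
```

An image writer would start each inode at XATTR_FILTER_DEFAULT and call
xattr_filter_add() once per xattr, e.g. xattr_filter_add(filter, 6,
"selinux") if 6 is the index of the "security." prefix (again, an
assumption on my part). On the read side, a lookup whose bit is still 1
can return early without walking the xattrs at all.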


--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson Red Hat, Inc
alexl@xxxxxxxxxx alexander.larsson@xxxxxxxxx