Re: [RFC 0/2] erofs: introduce bloom filter for xattr

From: Alexander Larsson
Date: Mon Jul 03 2023 - 03:26:48 EST


On Wed, Jun 28, 2023 at 5:38 AM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi all,
>
> Sorry for the late reply as I was on vacation these days.
>
> I test the hash bit for all xattrs given by Alex[1], to see if each
> xattr could be mapped into one unique bit in the 32-bit bloom filter.
>
> [1]
> https://lore.kernel.org/all/CAL7ro1HhYUDrOX7A-13p7rLBZSWHTQWGOdOzVcYkddkU_LArUw@xxxxxxxxxxxxxx/
>
>
> On 6/21/23 4:32 PM, Jingbo Xu wrote:
> >
> > 3.2. input of hash function
> > -------------------------
> > As previously described, each hash function will map the given data into
> > one bit of the bloom filter map. In our use case, xattr name serves as
> > the key of hash function.
> >
> > When .getxattr() gets called, only index (e.g. EROFS_XATTR_INDEX_USER)
> > and the remaining name apart from the prefix are handy. To avoid
> > constructing the full xattr name, the above index and name are fed into
> > the hash function directly in the following way:
> >
> > ```
> > bit = xxh32(name, strlen(name), index + i);
> > ```
> >
> > where index serves as part of seed, so that it gets involved in the
> > calculation for the hash.
>
>
> All xattrs are hashed with one single hash function.
>
> I first tested with the following hash function:
>
> ```
> xxh32(name, strlen(name), index)
> ```
>
> where `index` represents the index of corresponding predefined name
> prefix (e.g. EROFS_XATTR_INDEX_USER), while `name` represents the name
> after stripping the above predefined name prefix (e.g.
> "overlay.metacopy" for "user.overlay.metacopy")
>
>
> The mapping results are:
>
> bit 0: security.SMACK64EXEC
> bit 1:
> bit 2: user.overlay.protattr
> bit 3: trusted.overlay.impure, user.overlay.opaque, user.mime_type
> bit 4:
> bit 5: user.overlay.origin
> bit 6: user.overlay.metacopy, security.evm
> bit 8: trusted.overlay.opaque
> bit 9: trusted.overlay.origin
> bit 10: trusted.overlay.upper, trusted.overlay.protattr
> bit 11: security.apparmor, security.capability
> bit 12: security.SMACK64
> bit 13: user.overlay.redirect, security.ima
> bit 14: user.overlay.upper
> bit 15: trusted.overlay.redirect
> bit 16: security.SMACK64IPOUT
> bit 17:
> bit 18: system.posix_acl_access
> bit 19: security.selinux
> bit 20:
> bit 21:
> bit 22: system.posix_acl_default
> bit 23: security.SMACK64MMAP
> bit 24: user.overlay.impure, user.overlay.nlink, security.SMACK64TRANSMUTE
> bit 25: trusted.overlay.metacopy
> bit 26:
> bit 27: security.SMACK64IPIN
> bit 28:
> bit 29:
> bit 30: trusted.overlay.nlink
> bit 31:
>
> Here 30 xattrs are mapped into 22 bits. There are two potential
> conflicts, i.e. bit 10 (trusted.overlay.upper, trusted.overlay.protattr)
> and bit 24 (user.overlay.impure, user.overlay.nlink).

Bit 11 (apparmor and capabilities) seems like the most likely thing to
run into. I.e. on an apparmor-using system, many files would have
apparmor xattr set, so looking up security.capabilities on it would
cause a false negative and we'd unnecessarily read the xattrs.

> > An alternative way is to calculate the hash from the full xattr name by
> > feeding the prefix string and the remaining name string separately in
> > the following way:
> >
> > ```
> > xxh32_reset()
> > xxh32_update(prefix string, ...)
> > xxh32_update(remaining name, ...)
> > xxh32_digest()
> > ```
> >
> > But I doubt if it really deserves to call multiple APIs instead of one
> > single xxh32().
>
>
> I also tested with the following hash function, where the full name of
> the xattr, e.g. "user.overlay.metacopy", is fed into the hash function.
>
> ```
> xxh32(name, strlen(name), 0)
> ```
>
>
> Following are the mapping results:
>
> bit 0: trusted.overlay.impure, user.overlay.protattr
> bit 1: security.SMACK64IPOUT
> bit 2:
> bit 3: security.capability
> bit 4: security.selinux
> bit 5: security.ima
> bit 6: user.overlay.metacopy
> bit 8:
> bit 9: trusted.overlay.redirect, security.SMACK64EXEC
> bit 10: system.posix_acl_access
> bit 11: trusted.overlay.nlink
> bit 12: trusted.overlay.opaque
> bit 13:
> bit 14:
> bit 15:
> bit 16:
> bit 17: user.overlay.impure
> bit 18: security.apparmor
> bit 19:
> bit 20: user.overlay.origin, user.overlay.nlink, security.SMACK64TRANSMUTE
> bit 21:
> bit 22: trusted.overlay.metacopy, trusted.overlay.protattr
> bit 23: user.overlay.upper, security.evm
> bit 24: user.overlay.redirect, security.SMACK64IPIN,
> system.posix_acl_default
> bit 25: security.SMACK64
> bit 26:
> bit 27: trusted.overlay.upper, security.SMACK64MMAP
> bit 28: trusted.overlay.origin, user.mime_type
> bit 29:
> bit 30:
> bit 31: user.overlay.opaque
>
> 30 xattrs are mapped into 20 bits. Similarly there are two potential
> conflicts, i.e. bit 20 (user.overlay.origin, user.overlay.nlink) and bit
> 22 (trusted.overlay.metacopy, trusted.overlay.protattr).
>
>
> Summary
> =======
>
> Personally I would prefer the former, as it maps xattrs into the bloom
> filter more evenly (22 bits vs 20 bits) and can better cooperate with
> the kernel routine (index and the remaining name string, rather than the
> full name string, are handy).

I agree that we want the approach with better cooperation with the
kernel function. However, I would much prefer if all the xattrs that
are commonly set on many files are unconflicted. This would be at
least: selinux, ima, evm, apparmor.

Can't you just add a magic constant to the seed? Then we can come up
with one that gives a good spread and hardcode that.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson Red Hat, Inc
alexl@xxxxxxxxxx alexander.larsson@xxxxxxxxx