Re: [RFC 0/2] erofs: introduce bloom filter for xattr

From: Jingbo Xu
Date: Mon Jul 03 2023 - 05:07:42 EST




On 7/3/23 3:25 PM, Alexander Larsson wrote:
> On Wed, Jun 28, 2023 at 5:38 AM Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>>
>> Hi all,
>>
>> Sorry for the late reply as I was on vacation these days.
>>
>> I test the hash bit for all xattrs given by Alex[1], to see if each
>> xattr could be mapped into one unique bit in the 32-bit bloom filter.
>>
>> [1]
>> https://lore.kernel.org/all/CAL7ro1HhYUDrOX7A-13p7rLBZSWHTQWGOdOzVcYkddkU_LArUw@xxxxxxxxxxxxxx/
>>
>>
>> On 6/21/23 4:32 PM, Jingbo Xu wrote:
>>>
>>> 3.2. input of hash function
>>> -------------------------
>>> As previously described, each hash function will map the given data into
>>> one bit of the bloom filter map. In our use case, xattr name serves as
>>> the key of hash function.
>>>
>>> When .getxattr() gets called, only index (e.g. EROFS_XATTR_INDEX_USER)
>>> and the remaining name apart from the prefix are handy. To avoid
>>> constructing the full xattr name, the above index and name are fed into
>>> the hash function directly in the following way:
>>>
>>> ```
>>> bit = xxh32(name, strlen(name), index + i);
>>> ```
>>>
>>> where index serves as part of seed, so that it gets involved in the
>>> calculation for the hash.
>>
>>
>> All xattrs are hashed with one single hash function.
>>
>> I first tested with the following hash function:
>>
>> ```
>> xxh32(name, strlen(name), index)
>> ```
>>
>> where `index` represents the index of corresponding predefined name
>> prefix (e.g. EROFS_XATTR_INDEX_USER), while `name` represents the name
>> after stripping the above predefined name prefix (e.g.
>> "overlay.metacopy" for "user.overlay.metacopy")
>>
>>
>> The mapping results are:
>>
>> bit 0: security.SMACK64EXEC
>> bit 1:
>> bit 2: user.overlay.protattr
>> bit 3: trusted.overlay.impure, user.overlay.opaque, user.mime_type
>> bit 4:
>> bit 5: user.overlay.origin
>> bit 6: user.overlay.metacopy, security.evm
>> bit 8: trusted.overlay.opaque
>> bit 9: trusted.overlay.origin
>> bit 10: trusted.overlay.upper, trusted.overlay.protattr
>> bit 11: security.apparmor, security.capability
>> bit 12: security.SMACK64
>> bit 13: user.overlay.redirect, security.ima
>> bit 14: user.overlay.upper
>> bit 15: trusted.overlay.redirect
>> bit 16: security.SMACK64IPOUT
>> bit 17:
>> bit 18: system.posix_acl_access
>> bit 19: security.selinux
>> bit 20:
>> bit 21:
>> bit 22: system.posix_acl_default
>> bit 23: security.SMACK64MMAP
>> bit 24: user.overlay.impure, user.overlay.nlink, security.SMACK64TRANSMUTE
>> bit 25: trusted.overlay.metacopy
>> bit 26:
>> bit 27: security.SMACK64IPIN
>> bit 28:
>> bit 29:
>> bit 30: trusted.overlay.nlink
>> bit 31:
>>
>> Here 30 xattrs are mapped into 22 bits. There are two potential
>> conflicts, i.e. bit 10 (trusted.overlay.upper, trusted.overlay.protattr)
>> and bit 24 (user.overlay.impure, user.overlay.nlink).
>
> Bit 11 (apparmor and capabilities) seems like the most likely thing to
> run into. I.e. on an apparmor-using system, many files would have
> apparmor xattr set, so looking up security.capabilities on it would
> cause a false negative and we'd unnecessarily read the xattrs.
>
>>> An alternative way is to calculate the hash from the full xattr name by
>>> feeding the prefix string and the remaining name string separately in
>>> the following way:
>>>
>>> ```
>>> xxh32_reset()
>>> xxh32_update(prefix string, ...)
>>> xxh32_update(remaining name, ...)
>>> xxh32_digest()
>>> ```
>>>
>>> But I doubt if it really deserves to call multiple APIs instead of one
>>> single xxh32().
>>
>>
>> I also tested with the following hash function, where the full name of
>> the xattr, e.g. "user.overlay.metacopy", is fed into the hash function.
>>
>> ```
>> xxh32(name, strlen(name), 0)
>> ```
>>
>>
>> Following are the mapping results:
>>
>> bit 0: trusted.overlay.impure, user.overlay.protattr
>> bit 1: security.SMACK64IPOUT
>> bit 2:
>> bit 3: security.capability
>> bit 4: security.selinux
>> bit 5: security.ima
>> bit 6: user.overlay.metacopy
>> bit 8:
>> bit 9: trusted.overlay.redirect, security.SMACK64EXEC
>> bit 10: system.posix_acl_access
>> bit 11: trusted.overlay.nlink
>> bit 12: trusted.overlay.opaque
>> bit 13:
>> bit 14:
>> bit 15:
>> bit 16:
>> bit 17: user.overlay.impure
>> bit 18: security.apparmor
>> bit 19:
>> bit 20: user.overlay.origin, user.overlay.nlink, security.SMACK64TRANSMUTE
>> bit 21:
>> bit 22: trusted.overlay.metacopy, trusted.overlay.protattr
>> bit 23: user.overlay.upper, security.evm
>> bit 24: user.overlay.redirect, security.SMACK64IPIN,
>> system.posix_acl_default
>> bit 25: security.SMACK64
>> bit 26:
>> bit 27: trusted.overlay.upper, security.SMACK64MMAP
>> bit 28: trusted.overlay.origin, user.mime_type
>> bit 29:
>> bit 30:
>> bit 31: user.overlay.opaque
>>
>> 30 xattrs are mapped into 20 bits. Similarly there are two potential
>> conflicts, i.e. bit 20 (user.overlay.origin, user.overlay.nlink) and bit
>> 22 (trusted.overlay.metacopy, trusted.overlay.protattr).
>>
>>
>> Summary
>> =======
>>
>> Personally I would prefer the former, as it maps xattrs into the bloom
>> filter more evenly (22 bits vs 20 bits) and can better cooperate with
>> the kernel routine (index and the remaining name string, rather than the
>> full name string, are handy).
>
> I agree that we want the approach with better cooperation with the
> kernel function. However, I would much prefer if all the xattrs that
> are commonly set on many files are unconflicted. This would be at
> least: selinux, ima, evm, apparmor.
>
> Can't you just add a magic constant to the seed? Then we can come up
> with one that gives a good spread and hardcode that.

Brilliant idea! I would try to see if it works.

--
Thanks,
Jingbo