Re: [PATCH v2] erofs: deprecate superblock checksum feature

From: Jingbo Xu
Date: Mon Jul 31 2023 - 22:47:57 EST


Hi, Thomas,

On 7/30/23 10:28 PM, Thomas Weißschuh wrote:
> Hi Gao!
>
> On 2023-07-30 22:01:11+0800, Gao Xiang wrote:
>> On 2023/7/30 21:31, Thomas Weißschuh wrote:
>>> On 2023-07-17 19:27:03+0800, Jingbo Xu wrote:
>>>> Later we're going to try the self-contained image verification.
>>>> The current superblock checksum feature has quite limited
>>>> functionality, instead, merkle trees can provide better protection
>>>> for image integrity.
>>>
>>> The crc32c checksum is also used by libblkid to gain more confidence
>>> in its filesystem detection.
>>> I guess a merkle tree would be much harder to implement.
>>>
>>> This is for example used by the mount(8) cli program to allow mounting
>>> of devices without explicitly needing to specify a filesystem.
>>>
>>> Note: libblkid tests for EROFS_FEATURE_SB_CSUM so at least it won't
>>> break when the checksum is removed.
>
>> I'm not sure if we could switch EROFS_FEATURE_SB_CSUM to a simpler
>> checksum instead (e.g. just sum each byte up if both
>> EROFS_FEATURE_SB_CSUM and COMPAT_XATTR_FILTER bits are set, or
>> ignore checksums completely at least in the kernel) if the better
>> filesystem detection by using sb chksum is needed (not sure if other
>> filesystems have sb chksum or just do magic comparison)?
>
> Overloading EROFS_FEATURE_SB_CSUM in combination with
> COMPAT_XATTR_FILTER would break all existing deployments of libblkid, so
> it's not an option.
>
> All other serious and halfway modern filesystems do have superblock
> checksums which are also checked by libblkid.
>
>> The main problem here is after xattr name filter feature is added
>> (xxhash is generally faster than crc32c), there could be two
>> hard-depended hashing algorithms, this increases more dependency
>> especially for embedded devices.
>
> From libblkid side nothing really speaks against a simpler checksum.
> XOR is easy to implement and xxhash is already part of libblkid for
> other filesystems.
>
> The drawbacks are:
> * It would need a completely new feature bit in erofs.
> * Old versions of libblkid could not validate checksums on newer
> filesystems.

Thanks for pointing this out. We indeed need further discussion to find
a better solution.

As mentioned previously, we don't want erofs to depend on two hashing
algorithms. The best idea I can come up with is to introduce a new
feature bit indicating that the sb checksum uses the XOR algorithm,
while leaving the original EROFS_FEATURE_SB_CSUM unset. Old versions of
libblkid would then have only the fs magic available for fs type
detection, which is not perfect but works in a best-effort way.
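
To make the idea concrete, here is a rough sketch of how a prober could
pick the check from the feature bits. It is only an illustration under
assumptions: the SKETCH_* bit values, the sketch_* helpers and the byte
folding order are all hypothetical and not part of the erofs on-disk
format or of libblkid. The caller is assumed to have zeroed the checksum
field in the buffer before passing it in.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the real erofs layout. */
#define SKETCH_FEATURE_SB_CSUM     0x1u /* stands in for EROFS_FEATURE_SB_CSUM */
#define SKETCH_FEATURE_SB_XOR_CSUM 0x2u /* hypothetical new XOR checksum bit */

enum sb_check {
	SB_MAGIC_ONLY,   /* no checksum bit known to us: rely on fs magic */
	SB_CSUM_OK,      /* XOR checksum present and matching */
	SB_CSUM_BAD,     /* XOR checksum present but mismatching */
	SB_NEEDS_CRC32C, /* legacy crc32c bit set: caller verifies with crc32c */
};

/* XOR-fold a buffer into 32 bits, byte i going into lane (i % 4). */
static uint32_t sketch_xor32(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum ^= (uint32_t)buf[i] << (8 * (i % 4));
	return sum;
}

/* Decide how to validate the superblock based on the feature bits. */
static enum sb_check sketch_check_sb(uint32_t feature_compat,
				     const uint8_t *sb, size_t len,
				     uint32_t stored)
{
	if (feature_compat & SKETCH_FEATURE_SB_XOR_CSUM)
		return sketch_xor32(sb, len) == stored ? SB_CSUM_OK : SB_CSUM_BAD;
	if (feature_compat & SKETCH_FEATURE_SB_CSUM)
		return SB_NEEDS_CRC32C;
	return SB_MAGIC_ONLY;
}
```

An old libblkid that doesn't know the new bit would take the last
branch, i.e. magic-only detection, which is the best-effort case
described above.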

--
Thanks,
Jingbo