Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem

From: Alexander Larsson
Date: Wed Feb 01 2023 - 04:47:47 EST


On Wed, 2023-02-01 at 12:28 +0800, Jingbo Xu wrote:
> Hi all,
>
> There are some updated performance statistics with different
> combinations on my test environment if you are interested.
>
>
> On 1/27/23 6:24 PM, Gao Xiang wrote:
> > ...
> >
> > I've made a version and did some test, it can be fetched from:
> > git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git -b experimental
> >
>
> Setup
> ======
> CPU: x86_64 Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz
> Disk: 6800 IOPS upper limit
> OS: Linux v6.2 (with composefs v3 patchset)

For the record, what was the filesystem backing the basedir files?

> I build erofs/squashfs images following the scripts attached on [1],
> with each file in the rootfs tagged with "metacopy" and "redirect"
> xattr.
>
> The source rootfs is from the docker image of tensorflow [2].
>
> The erofs images are built with mkfs.erofs with support for sparse
> file added [3].
>
> [1]
> https://lore.kernel.org/linux-fsdevel/5fb32a1297821040edd8c19ce796fc0540101653.camel@xxxxxxxxxx/
> [2]
> https://hub.docker.com/layers/tensorflow/tensorflow/2.10.0/images/sha256-7f9f23ce2473eb52d17fe1b465c79c3a3604047343e23acc036296f512071bc9?context=explore
> [3]
> https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/commit/?h=experimental&id=7c49e8b195ad90f6ca9dfccce9f6e3e39a8676f6
>
>
>
> Image size
> ===========
> 6.4M large.composefs
> 5.7M large.composefs.w/o.digest (w/o --compute-digest)
> 6.2M large.erofs
> 5.2M large.erofs.T0 (with -T0, i.e. w/o nanosecond timestamp)
> 1.7M large.squashfs
> 5.8M large.squashfs.uncompressed (with -noI -noD -noF -noX)
>
> (large.erofs.T0 is built without nanosecond timestamp, so that we get
> smaller disk inode size (same with squashfs).)
>
>
> Runtime Perf
> =============
>
> The "uncached" column is tested with:
> hyperfine -p "echo 3 > /proc/sys/vm/drop_caches" "ls -lR $MNTPOINT"
>
>
> While the "cached" column is tested with:
> hyperfine -w 1 "ls -lR $MNTPOINT"
>
>
> erofs and squashfs are mounted with loopback device.
>
>
>                                   | uncached(ms)| cached(ms)
> ----------------------------------|-------------|-----------
> composefs (with digest)           | 326         | 135
> erofs (w/o -T0)                   | 264         | 172
> erofs (w/o -T0) + overlayfs       | 651         | 238
> squashfs (compressed)             | 538         | 211
> squashfs (compressed) + overlayfs | 968         | 302


Clearly erofs with sparse files is the best fs now for the ro-fs +
overlay case. But still, we can see that the additional cost of the
overlayfs layer is not negligible. 
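
To put a number on that cost, here is a small sketch that subtracts the
plain-image timings from the "+ overlayfs" timings in the table above. The
dictionary layout and the overlay_overhead() helper are only illustrative
names, not part of any tool used in the benchmark.

```python
# Timings in ms copied from the quoted table: (uncached, cached).
timings = {
    "erofs": (264, 172),
    "erofs + overlayfs": (651, 238),
    "squashfs": (538, 211),
    "squashfs + overlayfs": (968, 302),
}

def overlay_overhead(base):
    """Return (uncached, cached) extra ms added by stacking overlayfs on base."""
    plain = timings[base]
    stacked = timings[base + " + overlayfs"]
    return tuple(s - p for s, p in zip(stacked, plain))

for fs in ("erofs", "squashfs"):
    uncached, cached = overlay_overhead(fs)
    print(f"{fs}: +{uncached} ms uncached, +{cached} ms cached")
```

These are single-run averages from one machine, so the deltas should be read
as rough indications rather than precise costs.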

According to Amir, this could be helped by a special composefs-like mode
in overlayfs, but it's unclear what performance that would reach, and
we're then talking about net-new development that further complicates
the overlayfs codebase. It's not clear to me which alternative is easier
to develop/maintain.

Also, the difference between cached and uncached here is less than in
my tests. Probably because my test image was larger. With the test
image I use, the results are:

                                  | uncached(ms)| cached(ms)
----------------------------------|-------------|-----------
composefs (with digest)           | 681         | 390
erofs (w/o -T0) + overlayfs       | 1788        | 532
squashfs (compressed) + overlayfs | 2547        | 443
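
For a side-by-side look at the cached vs. uncached difference in the two
data sets, a quick sketch like the following can help. It only uses the
numbers from the two tables in this mail; the names jingbo, alex and
cache_gap() are made up for the illustration.

```python
# (uncached_ms, cached_ms) copied from the two tables in this mail.
jingbo = {
    "composefs": (326, 135),
    "erofs + overlayfs": (651, 238),
    "squashfs + overlayfs": (968, 302),
}
alex = {
    "composefs": (681, 390),
    "erofs + overlayfs": (1788, 532),
    "squashfs + overlayfs": (2547, 443),
}

def cache_gap(row):
    """Return (absolute gap in ms, uncached/cached ratio) for one table row."""
    uncached, cached = row
    return uncached - cached, uncached / cached

for name in jingbo:
    for label, table in (("Jingbo", jingbo), ("Alex", alex)):
        gap, ratio = cache_gap(table[name])
        print(f"{name:22} {label:7} gap={gap:5d} ms  ratio={ratio:.2f}")
```

Printing both the absolute gap and the ratio is deliberate, since the two
can point in different directions when the images differ in size.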


I gotta say it is weird though that squashfs performed better than
erofs in the cached case. May be worth looking into. The test data I'm
using is available here:

https://my.owndrive.com/index.php/s/irHJXRpZHtT3a5i


--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson                                            Red Hat, Inc
alexl@xxxxxxxxxx                               alexander.larsson@xxxxxxxxx
He's a lonely flyboy grifter living undercover at Ringling Bros. Circus.
She's a virginal thirtysomething former first lady looking for love in
all the wrong places. They fight crime!