Re: [PATCH v10 00/12] ceph: support idmapped mounts

From: Aleksandr Mikhalitsyn
Date: Mon Oct 23 2023 - 03:58:02 EST


On Thu, Oct 19, 2023 at 7:42 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
>
>
> On 10/17/23 17:20, Aleksandr Mikhalitsyn wrote:
> > On Tue, Aug 8, 2023 at 2:45 AM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
> >> LGTM.
> >>
> >> Reviewed-by: Xiubo Li <xiubli@xxxxxxxxxx>
> >>
> >> I will queue this to the 'testing' branch and then we will run ceph qa
> >> tests.
> >>
> >> Thanks Alex.
> >>
> >> - Xiubo
> > Hi Xiubo,
> >
> > will this series be landed to 6.6?
> >
> > Userspace part was backported and merged to the Ceph Quincy release
> > (https://github.com/ceph/ceph/pull/53139)
> > And waiting to be tested and merged to the Ceph reef and pacific releases.
> > But the kernel part is still in the testing branch.
>
> This changes have been in the 'testing' branch for more than two mounts
> and well test, till now we haven't seen any issue.
>
> IMO it should be ready.

Thanks, Xiubo!
It would be awesome to have this in v.6.6.

Kind regards,
Alex

>
> Ilya ?
>
> Thanks
>
> - Xiubo
>
>
> > Kind regards,
> > Alex
> >
> >> On 8/7/23 21:26, Alexander Mikhalitsyn wrote:
> >>> Dear friends,
> >>>
> >>> This patchset was originally developed by Christian Brauner but I'll continue
> >>> to push it forward. Christian allowed me to do that :)
> >>>
> >>> This feature is already actively used/tested with LXD/LXC project.
> >>>
> >>> Git tree (based on https://github.com/ceph/ceph-client.git testing):
> >>> v10: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v10
> >>> current: https://github.com/mihalicyn/linux/tree/fs.idmapped.ceph
> >>>
> >>> In the version 3 I've changed only two commits:
> >>> - fs: export mnt_idmap_get/mnt_idmap_put
> >>> - ceph: allow idmapped setattr inode op
> >>> and added a new one:
> >>> - ceph: pass idmap to __ceph_setattr
> >>>
> >>> In the version 4 I've reworked the ("ceph: stash idmapping in mdsc request")
> >>> commit. Now we take idmap refcounter just in place where req->r_mnt_idmap
> >>> is filled. It's more safer approach and prevents possible refcounter underflow
> >>> on error paths where __register_request wasn't called but ceph_mdsc_release_request is
> >>> called.
> >>>
> >>> Changelog for version 5:
> >>> - a few commits were squashed into one (as suggested by Xiubo Li)
> >>> - started passing an idmapping everywhere (if possible), so a caller
> >>> UID/GID-s will be mapped almost everywhere (as suggested by Xiubo Li)
> >>>
> >>> Changelog for version 6:
> >>> - rebased on top of testing branch
> >>> - passed an idmapping in a few places (readdir, ceph_netfs_issue_op_inline)
> >>>
> >>> Changelog for version 7:
> >>> - rebased on top of testing branch
> >>> - this thing now requires a new cephfs protocol extension CEPHFS_FEATURE_HAS_OWNER_UIDGID
> >>> https://github.com/ceph/ceph/pull/52575
> >>>
> >>> Changelog for version 8:
> >>> - rebased on top of testing branch
> >>> - added enable_unsafe_idmap module parameter to make idmapped mounts
> >>> work with old MDS server versions
> >>> - properly handled case when old MDS used with new kernel client
> >>>
> >>> Changelog for version 9:
> >>> - added "struct_len" field in struct ceph_mds_request_head as requested by Xiubo Li
> >>>
> >>> Changelog for version 10:
> >>> - fill struct_len field properly (use cpu_to_le32)
> >>> - add extra checks IS_CEPH_MDS_OP_NEWINODE(..) as requested by Xiubo to match
> >>> userspace client behavior
> >>> - do not set req->r_mnt_idmap for MKSNAP operation
> >>> - atomic_open: set req->r_mnt_idmap only for CEPH_MDS_OP_CREATE as userspace client does
> >>>
> >>> I can confirm that this version passes xfstests and
> >>> tested with old MDS (without CEPHFS_FEATURE_HAS_OWNER_UIDGID)
> >>> and with recent MDS version.
> >>>
> >>> Links to previous versions:
> >>> v1: https://lore.kernel.org/all/20220104140414.155198-1-brauner@xxxxxxxxxx/
> >>> v2: https://lore.kernel.org/lkml/20230524153316.476973-1-aleksandr.mikhalitsyn@xxxxxxxxxxxxx/
> >>> tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v2
> >>> v3: https://lore.kernel.org/lkml/20230607152038.469739-1-aleksandr.mikhalitsyn@xxxxxxxxxxxxx/#t
> >>> v4: https://lore.kernel.org/lkml/20230607180958.645115-1-aleksandr.mikhalitsyn@xxxxxxxxxxxxx/#t
> >>> tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v4
> >>> v5: https://lore.kernel.org/lkml/20230608154256.562906-1-aleksandr.mikhalitsyn@xxxxxxxxxxxxx/#t
> >>> tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v5
> >>> v6: https://lore.kernel.org/lkml/20230609093125.252186-1-aleksandr.mikhalitsyn@xxxxxxxxxxxxx/
> >>> tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v6
> >>> v7: https://lore.kernel.org/all/20230726141026.307690-1-aleksandr.mikhalitsyn@xxxxxxxxxxxxx/
> >>> tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v7
> >>> v8: https://lore.kernel.org/all/20230803135955.230449-1-aleksandr.mikhalitsyn@xxxxxxxxxxxxx/
> >>> tree: -
> >>> v9: https://lore.kernel.org/all/20230804084858.126104-1-aleksandr.mikhalitsyn@xxxxxxxxxxxxx/
> >>> tree: https://github.com/mihalicyn/linux/commits/fs.idmapped.ceph.v9
> >>>
> >>> Kind regards,
> >>> Alex
> >>>
> >>> Original description from Christian:
> >>> ========================================================================
> >>> This patch series enables cephfs to support idmapped mounts, i.e. the
> >>> ability to alter ownership information on a per-mount basis.
> >>>
> >>> Container managers such as LXD support sharaing data via cephfs between
> >>> the host and unprivileged containers and between unprivileged containers.
> >>> They may all use different idmappings. Idmapped mounts can be used to
> >>> create mounts with the idmapping used for the container (or a different
> >>> one specific to the use-case).
> >>>
> >>> There are in fact more use-cases such as remapping ownership for
> >>> mountpoints on the host itself to grant or restrict access to different
> >>> users or to make it possible to enforce that programs running as root
> >>> will write with a non-zero {g,u}id to disk.
> >>>
> >>> The patch series is simple overall and few changes are needed to cephfs.
> >>> There is one cephfs specific issue that I would like to discuss and
> >>> solve which I explain in detail in:
> >>>
> >>> [PATCH 02/12] ceph: handle idmapped mounts in create_request_message()
> >>>
> >>> It has to do with how to handle mds serves which have id-based access
> >>> restrictions configured. I would ask you to please take a look at the
> >>> explanation in the aforementioned patch.
> >>>
> >>> The patch series passes the vfs and idmapped mount testsuite as part of
> >>> xfstests. To run it you will need a config like:
> >>>
> >>> [ceph]
> >>> export FSTYP=ceph
> >>> export TEST_DIR=/mnt/test
> >>> export TEST_DEV=10.103.182.10:6789:/
> >>> export TEST_FS_MOUNT_OPTS="-o name=admin,secret=$password
> >>>
> >>> and then simply call
> >>>
> >>> sudo ./check -g idmapped
> >>>
> >>> ========================================================================
> >>>
> >>> Alexander Mikhalitsyn (3):
> >>> fs: export mnt_idmap_get/mnt_idmap_put
> >>> ceph: add enable_unsafe_idmap module parameter
> >>> ceph: pass idmap to __ceph_setattr
> >>>
> >>> Christian Brauner (9):
> >>> ceph: stash idmapping in mdsc request
> >>> ceph: handle idmapped mounts in create_request_message()
> >>> ceph: pass an idmapping to mknod/symlink/mkdir
> >>> ceph: allow idmapped getattr inode op
> >>> ceph: allow idmapped permission inode op
> >>> ceph: allow idmapped setattr inode op
> >>> ceph/acl: allow idmapped set_acl inode op
> >>> ceph/file: allow idmapped atomic_open inode op
> >>> ceph: allow idmapped mounts
> >>>
> >>> fs/ceph/acl.c | 6 +--
> >>> fs/ceph/crypto.c | 2 +-
> >>> fs/ceph/dir.c | 4 ++
> >>> fs/ceph/file.c | 11 ++++-
> >>> fs/ceph/inode.c | 29 +++++++------
> >>> fs/ceph/mds_client.c | 78 ++++++++++++++++++++++++++++++++---
> >>> fs/ceph/mds_client.h | 8 +++-
> >>> fs/ceph/super.c | 7 +++-
> >>> fs/ceph/super.h | 3 +-
> >>> fs/mnt_idmapping.c | 2 +
> >>> include/linux/ceph/ceph_fs.h | 10 ++++-
> >>> include/linux/mnt_idmapping.h | 3 ++
> >>> 12 files changed, 136 insertions(+), 27 deletions(-)
> >>>
>