Re: [RFC PATCH 00/18] ceph, rbd: Collapse all the I/O types down to something iov_iter-based

From: Xiubo Li
Date: Sun Aug 27 2023 - 21:32:18 EST



On 8/4/23 21:13, David Howells wrote:
Hi Ilya, Xiubo,

[!] NOTE: This is a preview of a work in progress and doesn't yet fully
compile, let alone actually work!

Here are some patches that (mostly) collapse the different I/O types
(PAGES, PAGELIST, BVECS, BIO) down to a single one. I added a new type,
ceph_databuf, to make this easier. The page list is attached to that as a
bio_vec[] with an iov_iter, but could also be some other type supported by
the iov_iter. The iov_iter defines the data or buffer to be used. I have
an additional iov_iter type implemented that allows use of a straight
folio[] or page[] instead of a bio_vec[] that I can deploy if that proves
more useful.
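
As a rough illustration of the container idea described above, here is a minimal userspace sketch in C. It is not the kernel code from the series: struct seg stands in for a bio_vec, struct seg_iter for an iov_iter, and struct databuf_model for ceph_databuf. All three names, and both helper functions, are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bio_vec: one contiguous segment. */
struct seg {
	unsigned char *base;	/* start of the backing buffer */
	size_t offset;		/* first valid byte within base */
	size_t len;		/* number of valid bytes */
};

/* Hypothetical stand-in for an iov_iter: a cursor over a segment array. */
struct seg_iter {
	const struct seg *segs;
	size_t nr_segs;
	size_t seg_idx;		/* index of the current segment */
	size_t seg_off;		/* bytes already consumed in that segment */
	size_t count;		/* bytes remaining overall */
};

/* Hypothetical model of the container: the segments plus their iterator. */
struct databuf_model {
	struct seg *segs;
	size_t nr_segs;
	struct seg_iter iter;
};

/* Attach a segment array and point the iterator at its first byte. */
void databuf_model_init(struct databuf_model *db, struct seg *segs,
			size_t nr_segs)
{
	size_t i, total = 0;

	assert(db != NULL && (segs != NULL || nr_segs == 0));
	for (i = 0; i < nr_segs; i++)
		total += segs[i].len;

	db->segs = segs;
	db->nr_segs = nr_segs;
	db->iter = (struct seg_iter){
		.segs = segs, .nr_segs = nr_segs, .count = total,
	};
}

/* Copy up to len bytes out of the iterator and advance it.
 * Returns the number of bytes actually copied. */
size_t seg_iter_copy_out(struct seg_iter *it, void *dst, size_t len)
{
	unsigned char *out = dst;
	size_t copied = 0;

	while (copied < len && it->count > 0) {
		const struct seg *s = &it->segs[it->seg_idx];
		size_t avail = s->len - it->seg_off;
		size_t want = len - copied;
		size_t n = want < avail ? want : avail;

		memcpy(out + copied, s->base + s->offset + it->seg_off, n);
		copied += n;
		it->seg_off += n;
		it->count -= n;
		if (it->seg_off == s->len) {	/* segment exhausted */
			it->seg_idx++;
			it->seg_off = 0;
		}
	}
	return copied;
}
```

A caller only ever touches the iterator, so the backing array could in principle be swapped for another representation without changing the consumers, which is the property the cover letter is after.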

The conversion isn't quite complete:

(1) rbd is done; BVECS and BIO types are replaced with ceph_databuf.

(2) ceph_osd_linger_request::preply_pages needs switching over to a
ceph_databuf, but I haven't yet managed to work out how the pages that
handle_watch_notify() sticks in there come about.

(3) I haven't altered data transmission in net/ceph/messenger*.c yet. The
aim is to reduce it to a single sendmsg() call for each ceph_msg_data
struct, using the iov_iter therein.

(4) The data reception routines in net/ceph/messenger*.c also need
modifying to pass each ceph_msg_data::iter to recvmsg() in turn.

(5) It might be possible to merge struct ceph_databuf into struct
ceph_msg_data and eliminate the former.

(6) fs/ceph/ still needs some work to clean up the use of page arrays.

(7) I would like to replace the front and middle buffers with a ceph_databuf,
vmapping them when we need to access them.

I added a kmap_ceph_databuf_page() macro that gets a page and calls
kmap_local_page() on it. This hides the bvec[] inside, which makes it easier
to replace.

Anyway, if anyone has any thoughts...


I've pushed the patches here also:

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-extract

David

David Howells (18):
iov_iter: Add function to see if buffer is all zeros
ceph: Rename alignment to offset
ceph: Add a new data container type, ceph_databuf
ceph: Convert ceph_mds_request::r_pagelist to a databuf
rbd: Use ceph_databuf for rbd_obj_read_sync()
ceph: Change ceph_osdc_call()'s reply to a ceph_databuf
ceph: Unexport osd_req_op_cls_request_data_pages()
ceph: Remove osd_req_op_cls_response_data_pages()

David,

I think the titles of the patches in net/ceph/ should be prefixed with "libceph: XXX"?

Thanks

- Xiubo


ceph: Convert notify_id_pages to a ceph_databuf
rbd: Switch from using bvec_iter to iov_iter
ceph: Remove bvec and bio data container types
ceph: Convert some page arrays to ceph_databuf
ceph: Convert users of ceph_pagelist to ceph_databuf
ceph: Remove ceph_pagelist
ceph: Convert ceph_osdc_notify() reply to ceph_databuf
ceph: Remove CEPH_OS_DATA_TYPE_PAGES and its attendant helpers
ceph: Remove CEPH_MSG_DATA_PAGES and its helpers
ceph: Don't use data_pages

 drivers/block/rbd.c             | 645 ++++++++++----------------------
 fs/ceph/acl.c                   |  39 +-
 fs/ceph/addr.c                  |  18 +-
 fs/ceph/file.c                  | 157 ++++----
 fs/ceph/inode.c                 |  85 ++---
 fs/ceph/locks.c                 |  23 +-
 fs/ceph/mds_client.c            | 134 ++++---
 fs/ceph/mds_client.h            |   2 +-
 fs/ceph/super.h                 |   8 +-
 fs/ceph/xattr.c                 |  68 ++--
 include/linux/ceph/databuf.h    |  65 ++++
 include/linux/ceph/messenger.h  | 141 +------
 include/linux/ceph/osd_client.h |  97 ++---
 include/linux/ceph/pagelist.h   |  72 ----
 include/linux/uio.h             |   1 +
 lib/iov_iter.c                  |  22 ++
 net/ceph/Makefile               |   5 +-
 net/ceph/cls_lock_client.c      |  40 +-
 net/ceph/databuf.c              | 149 ++++++++
 net/ceph/messenger.c            | 376 +------------------
 net/ceph/osd_client.c           | 430 +++++++--------------
 net/ceph/pagelist.c             | 171 ---------
 22 files changed, 876 insertions(+), 1872 deletions(-)
 create mode 100644 include/linux/ceph/databuf.h
 delete mode 100644 include/linux/ceph/pagelist.h
 create mode 100644 net/ceph/databuf.c
 delete mode 100644 net/ceph/pagelist.c