Re: [RFC v2a 11/12] net: ceph: use vfs_time data type instead of timespec

From: Dave Chinner
Date: Sat Feb 13 2016 - 17:08:57 EST


On Fri, Feb 12, 2016 at 01:36:05AM -0800, Deepa Dinamani wrote:
> The VFS inode timestamps are not y2038 safe as they use
> struct timespec. These will be changed to use struct timespec64
> instead and that is y2038 safe.
> But, since the above data type conversion will break the end
> file systems, use vfs_time aliases here to access inode times.
>
> These timestamps are passed in as arguments to functions
> using inode timestamps. Hence, these need to change along
> with vfs to support 64 bit timestamps. vfs_time helps do
> this transition.
>
> Signed-off-by: Deepa Dinamani <deepa.kernel@xxxxxxxxx>

Just a point to highlight the problem with this approach:

> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index f8f2359..1273db6 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -2401,7 +2401,7 @@ bad:
> */
> void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off,
> struct ceph_snap_context *snapc, u64 snap_id,
> - struct timespec *mtime)
> + struct vfs_time *mtime)
> {
> struct ceph_msg *msg = req->r_request;
> void *p;

So this change assumes that mtime is not passed by reference to
another function. If we change vfs_time to be a timespec64, then
dereferencing in this function works fine, but passing to another
function will not because that function will be expecting a
timespec.

That, indeed, is what happens here. A few lines into this function:

	if (req->r_flags & CEPH_OSD_FLAG_WRITE)
		ceph_encode_timespec(p, mtime);

And that function:

static inline void ceph_encode_timespec(struct ceph_timespec *tv,
					const struct timespec *ts)
{
	tv->tv_sec = cpu_to_le32((u32)ts->tv_sec);
	tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec);
}

expects a timespec. Even if it did compile with a timespec64, the
(u32) casts would silently lose 64 bit times. I note that in version
2b the mtime was not passed by reference as a vfs_time; it was
converted to a timespec at the call site, so the internal usage of
the timestamp remains unchanged and unaffected by a VFS level
timespec->timespec64 change.

I think an approach that requires changes to the API without
actually being able to verify that they are correct, fully propagated
and don't impact on write/disk formats before the final change of the
VFS type is not going to fly. This is the sort of subtle bug that
can occur with type changes, and hence why I think that the fs
developers should be left to do the conversion of their filesystem
to support 64 bit times (i.e. approach 2b).

Any change is going to take a significant amount of testing and
verification, and that's something we don't have yet. Nobody has
written any tests for xfstests to verify correct 64 bit timestamp
behaviour, nor do we have tests to verify 32 bit timestamp behaviour
on a 64 bit time kernel. These are things that we are going to need;
all filesystems should behave the same w.r.t. these configurations,
so we really do need regression tests for this....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx