Re: [RFC PATCH V2 07/12] fs: Add locking for a dynamic inode 'mode'

From: Darrick J. Wong
Date: Mon Jan 13 2020 - 20:04:36 EST


On Mon, Jan 13, 2020 at 04:20:05PM -0800, Ira Weiny wrote:
> On Mon, Jan 13, 2020 at 02:12:18PM -0800, Darrick J. Wong wrote:
> > On Fri, Jan 10, 2020 at 11:29:37AM -0800, ira.weiny@xxxxxxxxx wrote:
> > > From: Ira Weiny <ira.weiny@xxxxxxxxx>
>
> [snip]
>
> > >
> > > The File Object
> > > ---------------
> > > @@ -437,6 +459,8 @@ As of kernel 2.6.22, the following members are defined:
> > > int (*atomic_open)(struct inode *, struct dentry *, struct file *,
> > > unsigned open_flag, umode_t create_mode);
> > > int (*tmpfile) (struct inode *, struct dentry *, umode_t);
> > > + void (*lock_mode)(struct inode *);
> > > + void (*unlock_mode)(struct inode *);
> >
> > Yikes. "mode" has a specific meaning for inodes, and this lock isn't
> > related to i_mode. This lock protects aops from changing while an
> > address space operation is in use.
>
> Ah... yea ok mode is a bad name.
>
> >
> > > };
> > >
> > > Again, all methods are called without any locks being held, unless
> > > @@ -584,6 +608,12 @@ otherwise noted.
> > > atomically creating, opening and unlinking a file in given
> > > directory.
> > >
> > > +``lock_mode``
> > > + called to prevent operations which depend on the inode's mode from
> > > + proceeding should a mode change be in progress
> >
> > "Inodes can't change mode, because files do not suddenly become
> > directories". ;)
>
> Yea sorry.
>
> >
> > Oh, you meant "lock_XXXX is called to prevent a change in the pagecache
> > mode from proceeding while there are address space operations in
> > progress". So these are really more aops get and put functions...
>
> At first I actually did have aops get/put functions, but this is really
> protecting more than the aops vector because, as Christoph said, there are
> file operations which need to be protected, not just address space operations.
>
> But I agree "mode" is a bad name... Sorry...

inode_fops_{get,set}(), then?

inode_start_fileop()
inode_end_fileop() ?

Trying to avoid sounding foppish <COUGH>

> >
> > > +``unlock_mode``
> > > + called when critical mode dependent operation is complete
> > >
> > > The Address Space Object
> > > ========================
> > > diff --git a/fs/ioctl.c b/fs/ioctl.c
> > > index 7c9a5df5a597..ed6ab5303a24 100644
> > > --- a/fs/ioctl.c
> > > +++ b/fs/ioctl.c
> > > @@ -55,18 +55,29 @@ EXPORT_SYMBOL(vfs_ioctl);
> > >  static int ioctl_fibmap(struct file *filp, int __user *p)
> > >  {
> > >  	struct address_space *mapping = filp->f_mapping;
> > > +	struct inode *inode = filp->f_inode;
> > >  	int res, block;
> > >
> > > +	lock_inode_mode(inode);
> > > +
> > >  	/* do we support this mess? */
> > > -	if (!mapping->a_ops->bmap)
> > > -		return -EINVAL;
> > > -	if (!capable(CAP_SYS_RAWIO))
> > > -		return -EPERM;
> > > +	if (!mapping->a_ops->bmap) {
> > > +		res = -EINVAL;
> > > +		goto out;
> > > +	}
> > > +	if (!capable(CAP_SYS_RAWIO)) {
> > > +		res = -EPERM;
> > > +		goto out;
> >
> > Why does the order of these checks change here?
>
> I don't understand? The order does not change; we just can't return without
> releasing the lock. And to protect against bmap changing, the lock needs to be
> taken first.

Doh. -ENOCOFFEE, I plead.

--D
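
For anyone else reading along without coffee, here is a minimal userspace
sketch of the pattern Ira describes: take the lock before looking at ->bmap,
keep the checks in their original order, and route every early exit through
one label that drops the lock. All of the demo_* names are hypothetical
stand-ins, not the real VFS types or anything from the patch, and the "lock"
is only a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; not the kernel's struct inode. */
struct demo_inode {
	int lock_depth;          /* > 0 while the demo "lock" is held */
	int (*bmap)(int block);  /* may be NULL, like a_ops->bmap */
};

static void demo_lock(struct demo_inode *inode)
{
	inode->lock_depth++;
}

static void demo_unlock(struct demo_inode *inode)
{
	/* Catch an unlock that was never paired with a lock. */
	assert(inode->lock_depth > 0);
	inode->lock_depth--;
}

/* Example mapping function, purely illustrative. */
static int demo_bmap(int block)
{
	return block + 100;
}

/*
 * Check order is unchanged: ->bmap first, then privilege.  Because the
 * lock is taken before ->bmap is inspected, no path may return directly;
 * every exit goes through "out" so the lock is always released.
 */
static int demo_fibmap(struct demo_inode *inode, bool privileged, int *block)
{
	int res;

	demo_lock(inode);

	if (!inode->bmap) {
		res = -EINVAL;
		goto out;
	}
	if (!privileged) {
		res = -EPERM;
		goto out;
	}
	*block = inode->bmap(*block);
	res = 0;
out:
	demo_unlock(inode);
	return res;
}
```

The single exit label is what forces the braces and the res assignments in
the diff; the checks themselves are the same ones in the same order.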

> [snip]
>
> > >
> > > +static inline void lock_inode_mode(struct inode *inode)
> >
> > inode_aops_get()?
>
> Let me think on this. This is not just getting a reference to the aops vector.
> It is more than that... and inode_get is not right either! ;-P
>
> >
> > > +{
> > > +	WARN_ON_ONCE(inode->i_op->lock_mode &&
> > > +		     !inode->i_op->unlock_mode);
> > > +	if (inode->i_op->lock_mode)
> > > +		inode->i_op->lock_mode(inode);
> > > +}
> > > +static inline void unlock_inode_mode(struct inode *inode)
> > > +{
> > > +	WARN_ON_ONCE(inode->i_op->unlock_mode &&
> > > +		     !inode->i_op->lock_mode);
> > > +	if (inode->i_op->unlock_mode)
> > > +		inode->i_op->unlock_mode(inode);
> > > +}
> > > +
> > > static inline ssize_t call_read_iter(struct file *file, struct kiocb *kio,
> > > struct iov_iter *iter)
> >
> > inode_aops_put()?
>
> ... something like that but not 'aops'...
>
> Ira
>
> >
> > --D
> >