Re: [RFC v2 14/14] xfs: enable block size larger than page size support

From: Pankaj Raghav (Samsung)
Date: Wed Feb 14 2024 - 11:42:45 EST


> > struct xfs_inode *ip;
> > + int min_order = 0;
> >
> > /*
> > * XXX: If this didn't occur in transactions, we could drop GFP_NOFAIL
> > @@ -88,7 +89,8 @@ xfs_inode_alloc(
> > /* VFS doesn't initialise i_mode or i_state! */
> > VFS_I(ip)->i_mode = 0;
> > VFS_I(ip)->i_state = 0;
> > - mapping_set_large_folios(VFS_I(ip)->i_mapping);
> > + min_order = max(min_order, ilog2(mp->m_sb.sb_blocksize) - PAGE_SHIFT);
> > + mapping_set_folio_orders(VFS_I(ip)->i_mapping, min_order, MAX_PAGECACHE_ORDER);
>
> That's pretty nasty. You're using max() to hide underflow in the
> subtraction to clamp the value to zero. And you don't need ilog2()
> because we have the log of the block size in the superblock already.
>
> int min_order = 0;
> .....
> if (mp->m_sb.sb_blocksize > PAGE_SIZE)
> min_order = mp->m_sb.sb_blocklog - PAGE_SHIFT;
how is it underflowing if I am comparing two values of type int?

>
> But, really why recalculate this -constant- on every inode
> allocation? That's a very hot path, so this should be set in the
> M_IGEO(mp) structure (mp->m_ino_geo) at mount time and then the code
> is simply:
>
> mapping_set_folio_orders(VFS_I(ip)->i_mapping,
> M_IGEO(mp)->min_folio_order, MAX_PAGECACHE_ORDER);
>

That is a good idea. I will add this change in the next revision.

> We already access the M_IGEO(mp) structure every inode allocation,
> so there's little in way of additional cost here....
>
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 5a2512d20bd0..6a3f0f6727eb 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1625,13 +1625,11 @@ xfs_fs_fill_super(
> > goto out_free_sb;
> > }
> >
> > - /*
> > - * Until this is fixed only page-sized or smaller data blocks work.
> > - */
> > - if (mp->m_sb.sb_blocksize > PAGE_SIZE) {
> > + if (!IS_ENABLED(CONFIG_XFS_LBS) && mp->m_sb.sb_blocksize > PAGE_SIZE) {
> > xfs_warn(mp,
> > "File system with blocksize %d bytes. "
> > - "Only pagesize (%ld) or less will currently work.",
> > + "Only pagesize (%ld) or less will currently work. "
> > + "Enable Experimental CONFIG_XFS_LBS for this support",
> > mp->m_sb.sb_blocksize, PAGE_SIZE);
> > error = -ENOSYS;
> > goto out_free_sb;
>
> This should just issue a warning if bs > ps.
>
> if (mp->m_sb.sb_blocksize > PAGE_SIZE) {
> xfs_warn(mp,
> "EXPERIMENTAL: Filesystem with Large Block Size (%d bytes) enabled.",
> mp->m_sb.sb_blocksize);
> }

Yes! Luis already told me to add a warning here but I missed it before
sending the patches out.

>
> -Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx