Re: [RFC] Thing 1: Shardmap for Ext4

From: Dave Chinner
Date: Sun Dec 08 2019 - 17:43:13 EST


On Thu, Dec 05, 2019 at 09:09:28PM -0800, Daniel Phillips wrote:
> On 2019-12-05 5:16 p.m., Dave Chinner wrote:
> > On Wed, Dec 04, 2019 at 06:41:06PM -0500, Theodore Y. Ts'o wrote:
> >> On Wed, Dec 04, 2019 at 11:31:50AM -0700, Andreas Dilger wrote:
> >>> One important use case that we have for Lustre that is not yet in the
> >>> upstream ext4[*] is the ability to do parallel directory operations.
> >>> This means we can create, lookup, and/or unlink entries in the same
> >>> directory concurrently, to increase parallelism for large directories.
> >>>
> >>> [*] we've tried to submit the pdirops patch a couple of times, but the
> >>> main blocker is that the VFS has a single directory mutex and couldn't
> >>> use the added functionality without significant VFS changes.
> >>> Patch at https://git.whamcloud.com/?p=fs/lustre-release.git;f=ldiskfs/kernel_patches/patches/rhel8/ext4-pdirop.patch;hb=HEAD
> >>>
> >>
> >> The XFS folks recently added support for parallel directory operations
> >> into the VFS, for the benefit of XFS has this feature.
> >
> > The use of shared i_rwsem locking on the directory inode during
> > lookup/pathwalk allows for concurrent lookup/readdir operations on
> > a single directory. However, the parent dir i_rwsem is still held
> > exclusive for directory modifications like create, unlink, etc.
> >
> > IOWs, the VFS doesn't allow for concurrent directory modification
> > right now, and that's going to be the limiting factor no matter what
> > you do with internal filesystem locking.
>
> On a scale of 0 to 10, how hard do you think that would be to relax
> in VFS, given the restriction of no concurrent inter-directory moves?

My initial reaction is to run away screaming in horror. Beyond that,
I have no idea what terrible dangers lurk in the dark shadows where
mortals fear to tread...

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx