Re: (reiserfs) Re: reiserfs and knfsd and NFSv4 and volatile file handles

From: Hans Reiser (reiser@idiom.com)
Date: Mon Mar 20 2000 - 07:14:07 EST


Chris Mason wrote:
>
> [ NFS list removed from the cc ]
>
> On Sat, 18 Mar 2000, Hans Reiser wrote:
>
> > I leave the below for Alexei and Vladimir to answer in detail. Generally
> > speaking, we can only do clean effective SMP after implementing per buffer
> > seals. I see that as unlikely to make it into 2.4 due to it requiring a per
> > buffer/page fs specific struct, but it is actively being coded by Zarochentcev
> > and Roma. If 2.4 is slow enough in coming out, maybe we will submit it just in
> > case Linus will take it. Our SMP will be poor until we have it.
> >
>
> Our reads are hit the most by this, as ext2 can read without the big
> kernel lock held, and we can't. But, Hans, I thought we had some unused
> bits in the block_head struct that could be used to make the seals. That
> would allow us to fully thread the tree access, without changing the
> buffer head or page structs.

Ummh, I am glad you remember my ideas better than I do. :-)

Ok, but this must in any event wait for Alexei to finish the VFS audit, because
we have no one free for it at this time: Roma has indicated he doesn't have time
available right now for it, and Alexei is the only person available unless I
give the task to zam.

>
> > > Umm... I'd still like to hear a description of your internal locking. In
> > > particular, what do you lock/release upon reiserfs_find_entry()/pathrelse()?
> > > When do you rebalance the tree? How do you do serialization between that
> > > and things a-la write_inode()/readdir()/lookup()? Currently locking is
> > > masked by the VFS one and that's one of the reasons why I want to see
> > > cleaned variant.
> >
> Well, it is a bit ugly. The tree is balanced anytime things are added,
> removed, or resized. Sometimes this doesn't actually require shifting
> data around, we have early exits for that sort of thing. Vladimir, please
> correct me if I'm wrong here:
>
> We are acting inside the big kernel lock, and we build a struct of all the
> nodes that need to be a part of the balance. If we schedule while
> building that struct, we check to see if another balance happened during
> the schedule (via generation counter). Once we've gathered the
> information required, the balance is done, in one big schedule free loop.
>
> Simply put, this sucks. I would really like to see the per buffer locks,
> either with free bits in our on disk structures, or with a locking bit in
> the buffer head (the existing one, or one we add). Then nodes could be
> locked in tree depth order, and our balancing code could be much
> cleaner.
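
The generation-counter check described above can be sketched roughly as
follows. This is only a single-threaded model under stated assumptions, not
the ReiserFS code: struct tree, do_balance(), balance_with_retry() and the
maybe_schedule hook are all hypothetical names invented for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the tree; not a ReiserFS structure. */
struct tree {
    unsigned long generation;  /* bumped by every completed balance */
    int balances_done;
};

/* Hypothetical hook modelling "we may schedule while gathering". */
typedef void (*sched_hook)(struct tree *);

/* The schedule-free balance itself: it only bumps the counters here. */
static void do_balance(struct tree *t)
{
    t->generation++;
    t->balances_done++;
}

/* Demo helper: a competing balance that runs a limited number of times. */
static int interference_left;
static void interfere_once(struct tree *t)
{
    if (interference_left > 0) {
        interference_left--;
        do_balance(t);
    }
}

/*
 * Gather the nodes (possibly sleeping), start over if another balance
 * happened in the meantime, then balance. Returns the number of
 * gathering passes that were needed.
 */
static int balance_with_retry(struct tree *t, sched_hook maybe_schedule)
{
    int passes = 0;

    for (;;) {
        unsigned long seen = t->generation;

        passes++;
        if (maybe_schedule)
            maybe_schedule(t);      /* gathering may block here */
        if (seen != t->generation)
            continue;               /* gathered data is stale, redo it */
        do_balance(t);              /* no scheduling from here on */
        return passes;
    }
}
```

With no interference one gathering pass is enough; each competing balance
that slips in during a schedule costs one extra pass.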

For those not familiar with the intended seal strategy: first we will seal all
nodes and gather them in RAM. Then we will replace each seal with a lock,
checking the seal after placing each lock and recomputing if it is broken. Then
we balance the nodes, and then we unlock. Locks are acquired in a fixed order;
which order doesn't matter so long as it is consistent. Seals are broken by any
write to a node.
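
A minimal sketch of that sequence, assuming a seal is just a remembered
per-node write counter and that block number serves as the consistent lock
order. All names here (struct node, struct seal, lock_sealed() and so on) are
hypothetical, and the model is single-threaded: real code would block when
taking each lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node; the fields are assumptions, not on-disk layout. */
struct node {
    unsigned long blocknr;  /* used only as a consistent lock order */
    unsigned long version;  /* bumped by every write to the node */
    bool locked;
};

/* A seal remembers the version seen when the node was gathered. */
struct seal {
    struct node *node;
    unsigned long version;
};

static struct seal seal_node(struct node *n)
{
    return (struct seal){ n, n->version };
}

static bool seal_intact(const struct seal *s)
{
    return s->node->version == s->version;
}

/* Any write breaks every outstanding seal on the node. */
static void write_node(struct node *n)
{
    n->version++;
}

static void unlock_all(struct seal *seals, size_t n)
{
    for (size_t i = 0; i < n; i++)
        seals[i].node->locked = false;
}

/*
 * Replace seals with locks, in array order (the caller is assumed to have
 * sorted the array by blocknr). The seal is checked after each lock is
 * placed; if it is broken, everything is unlocked and false is returned
 * so the caller can reseal and recompute.
 */
static bool lock_sealed(struct seal *seals, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        seals[i].node->locked = true;
        if (!seal_intact(&seals[i])) {
            unlock_all(seals, i + 1);
            return false;
        }
    }
    return true;
}
```

The caller would loop: seal and gather, call lock_sealed(), recompute on
failure, and only balance and then unlock_all() once every lock is held with
its seal intact.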

Hans

-- 
You can get ReiserFS at http://devlinux.org/namesys, and customizations and
industrial grade support at reiser@idiom.com.
