Re: Test12 ll_rw_block error.

From: Russell Cattelan (cattelan@thebarn.com)
Date: Sat Dec 16 2000 - 19:51:36 EST

Next message: Russell Cattelan: "Re: Test12 ll_rw_block error."
Previous message: Lukasz Trabinski: "Re: [patch] 2.2.18 PCI_DEVICE_ID_OXSEMI_16PCI954"
Next in thread: Russell Cattelan: "Re: Test12 ll_rw_block error."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Alexander Viro wrote:

> On Thu, 14 Dec 2000, Linus Torvalds wrote:
>
> > Good point.
> >
> > This actually looks fairly nasty to fix. The obvious fix would be to not
> > put such buffers on the dirty list at all, and instead rely on the VM
> > layer calling "writepage()" when it wants to push out the pages.
> > That would be the nice behaviour from a VM standpoint.
> >
> > However, that assumes that you don't have any "anonymous" buffers, which
> > is probably an unrealistic assumption.
> >
> > The problem is that we don't have any per-buffer "writebuffer()" function,
> > the way we have them per-page. It was never needed for any of the normal
> > filesystems, and XFS just happened to be able to take advantage of the
> > b_end_io behaviour.
> >
> > Suggestions welcome.
>
> Just one: any fs that really cares about completion callback is very likely
> to be picky about the requests ordering. So sync_buffers() is very unlikely
> to be useful anyway.

Actually no, that's not the issue.

The XFS log uses an LSN (Log Sequence Number) to keep track of log write ordering.
Sync IO on each log buffer isn't realistic; the performance hit would be too great.

I wasn't around when most of XFS was developed, but from what I understand it
was discovered early on that firing off writes in a particular order doesn't
guarantee that they will finish in that order. Thus the implementation of a
sequence number for each log write.
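
To illustrate the general idea (this is not the XFS code, and every name below
is hypothetical), here is a minimal sketch of tagging each write with a sequence
number at submit time and letting completions arrive in any order. The tracker
only advances its "stable" point once every earlier write has finished. The
slot count and the simple counter used as an LSN are assumptions made for the
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LOG_SLOTS 8 /* hypothetical limit on in-flight log writes */

/* Hypothetical tracker; none of these names come from XFS. */
struct log_tracker {
    uint64_t next_lsn;     /* sequence number for the next submitted write */
    uint64_t stable_lsn;   /* every LSN below this one has completed */
    bool done[LOG_SLOTS];  /* completion flags, indexed by lsn % LOG_SLOTS */
};

void log_tracker_init(struct log_tracker *t)
{
    t->next_lsn = 0;
    t->stable_lsn = 0;
    for (int i = 0; i < LOG_SLOTS; i++)
        t->done[i] = false;
}

/* Assign an LSN at submit time; fails if too many writes are in flight. */
bool log_submit(struct log_tracker *t, uint64_t *lsn)
{
    if (t->next_lsn - t->stable_lsn >= LOG_SLOTS)
        return false;
    *lsn = t->next_lsn++;
    return true;
}

/* Completion callback; may be invoked in any order. */
void log_complete(struct log_tracker *t, uint64_t lsn)
{
    assert(lsn >= t->stable_lsn && lsn < t->next_lsn);
    t->done[lsn % LOG_SLOTS] = true;

    /* Advance past every write that has finished contiguously. */
    while (t->stable_lsn < t->next_lsn &&
           t->done[t->stable_lsn % LOG_SLOTS]) {
        t->done[t->stable_lsn % LOG_SLOTS] = false;
        t->stable_lsn++;
    }
}
```

With three writes submitted, completing the third one first leaves stable_lsn
where it was; it only moves forward once the earlier writes complete too.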

One of the obstacles we ran into early on in the Linux port was the fact that
Linux used fixed size IO requests to any given device.
But most of XFS's metadata structures vary in size in multiples of 512 bytes.

We were also implementing a page caching / clustering layer called
page_buf which understands primarily pages and not necessarily
disk blocks. If your FS block size happens to match your page size then things
are good, but if it doesn't....
So we added a bitmap field to the page structure.
Each bit then represents one BASIC BLOCK, e.g. 512 bytes for all practical purposes.

The end_io functions XFS defines update the correct bit, or the whole bit array
if the whole page is valid, thus signaling the rest of page_buf that the IO has
completed.
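
As a rough sketch of the bitmap idea described above (not the actual page_buf
code; the structure, function names, and the 4096-byte page size are all
assumptions for illustration), a completion handler could mark 512-byte basic
blocks valid like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BASIC_BLOCK_SIZE 512u
#define SKETCH_PAGE_SIZE 4096u /* assumed page size */
#define BLOCKS_PER_PAGE (SKETCH_PAGE_SIZE / BASIC_BLOCK_SIZE)
#define ALL_VALID ((uint8_t)((1u << BLOCKS_PER_PAGE) - 1))

/* Hypothetical stand-in for the per-page state; one bit per basic block. */
struct sketch_page {
    uint8_t valid_map;
};

/*
 * Hypothetical completion handler: mark bytes [offset, offset + len)
 * of the page as valid. Both values must be multiples of 512.
 */
void page_io_done(struct sketch_page *pg, size_t offset, size_t len)
{
    assert(offset % BASIC_BLOCK_SIZE == 0);
    assert(len % BASIC_BLOCK_SIZE == 0);
    assert(offset + len <= SKETCH_PAGE_SIZE);

    /* Whole page transferred: set the entire bit array at once. */
    if (offset == 0 && len == SKETCH_PAGE_SIZE) {
        pg->valid_map = ALL_VALID;
        return;
    }

    /* Partial transfer: set only the bits covering this range. */
    size_t first = offset / BASIC_BLOCK_SIZE;
    size_t count = len / BASIC_BLOCK_SIZE;
    for (size_t i = 0; i < count; i++)
        pg->valid_map |= (uint8_t)(1u << (first + i));
}

/* The rest of the caching layer can poll this to see whether IO is done. */
bool page_fully_valid(const struct sketch_page *pg)
{
    return pg->valid_map == ALL_VALID;
}
```

A 1024-byte IO landing at offset 512 would set only bits 1 and 2, so the page
stays partially valid until the remaining blocks arrive.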

OK, there is a lot more to it than what I've just described, but you probably get
the idea.

>
>
> In that sense we really don't have anonymous buffers here. I seriously
> suspect that "unrealistic" assumption is not unrealistic at all. I'm
> not sufficiently familiar with XFS code to say for sure, but...
>
> What we really need is a way for VFS/VM to pass the pressure on filesystem.
> That's it. If fs wants unusual completions for requests - let it have its
> own queueing mechanism and submit these requests when it finds that convenient.
>
> Stephen, you probably already thought about that area. Could you comment on
> that?
> Cheers,
> Al

-- Russell Cattelan cattelan@thebarn.com



This archive was generated by hypermail 2b29 : Sat Dec 23 2000 - 21:00:17 EST