Re: A problem about DIRECT IO on ext3

From: Jens Axboe
Date: Mon Oct 17 2005 - 12:54:00 EST


On Mon, Oct 17 2005, Badari Pulavarty wrote:
> On Mon, 2005-10-17 at 11:51 +0200, Jens Axboe wrote:
> > On Mon, Oct 17 2005, li nux wrote:
> > >
> > >
> > > --- Jens Axboe <axboe@xxxxxxx> wrote:
> > >
> > > > On Mon, Oct 17 2005, Grzegorz Kulewski wrote:
> > > > > On Mon, 17 Oct 2005, Jens Axboe wrote:
> > > > > >>how to correct this problem ?
> > > > > >
> > > > > >See your buffer address, it's not aligned. You
> > > > need to align that as
> > > > > >well. This is needed because the hardware will
> > > > dma directly to the user
> > > > > >buffer, and to be on the safe side we require the
> > > > same alignment as the
> > > > > >block layer will normally generate for file
> > > > system io.
> > > > > >
> > > > > >So in short, just align your read buffer to the
> > > > same as your block size
> > > > > >and you will be fine. Example:
> > > > > >
> > > > > >#define BS (4096)
> > > > > >#define MASK (BS - 1)
> > > > > >#define ALIGN(buf) (((unsigned long) (buf) +
> > > > MASK) & ~(MASK))
> > > > > >
> > > > > >char *ptr = malloc(BS + MASK);
> > > > > >char *buf = (char *) ALIGN(ptr);
> > > > > >
> > > > > >read(fd, buf, BS);
> > > > >
> > > > > Shouldn't one use posix_memalign(3) for that?
> > > >
> > > > Dunno if one 'should', one 'can' if one wants to. I
> > > > prefer to do it
> > > > manually so I don't have to jump through #define
> > > > hoops to get at it
> > > > (which, btw, still doesn't expose it on this
> > > > machine).
> > > >
> > > > --
> > > > Jens Axboe
> > >
> > > Thanx a lot Jens :-)
> > > Its working now.
> > > I did not have to make these adjustments on 2.6
> > > Is looks to be having more relaxation.
> >
> > 2.6 does have the option of checking the hardware dma requirement
> > seperately, but for this path you should run into the same restrictions.
> > Perhaps you just got lucky when testing 2.6?
>
> 2.6 also has the same restriction. But, if the "filesystem
> blocksize alignment" (soft block size) fails, we try to see
> if its aligned with hard sector size (512). If so, we can do the IO.
>
> 2.4 fails if the offset or buffer is NOT filesystem blocksize
> aligned. Period.

I'm aware of that, however this particular case was about the buffer
alignment (which was 32-bytes in the strace). And that should not work
for 2.6 either.

The block-size alignment is really a separate property of correctness.

> BTW, posix_memalign() or valloc() should be safe.

Certainly.

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/