Re: 1352 NUL bytes at the end of a page? (was Re: Assertion `s &&s->tree' failed: The saga continues.)

From: Linus Torvalds
Date: Mon May 17 2004 - 09:13:49 EST




On Sun, 16 May 2004, Andrew Morton wrote:
>
> i_size is updated in generic_commit_write(), on a per-page basis, or I'm
> missing something? I sure hope so.

It's updated AFTER we drop the page lock. It's not enough to do it
page-per-page, you have to do it protected by the only thing that
block_write_full_page() sees, namely the page lock.

So what can happen is

copy data from user space to page
commit_write()
unlock_page()
**preemption happens**
fsync
block_write_full_page()
doen't see the new i_size, clears the data
**preempt back**
update i_size.

end result: zeroes where the process wrote stuff.

> As for O_DIRECT: I need to think about that a bit more. We hold i_sem and
> have done an fdatasync prior to entering generic_file_aio_write_nolock() so
> there should be no dirty pagecache at this stage anyway.

That just hides the race.

> But I doubt if bk is using direct-IO in combination with MAP_SHARED...

Absolutely. I think the bug happens for the regular case, simply because
nobody is even _using_ direct-IO. But in direct-IO, the race is about a
million times bigger, because we won't actually update i_size until much
much later.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/