Re: [PATCH] BUG on fsync/fdatasync with Ext3 data=journal

From: Kenichi Okuyama
Date: Sat Sep 18 2004 - 15:50:08 EST


Dear Mr. Morton, Seiji, and all,

>>>>> "AM" == Andrew Morton <akpm@xxxxxxxx> writes:
AM> Yes, the I_DIRTY test is bogus because data pages are not marked dirty at
AM> write() time when the filesystem is mounted in data=journal mode.
AM> However your patch will disable the above optimisation for data=writeback
AM> and data-ordered modes as well. I don't think that's necessary?

I don't think Mr. Morton's code have any advantages over Seiji's patch.


Please look at lines below. Line starting with AM> + are the point
Mr. Morton have added the code ( point where you removed are bit
above, and not in the lines ).


74 if (ext3_should_journal_data(inode)) {
75 ret = ext3_force_commit(inode->i_sb);
76 goto out;
77 }
AM> + smp_mb(); /* prepare for lockless i_state read */
AM> + if (!(inode->i_state & I_DIRTY))
AM> + goto out;
AM> +
78
79 /*
80 * The VFS has written the file data. If the inode is unaltered
81 * then we need not start a commit.
82 */
83 if (inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC)) {
84 struct writeback_control wbc = {
85 .sync_mode = WB_SYNC_ALL,
86 .nr_to_write = 0, /* sys_fsync did this */
87 };
88 ret = sync_inode(inode, &wbc);
89 }
90 out:
91 return ret;

Now. Please note that
#define I_DIRTY (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)
is definition of macro 'I_DIRTY'. As result, Mr. Morton's patch is
saying that:

if (!(inode->i_state & (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGE))
goto out;
if (inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC)) {
struct writeback_control wbc = {
.sync_mode = WB_SYNC_ALL,
.nr_to_write = 0, /* sys_fsync did this */
};
ret = sync_inode(inode, &wbc);
}
out:


But this is equivalent to following code ( think carefully :-)

if (inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC)) {
struct writeback_control wbc = {
.sync_mode = WB_SYNC_ALL,
.nr_to_write = 0, /* sys_fsync did this */
};
ret = sync_inode(inode, &wbc);
}
out:

whch turns out to be what Seiji's patch was.


Hence, Mr. Morton's patch have no OPTIMIZATION over Seiji's code.
( if gcc is smart enough, Mr. Morton's code should have no effect to
binary. If not, it's overhead. ).


My worry is follows.

Basically, Seiji's patch is better. But in that case,
smp_mb() call right before accessing to inode->i_state will
disappear. Is this safe.....

I am not sure because even without Seiji's patch,
codes at line 83 did exist. And it was working... wasn't it?

In the case smp_mb() was simply not nessasary, Seiji's patch
will do everything. In case smp_mb() was nessasary, we were
lacking one right before line 83.


best regards,
----
Kenichi Okuyama

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/