Re: [FYI] tux3: Core changes

From: David Lang
Date: Tue May 19 2015 - 16:34:06 EST


On Tue, 19 May 2015, Daniel Phillips wrote:

>> I understand that Tux3 may avoid these issues due to some other mechanisms
>> it internally has but if page forking should get into mm subsystem, the
>> above must work.
>
> It does work, and by example, it does not need a lot of code to make
> it work, but the changes are not trivial. Tux3's delta writeback model
> will not suit everyone, so you can't just lift our code and add it to
> Ext4. Using it in Ext4 would require a per-inode writeback model, which
> looks practical to me but far from a weekend project. Maybe something
> to consider for Ext5.
>
> It is the job of new designs like Tux3 to chase after that final drop
> of performance, not our trusty Ext4 workhorse. Though stranger things
> have happened - as I recall, Ext4 had O(n) directory operations at one
> time. Fixing that was not easy, but we did it because we had to. Fixing
> Ext4's write performance is not urgent by comparison, and the barrier
> is high; you would want jbd3 for one thing.
>
> I think the meta-question you are asking is, where is the second user
> for this new CoW functionality? With a possible implication that if
> there is no second user then Tux3 cannot be merged. Is that the
> question?

I don't think they are asking for a second user. What they are saying is that for this functionality to be accepted into the mm subsystem, these problem cases need to work reliably in general, not just work for Tux3 because of your implementation.

So for things that you don't use, you need to either make it an error if they get used on a page that has been forked, or not make it an error and 'do the right thing'.
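
As a rough illustration of the "make it an error" option, here is a userspace sketch in C. Everything in it is hypothetical: struct forkable_page, its forked flag, and demo_pin_page() are made-up names for this example, not kernel or Tux3 interfaces, and the choice of -EOPNOTSUPP is only an assumption about what a suitable error might be.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page; NOT the kernel's struct page. */
struct forkable_page {
    bool forked;     /* assumed flag: set once the page has been forked */
    int pin_count;   /* state touched by the unsupported operation */
};

/*
 * Hypothetical operation that the forking code does not handle.
 * Rather than silently acting on a stale copy, it refuses to run
 * on a forked page and reports an error to the caller.
 */
static int demo_pin_page(struct forkable_page *p)
{
    if (p->forked)
        return -EOPNOTSUPP;  /* explicit failure instead of undefined behaviour */
    p->pin_count++;
    return 0;
}
```

The point is only that the failure is explicit and visible to the caller, whichever filesystem ends up using page forking.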

For cases where it doesn't matter to Tux3 because Tux3 controls the writeback, but where it's undefined in general what happens if writeback is triggered twice on the same page, you will need to figure out how to either prevent the second writeback from triggering while one is in progress, or define how the two writebacks are going to happen so that you can't end up with them re-ordered by some other filesystem.
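
A minimal sketch of the first option (refusing a second writeback while one is in flight), written as plain userspace C11 rather than kernel code. The names struct wb_page, demo_start_writeback() and demo_end_writeback() are invented for this example, and the atomic flag is just an assumption about one way such a guard could look; it says nothing about how the mm subsystem would actually do it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page; NOT the kernel's struct page. */
struct wb_page {
    atomic_flag under_writeback;  /* set while a writeback is in flight */
    int writes_started;           /* how many writebacks actually began */
};

static void demo_page_init(struct wb_page *p)
{
    atomic_flag_clear(&p->under_writeback);
    p->writes_started = 0;
}

/*
 * Try to begin writeback. Returns true if the caller now owns the
 * writeback, false if another writeback is already in progress.
 * The test-and-set is atomic, so two racing callers cannot both win.
 */
static bool demo_start_writeback(struct wb_page *p)
{
    if (atomic_flag_test_and_set(&p->under_writeback))
        return false;             /* second writeback is refused */
    p->writes_started++;          /* only the owner gets here */
    return true;
}

/* Called by the owner when its I/O completes; allows the next writeback. */
static void demo_end_writeback(struct wb_page *p)
{
    atomic_flag_clear(&p->under_writeback);
}
```

A caller that gets false back would have to retry later or wait for the first writeback to finish; which of those is right is exactly the kind of policy that would need to be defined for every filesystem, not just one.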

I think that's what is meant by the top statement that I left in the quote. Even if your implementation details make it safe, these cases need to be safe without your implementation details to be acceptable in the core kernel.

David Lang