Re: xfs: does mkfs.xfs require fancy switches to get decent performance? (was Tux3 Report: How fast can we fsync?)

From: Christian Stroetmann
Date: Tue May 12 2015 - 17:31:47 EST


On 12.05.2015 22:54, Daniel Phillips wrote:
On 05/12/2015 11:39 AM, David Lang wrote:
On Mon, 11 May 2015, Daniel Phillips wrote:
...it's the mm and core kernel developers that need to
review and accept that code *before* we can consider merging tux3.
Please do not say "we" when you know that I am just as much a "we"
as you are. Merging Tux3 is not your decision. The people whose
decision it actually is are perfectly capable of recognizing your
agenda for what it is.

http://www.phoronix.com/scan.php?page=news_item&px=MTA0NzM
"XFS Developer Takes Shots At Btrfs, EXT4"
Umm, Phoronix has no input on what gets merged into the kernel. They also have a reputation for
trying to turn anything into click-bait by making it sound like a fight when it isn't.
Perhaps you misunderstood. Linus decides what gets merged. Andrew
decides. Greg decides. Dave Chinner does not decide, he just does
his level best to create the impression that our project is unfit
to merge. Any chance there might be an agenda?

Phoronix published a headline that identifies Dave Chinner as
someone who takes shots at other projects. Seems pretty much on
the money to me, and it ought to be obvious why he does it.

Maybe Dave has convincing arguments that have been misinterpreted by that website, which is an interesting but also highly manipulative publication.

The real question is, has the Linux development process become
so political and toxic that worthwhile projects fail to benefit
from supposed grassroots community support. You are the poster
child for that.
The Linux development process is making code available, responding to concerns from the experts in
the community, and letting the code talk for itself.
Nice idea, but it isn't working. Did you let the code talk to you?
Right, you let the code talk to Dave Chinner, then you listen to
what Dave Chinner has to say about it. Any chance that there might
be some creative licence acting somewhere in that chain?

What we are missing is the complete, usable thing.

There have been many people pushing code for inclusion that has not gotten into the kernel, or has
not been used by any distros after it's made it into the kernel, in spite of benchmarks being posted
that seem to show how wonderful the new code is. ReiserFS was one of the first, and part of what
tarnished its reputation with many people was how much they were pushing the benchmarks that were
shown to be faulty (the one I remember most vividly was that the entire benchmark completed in <30
seconds, and they had the FS tuned to not start flushing data to disk for 30 seconds, so the entire
'benchmark' ran out of RAM without ever touching the disk).
You know what to do about checking for faulty benchmarks.

So when Ted and Dave point out problems with the benchmark (the difference in behavior between a
single spinning disk, different partitions on the same disk, SSDs, and ramdisks), you would be
better off acknowledging them and if you can't adjust and re-run the benchmarks, don't start
attacking them as a result.
Ted and Dave failed to point out any actual problem with any
benchmark. They invented issues with benchmarks and promoted those
as FUD.

In general, benchmarks are a critical issue. In this regard, let me paraphrase a quote attributed to Churchill:
Do not trust a benchmark that you have not forged yourself.

As Dave says above, it's not the other filesystem people you have to convince, it's the core VFS and
Memory Management folks you have to convince. You may need a little benchmarking to show that there
is a real advantage to be gained, but the real discussion is going to be on the impact that page
forking is going to have on everything else (both in complexity and in performance impact to other
things)
Yet he clearly wrote "we" as if he believes he is part of it.

Now that ENOSPC is done to a standard way beyond what Btrfs had
when it was merged, the next item on the agenda is writeback. That
involves us and VFS people as you say, and not Dave Chinner, who
only intends to obstruct the process as much as he possibly can. He
should get back to work on his own project. Nobody will miss his
posts if he doesn't make them. They contribute nothing of value,
create a lot of bad blood, and just serve to further besmirch the
famously tarnished reputation of LKML.

I, at least, would miss his contributions, specifically his technical explanations but also his opinions.

You know that Tux3 is already fast. Not just that of course. It
has a higher standard of data integrity than your metadata-only
journalling filesystem and a small enough code base that it can
be reasonably expected to reach the quality expected of an
enterprise class filesystem, quite possibly before XFS gets
there.
We wouldn't expect anyone developing a new filesystem to believe any differently.
It is not a matter of belief, it is a matter of testable fact. For
example, you can count the lines. You can run the same benchmarks.

Proving the data consistency claims would be a little harder, you
need tools for that, and some of those aren't built yet. Or, if you
have technical ability, you can read the code and the copious design
material that has been posted and convince yourself that, yes, there
is something cool here, why didn't anybody do it that way before?
But of course that starts to sound like work. Debating nontechnical
issues and playing politics seems so much more like fun.

If they didn't
believe this, why would they be working on the filesystem instead of just using an existing filesystem?
Right, and it is my job to convince you that what I believe for
perfectly valid, demonstrable technical reasons, is really true. I do
not see why you feel it is your job to convince me that the obviously
broken Linux community process is not in fact broken, and that a
certain person who obviously has an agenda, is not actually obstructing.

The ugly reality is that everyone's early versions of their new filesystem look really good. The
problem is when they extend it to cover the corner cases and when it gets stressed by real-world (as
opposed to benchmark) workloads. This isn't saying that you are wrong in your belief, just that you
may not be right, and nobody will know until you are at a usable state and other people can start
beating on it.
With ENOSPC we are at that state. Tux3 would get more testing and advance
faster if it was merged. Things like ifdefs, grandiose new schemes for
writeback infrastructure, dumb little hooks in the mkwrite path, those
are all just manufactured red herrings. Somebody wanted those to be
issues, so now they are issues. Fake ones.

Nobody is trying to trick you. Just stating a fact. You ought to be able
to figure out by now that Tux3 is worth merging.

You might possibly have an argument that merging a filesystem that
crashes as soon as it fills the disk is just sheer stupidity that can
only lead to embarrassment in the long run, but then you would need to
explain why Btrfs was merged. As I recall, it went something like, Chris
had it on a laptop, so it must be a filesystem, and wow look at that
feature list. Then it got merged in a completely unusable state and got
worked on. If it had not been merged, Btrfs would most likely be dead
right now. After all, who cares about an out of tree filesystem?

I would like to make two points about this statement:
Firstly, Btrfs was backed by Oracle, which is an entirely different scale of support than a small group of developers has.
Secondly, you are right with your complaints. That said, we do not want to make the same mistake again with Tux3 or any other file system.


By the way, I gave my Tux3 presentation at SCALE 7x in Los Angeles in
2009, with Tux3 running as my root filesystem. By the standard applied
to Btrfs, Tux3 should have been merged then, right? After all, our
nospace handling worked just as well as theirs at that time.

As far as I can remember from the posts on the mailing list, Tux3 has changed so significantly over the last 6 years, including in the features that I always reference, that it can no longer be equated with what was presented in 2009.


Regards,

Daniel

Thanks
Best regards
Have fun
C.S.