Re: xfs: does mkfs.xfs require fancy switches to get decent performance? (was Tux3 Report: How fast can we fsync?)

From: Howard Chu
Date: Thu Apr 30 2015 - 12:01:00 EST


Daniel Phillips wrote:
On 04/30/2015 07:28 AM, Howard Chu wrote:
You're reading into it what isn't there. Spreading over the disk isn't (just) about avoiding
fragmentation - it's about delivering consistent and predictable latency. It is undeniable that if
you start by only allocating from the fastest portion of the platter, you are going to see
performance slow down over time. If you start by spreading allocations across the entire platter,
you make the worst-case and average-case latency equal, which is exactly what a lot of folks are
looking for.
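
To put rough numbers on the point above, here is a toy model, not a measurement. It assumes a hypothetical drive whose sequential throughput falls linearly from 200 MB/s at the outer edge to 100 MB/s at the inner edge, and it uses throughput as a stand-in for latency. The names zone_speed, mean_speed_front_first and mean_speed_spread are made up for this illustration.

```python
def zone_speed(pos, outer=200.0, inner=100.0):
    """Assumed throughput (MB/s) at fractional radial position pos.

    pos = 0.0 is the outer (fastest) edge, pos = 1.0 the inner edge.
    The linear fall-off and the default speeds are illustrative only.
    """
    return outer - (outer - inner) * pos


def mean_speed_front_first(fill, steps=1000):
    """Mean throughput over stored data when only the first `fill`
    fraction of the disk (starting at the fast edge) is occupied."""
    total = sum(zone_speed(fill * (i + 0.5) / steps) for i in range(steps))
    return total / steps


def mean_speed_spread(fill, steps=1000):
    """Mean throughput when data is spread evenly over the whole disk.

    `fill` is accepted only for symmetry: in this model the result
    does not depend on how full the disk is.
    """
    total = sum(zone_speed((i + 0.5) / steps) for i in range(steps))
    return total / steps


if __name__ == "__main__":
    for fill in (0.1, 0.5, 1.0):
        print(fill,
              round(mean_speed_front_first(fill), 1),
              round(mean_speed_spread(fill), 1))
```

In this model the front-first figure starts close to the outer-edge speed and drops to the whole-disk average as the disk fills, while the spread figure sits at that average from the first write. It says nothing about seek behaviour or about which trade-off is better for a given workload.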

Another fallacy: intentionally running slower than necessary is not necessarily
the only way to deliver consistent and predictable latency.

Totally agree with you there.

Not only that, but
intentionally running slower than necessary does not necessarily guarantee
performing better than some alternate strategy later.

True, it's a question of algorithmic efficiency: does the performance decay linearly or logarithmically?

Anyway, let's not be silly. Everybody in the room who wants Git to run 4 times
slower with no guarantee of any benefit in the future, please raise your hand.

git is an important workload for us as developers, but I don't think that's the only workload that's important for us.

He flatly stated that xfs has passable performance on a
single bit of rust, and openly explained why. I see no misdirection,
only some evidence of bad blood between you two.

Raising the spectre of theoretical fragmentation issues when we have not
even begun that work is a straw man and intellectually dishonest. You have
to wonder why he does it. It is destructive to our community image and
harmful to progress.

It is a fact of life that when you change one aspect of an intimately interconnected system,
something else will change as well. You have naive/nonexistent free space management now; when you
design something workable there it is going to impact everything else you've already done. It's an
easy bet that the impact will be negative, the only question is to what degree.

You might lose that bet. For example, suppose we do strictly linear allocation
each delta, and just leave nice big gaps between the deltas for future
expansion. Clearly, we run at similar or identical speed to the current naive
strategy until we must start filling in the gaps, and at that point our layout
is not any worse than XFS, which started bad and stayed that way.

Now here is where you lose the bet: we already know that linear allocation
with wrap ends horribly, right? However, as above, we start linear, without
compromise, but because of the gaps we leave, we are able to switch to a
slower strategy, but not nearly as slow as the ugly tangle we get with
simple wrap. So impact over the lifetime of the filesystem is positive, not
negative, and what seemed to be self-evident to you turns out to be wrong.
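
The gap idea described above can be sketched roughly as follows. This is a toy model under stated assumptions, not Tux3 code: the block counts, the fixed gap size, the first-fit policy for the second phase, and the names linear_with_gaps and fill_gaps are all hypothetical.

```python
def linear_with_gaps(delta_sizes, volume_blocks, gap_blocks):
    """First phase: place each delta contiguously at a moving cursor
    and leave up to gap_blocks free after it.

    Returns (placements, free). placements holds one (start, length)
    extent per delta that fit before the end of the volume; free holds
    the (start, length) extents left unallocated, in address order.
    """
    placements, free = [], []
    cursor = 0
    for size in delta_sizes:
        if cursor + size > volume_blocks:
            break  # the linear pass is over; later deltas go into the gaps
        placements.append((cursor, size))
        cursor += size
        gap = min(gap_blocks, volume_blocks - cursor)
        if gap > 0:
            free.append((cursor, gap))
        cursor += gap
    if cursor < volume_blocks:
        # Unused tail of the volume; merge it with an adjacent last gap.
        tail = (cursor, volume_blocks - cursor)
        if free and free[-1][0] + free[-1][1] == cursor:
            start, length = free.pop()
            tail = (start, length + tail[1])
        free.append(tail)
    return placements, free


def fill_gaps(size, free):
    """Second phase: first-fit a delta into the remaining free extents,
    splitting it across several extents if no single one is big enough.

    Mutates `free` and returns the extents used for this delta.
    """
    if size > sum(length for _, length in free):
        raise ValueError("not enough free space")
    extents = []
    remaining = size
    while remaining > 0:
        start, length = free[0]
        take = min(length, remaining)
        extents.append((start, take))
        remaining -= take
        if take == length:
            free.pop(0)
        else:
            free[0] = (start + take, length - take)
    return extents
```

For example, linear_with_gaps([10, 10, 10], 100, 20) places the three deltas at blocks 0, 30 and 60, and a later fill_gaps(25, free) has to split its delta across the first two gaps. That split is where the slower second phase shows up; how it compares with simple wrap or with XFS depends on details this sketch does not model.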

In short, we would rather deliver as much performance as possible, all the
time. I really don't need to think about it very hard to know that is what I
want, and what most users want.

I will make you a bet in return: when we get to doing that part properly, the
quality of the work will be just as high as everything else we have completed
so far. Why would we suddenly get lazy?

I never said anything about getting lazy. You're working in a closed system, though. If you run today's version on a system and then run your future version on that same hardware, the future version does more CPU work and probably more I/O to handle the more complex space management. It's not quite zero-sum, but close enough when you're talking about highly optimized designs.

--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/