Re: [PATCH v4 00/15] Fast kernel headers: split linux/mm.h

From: Max Kellermann
Date: Tue Mar 12 2024 - 14:11:45 EST


On Tue, Mar 12, 2024 at 5:33 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
> Just curious: why? Usually build time, do you have some numbers?

(This has been discussed so often and I thought having smaller header
dependencies was already established as general consensus among kernel
devs, and everybody agreed that mm.h and kernel.h are a big mess that
needs cleanup.)

Why: correctness, less namespace pollution, and (least important)
reduced build times. The build-time gains from this patch set are
very small, though; each individual optimization gains very little.

Some time ago, I posted a much larger patch set with a few numbers:
https://lore.kernel.org/lkml/20240131145008.1345531-1-max.kellermann@xxxxxxxxx/
- the speedup was measurable, but not amazing, because even after that
patch set, everybody still includes everything, and much more cleanup
is needed to make a bigger difference. Once the big knots like
kernel.h and mm.h are broken up, we will have better results. And as I
said: build times are nice, but the lesser advantage.

Anyway, that first patch set was so large that nobody was able to
review it. So this patch set contains just the parts that deal
with mm.h; later, when this patch set is merged, I can continue with
other headers. (I already have a branch that splits kernel.h and I'll
submit it eventually, after the mm.h cleanup.)

> I'm not against splitting out stuff. But one function per header is a
> bit excessive IMHO.

One function per header is certainly not my goal and I agree it's
excessive; but folio_next() in its own header made sense, just in this
special case, because it allowed removing the mm.h dependency from
bio.h. Removing this dependency was the goal, and folio_next.h was the
solution for this particular problem.
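
To illustrate the idea, here is a minimal, self-contained sketch. It is
not the code from the patch: the struct folio layout is a toy
assumption (one array slot per page, nr_pages stored in the head slot),
and toy_count_folios() is a hypothetical stand-in for a bio.h-style
user that needs folio_next() and nothing else from mm.h.

```c
/*
 * Simplified stand-in for a tiny header such as folio_next.h.
 * The struct layout below is a toy assumption for illustration,
 * not the kernel's definition of struct folio.
 */
#ifndef TOY_FOLIO_NEXT_H
#define TOY_FOLIO_NEXT_H

#include <stddef.h>

/*
 * Toy model: each array slot stands for one page; a folio starts at
 * its head slot and spans nr_pages consecutive slots.
 */
struct folio {
	size_t nr_pages;
};

/* Step past all pages of this folio to the head of the following one. */
static inline struct folio *folio_next(struct folio *folio)
{
	return folio + folio->nr_pages;
}

#endif /* TOY_FOLIO_NEXT_H */

/*
 * Hypothetical consumer (think: an iterator helper in bio.h).  It only
 * needs folio_next(), so it can include the small header above instead
 * of pulling in everything a large header like mm.h drags along.
 */
static inline size_t toy_count_folios(struct folio *first, struct folio *end)
{
	size_t n = 0;
	struct folio *f;

	for (f = first; f < end; f = folio_next(f))
		n++;
	return n;
}
```

The point of the sketch is only the dependency shape: the consumer
sees one small declaration instead of the whole of a large header.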

> Ideally, we'd have some MM guideline that we'll be
> able to follow in the future. So we don't need "personal taste".

Agree. But lacking such a guideline, all I can do is make suggestions
and submit patches for review, trying to follow what seemed to be
consensus in previous cleanups and what had previously been merged.

> For example, if I were to write a folio_prev(), should I put it in
> include/linux/mm/folio_prev.h ? Likely we'd put it into folio_next.h,
> but then the name doesn't make any sense.

True. But since no folio_prev() function exists, the only name that
made sense for this header was folio_next.h. If folio_prev() gets
added, I'd say put it in that header and rename the header to, let's
say, "folio_iterator.h". But right now, with just this one function (and
nothing else like it that could be added here), I decided to suggest
naming it "folio_next.h". If you think "folio_iterator.h" or something
else is better, I'll rename and resubmit; names don't matter so much
to me; the general idea of cleaning up the headers is what's
important.

> The point I am trying to make: if there was a single folio_ops.h, it
> would be clearer where it would go.

Maybe, but IMO this wouldn't be a good place to add folio_next(),
because folio_next() doesn't "operate" on a folio, but on a folio
pointer (or "iterator"). Again, names don't matter to me -
ideas/concepts matter. I'm only explaining why I decided to submit my
patches that way, but I'll change them any way you people want.

Max