Re: [PATCH v3] Add .editorconfig file for basic formatting

From: Íñigo Huguet
Date: Mon May 08 2023 - 05:00:03 EST


Hi, I finally had a chance to look at this.

On Fri, Apr 14, 2023 at 3:36 PM Miguel Ojeda
<miguel.ojeda.sandonis@xxxxxxxxx> wrote:
>
> On Fri, Apr 14, 2023 at 12:11 PM Íñigo Huguet <ihuguet@xxxxxxxxxx> wrote:
> >
> > EditorConfig is a specification to define the most basic code formatting
> > stuff, and it's supported by many editors and IDEs, either directly or
> > via plugins, including VSCode/VSCodium, Vim, emacs and more.
>
> Thanks -- v3 looks much safer!
>
> To clarify the risks (it would be nice to detail these in the commit message):
>
> - Did you sample files manually or did you automate the search (e.g.
> grepping for spaces/tabs, for LF, etc.) to verify the current rules
> match the files in the kernel tree?

Originally I sampled manually, but I have since written a script to
collect more data. It's not 100% reliable, but it is good enough to get
an idea. It reads the leading whitespace of each line and, if >80% of
the lines use one kind of indentation, it considers that to be the one
used in that file. The results, filtered to show only the relevant
ones, are pasted at the end.
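
The heuristic described above can be sketched roughly as follows. This
is a minimal Python sketch, not the actual script: the function name
classify_indentation, the candidate widths (taken from the categories in
the results at the end), and the way a space width is picked (the
largest width that divides the leading spaces of more than 80% of the
indented lines) are all assumptions made for illustration.

```python
# Candidate space widths, largest first (assumed from the result categories).
WIDTHS = (8, 6, 4, 3, 2)


def classify_indentation(text, threshold=0.8):
    """Return 'tabs', 'N-space' or 'unknown' for the given file contents."""
    tab_lines = 0
    space_counts = []
    for line in text.splitlines():
        body = line.lstrip(" \t")
        lead = line[: len(line) - len(body)]
        if not body or not lead:
            # Blank and non-indented lines carry no indentation information.
            continue
        if lead.startswith("\t"):
            tab_lines += 1
        else:
            # Count only the run of spaces at the very start of the line.
            space_counts.append(len(lead) - len(lead.lstrip(" ")))

    total = tab_lines + len(space_counts)
    if total == 0:
        return "unknown"
    if tab_lines / total > threshold:
        return "tabs"
    for width in WIDTHS:
        matching = sum(1 for n in space_counts if n % width == 0)
        if matching / total > threshold:
            return f"{width}-space"
    return "unknown"
```

A file with only non-indented lines (a simple Makefile, for instance)
or with a mix of tab- and space-indented lines falls through to
"unknown", which matches the caveats listed with the results.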

These are some personal conclusions from the script's results:
- .py: although the official and most widely used style in the
community is 4-space indentation, in the Linux tree many files use tabs.
What should we do here? 4-space is the clear standard for Python...
- .rb: only one file in the whole tree
- .pm: only 3 files in the whole tree
- Files with many different indentations, better not to specify them:
rst, cocci, tc, xsl, manual pages
- Files that we should specify, tab indented: awk, dts, dtsi, dtso, s, S
- Files that we might specify, with preference for tab indenting but
not 100% clear: sh, bash, pl
- Files in tools/perf/scripts/*/bin/*: there is no clear formatting
for any file type except .py files, which are tab-indented. To get
these results I ran my script from the tools/perf/scripts directory.

> - Would it be possible to go further than grepping and apply the
> rules (e.g. trigger a "save") through the entire tree to see whether
> there would be spurious changes?
>
> If that comes out clean (or mostly clean), then we would be fairly
> confident this will not surprise developers (and it would be nice to
> have the script around for future updates of the `.editorconfig`).
>
> Perhaps EditorConfig provides a script to check this already?
> Otherwise perhaps it can be done with editorconfig-core-c or
> editorconfig-vim or directly scripting on a couple editors?

It seems that the EditorConfig libraries only parse the config file;
applying the formatting is left entirely to the linter or to the
editor's specific plugin.

I found a CLI tool called editorconfig-checker that reports files that
don't respect the formatting from .editorconfig, but it gives tons of
false positives.

Scripting an editor would be the way to go, but I have no experience doing that.
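
For the "trigger a save" idea, one rough approximation that avoids
scripting a real editor is to apply a couple of rules to the file
contents and see whether the text would change. The sketch below is only
an illustration under stated assumptions: simulate_save and would_change
are hypothetical helpers, only the trim_trailing_whitespace and
insert_final_newline EditorConfig properties are modeled, and a real
editor plugin may well behave differently.

```python
def simulate_save(text, trim_trailing_whitespace=True, insert_final_newline=True):
    """Return the text roughly as an editor applying these two rules might write it."""
    lines = text.split("\n")
    if trim_trailing_whitespace:
        # Strip spaces and tabs only, so existing line endings are left alone.
        lines = [line.rstrip(" \t") for line in lines]
    result = "\n".join(lines)
    if insert_final_newline and result and not result.endswith("\n"):
        result += "\n"
    return result


def would_change(text, **rules):
    """True if a simulated save would produce a spurious diff for this file."""
    return simulate_save(text, **rules) != text
```

Running would_change over every file matched by a given section of the
config and counting the hits would give a first estimate of how many
spurious changes to expect, without claiming to reproduce any particular
editor.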

> - Are we sure the rules match the output of automated formatters we
> are using? (e.g. for Rust we enforce `rustfmt`, and thus we need to
> ensure the editor does not "fight" the formatter; otherwise developers
> may need to run the formatter more).

I'm only aware of the Clang and Rust formatter configs in the Linux
tree, and I think this complies with them. Do you know of any others?


SCRIPT FILTERED RESULTS:
Note: files might have "unknown" indentation for different reasons:
- there are different styles in a file
- the file contains only non-indented lines like simple Makefiles
- the script got confused by leading alignment whitespace that
shouldn't be considered indentation
- others?

(1594 ignored files with unknown extension/shebang)

.0:
tabs: 0
2-space: 3
3-space: 1
4-space: 1
6-space: 0
8-space: 0
unknown: 2

.1:
tabs: 0
2-space: 0
3-space: 1
4-space: 0
6-space: 0
8-space: 0
unknown: 12

.2:
tabs: 0
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 1

.8:
tabs: 1
2-space: 0
3-space: 1
4-space: 2
6-space: 0
8-space: 1
unknown: 5

.S:
tabs: 1253
2-space: 8
3-space: 7
4-space: 2
6-space: 0
8-space: 13
unknown: 47

.awk:
tabs: 10
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 1

.awk (shebang):
tabs: 2
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 0

.awk -f (shebang):
tabs: 1
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 0

.bash (shebang):
tabs: 12
2-space: 0
3-space: 0
4-space: 5
6-space: 0
8-space: 1
unknown: 38

.bconf:
tabs: 7
2-space: 3
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 19

.c:
tabs: 35746
2-space: 10
3-space: 2
4-space: 26
6-space: 0
8-space: 1
unknown: 507

.cocci:
tabs: 7
2-space: 23
3-space: 3
4-space: 4
6-space: 0
8-space: 0
unknown: 35

.dts:
tabs: 2656
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 24

.dtsi:
tabs: 1977
2-space: 1
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 20

.dtso:
tabs: 56
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 1

.h:
tabs: 17546
2-space: 93
3-space: 65
4-space: 220
6-space: 2
8-space: 40
unknown: 6653

.json:
tabs: 0
2-space: 1
3-space: 0
4-space: 5
6-space: 0
8-space: 0
unknown: 0

.pl:
tabs: 28
2-space: 1
3-space: 0
4-space: 2
6-space: 0
8-space: 0
unknown: 18

.py:
tabs: 43
2-space: 14
3-space: 0
4-space: 87
6-space: 0
8-space: 2
unknown: 6

.py (shebang):
tabs: 5
2-space: 0
3-space: 0
4-space: 5
6-space: 0
8-space: 1
unknown: 0

.rb:
tabs: 0
2-space: 1
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 0

.rs:
tabs: 0
2-space: 0
3-space: 0
4-space: 35
6-space: 0
8-space: 0
unknown: 3

.rst:
tabs: 381
2-space: 581
3-space: 392
4-space: 245
6-space: 17
8-space: 35
unknown: 1670

.s:
tabs: 4
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 0

.sh:
tabs: 611
2-space: 24
3-space: 0
4-space: 31
6-space: 1
8-space: 1
unknown: 103

.sh (shebang):
tabs: 26
2-space: 4
3-space: 0
4-space: 2
6-space: 0
8-space: 0
unknown: 14

.sh -x (shebang):
tabs: 1
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 0

.tc:
tabs: 8
2-space: 20
3-space: 0
4-space: 40
6-space: 0
8-space: 0
unknown: 34

.txt:
tabs: 1045
2-space: 103
3-space: 8
4-space: 57
6-space: 2
8-space: 14
unknown: 686

.xsl:
tabs: 2
2-space: 0
3-space: 0
4-space: 0
6-space: 0
8-space: 0
unknown: 8

.yaml:
tabs: 0
2-space: 3279
3-space: 2
4-space: 12
6-space: 2
8-space: 3
unknown: 57

Kconfig:
tabs: 1581
2-space: 1
3-space: 0
4-space: 0
6-space: 0
8-space: 1
unknown: 56

Makefile:
tabs: 823
2-space: 17
3-space: 6
4-space: 3
6-space: 2
8-space: 5
unknown: 2005


>
> Cc'ing Andrew since he applied originally the `.clang-format`.
>
> Cheers,
> Miguel
>


--
Íñigo Huguet