comparison document between erlfmt and other erlang formatters #129

Merged (9 commits) on Sep 7, 2020
15 changes: 15 additions & 0 deletions README.md
@@ -28,6 +28,21 @@ what_is(Erlang) ->

*Disclaimer: erlfmt is just a code formatter, not a solution to all life's problems.*

## Comparison with other Erlang formatters

| |erlfmt |rebar3_format |steamroller |erl_tidy |
|--- |--- |--- |--- |--- |
|File Types |.erl, .hrl, .app, .app.src, .config, .script, .escript |.erl, .hrl |.erl, .hrl, .app, .app.src, .config, .script |.erl |
|Macros |No crashes formatting OTP |Skips entire files sometimes |Skips entire files sometimes |Crashes sometimes |
|Comments |Preserves and moves to line before |Preserves but Floating |Crashes sometimes and Reorders |Crashes sometimes and Floating |
|Configurable vs Opinionated |Opinionated |Configurable |Opinionated |Configurable |
|Preserving Representation |Yes |Some |No |No |
|Line Break Hints |Yes |No |No |No |
|Opt In/Out |per file, per top level expression |No |No |No |
|Speed |OTP lib in 7s |N/A |N/A |N/A |

See the [comparison with other erlang formatters document](./doc/ErlangFormatterComparison.md) for more details.
Member

I think we should move this section a bit below, especially below install/usage instructions. What do you think?

Perhaps we need a small ToC at the beginning of the readme?

Member Author

I think we should keep the references as close as possible to the content, but a ToC is a great idea.
I'll add one in a follow-up pull request.

Member Author

Pull request for the ToC: #132


## Line length

erlfmt enforces a consistent style by parsing your code and re-printing it,
345 changes: 345 additions & 0 deletions doc/ErlangFormatterComparison.md
@@ -0,0 +1,345 @@
# Erlang Formatter Comparison

| |erlfmt |rebar3_format |steamroller |erl_tidy |
|--- |--- |--- |--- |--- |
|[File Types](#file-types) |.erl, .hrl, .app, .app.src, .config, .script, .escript |.erl, .hrl |.erl, .hrl, .app, .app.src, .config, .script |.erl |
|[Macros](#macros) |No crashes formatting OTP |Skips entire files sometimes |Skips entire files sometimes |Crashes sometimes |
|[Comments](#comments) |Preserves and moves to line before |Preserves but Floating |Crashes sometimes and Reorders |Crashes sometimes and Floating |
|[Configurable vs Opinionated](#configurable-vs-opinionated)|Opinionated |Configurable |Opinionated |Configurable |
|[Preserving Representation](#preserving-representation) |Yes |Some |No |No |
|[Line Break Hints](#line-break-hints) |Yes |No |No |No |
|[Opt In/Out](#opt-inout) |per file, per top level expression |No |No |No |
|[Speed](#speed) |OTP lib in 7s |N/A |N/A |N/A |

## File Types

`erlfmt` and `steamroller` support all the file types that are common in the Erlang ecosystem (`.erl`, `.hrl`, `.app`, `.app.src`, `.config`, `.script`), but `erlfmt` can also handle `.escript` files, which start with a `#!/usr/bin/env escript` directive. Most other formatters only support `.erl` and `.hrl` files.

## Macros

One of the biggest lacking features with current Erlang formatters is the handling of macros. For example:
Contributor

Suggested change
One of the biggest lacking features with current Erlang formatters in handling of macros. For example:
One of the biggest lacking features with current Erlang formatters is the handling of macros. For example:


* `erl_tidy`
  * Crashes on some macros (`argument`, `?NAME`, and `compute`)
  * On some, prints the whole containing function in a huge, single, ugly line (`match`)
* `steamroller`
  * Requires that all macros are defined
  * Skips the whole file if it contains a macro it can't handle
* `rebar3_format`
  * Cannot handle most macros in the example (functions `argument`, `match`, `?NAME` and `compute`)
  * Skips the whole file in these cases
  * Syntax error on double question mark
  * Loses parentheses in `IMPORTANT_PARENS`, which changes the answer of `?IMPORTANT_PARENS(2 + 1)` from 9 to 5.
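
To make the `IMPORTANT_PARENS` point concrete, here is a minimal sketch of why dropping the parentheses changes the result. Erlang macros are expanded by token substitution, so the argument `2 + 1` is pasted in as written. The module name `parens_demo` and the macro `LOST_PARENS` are hypothetical; `LOST_PARENS` only illustrates what the macro looks like once a formatter has removed its parentheses.

```erlang
-module(parens_demo).
-export([with_parens/0, without_parens/0]).

%% Macro as written in the comparison example: each use of Expr is parenthesized.
-define(IMPORTANT_PARENS(Expr), (Expr) * (Expr)).

%% Hypothetical: the same macro after the parentheses have been dropped.
-define(LOST_PARENS(Expr), Expr * Expr).

%% Expands to (2 + 1) * (2 + 1), which is 9.
with_parens() -> ?IMPORTANT_PARENS(2 + 1).

%% Expands to 2 + 1 * 2 + 1, which is 5 because * binds tighter than +.
without_parens() -> ?LOST_PARENS(2 + 1).
```

The two functions differ only in the macro body, which is why a formatter has to treat parentheses inside `-define` as significant.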

`erlfmt` forked the Erlang parser to make sure that it can handle macros. It can parse and format all of the following macros.

```erlang
-module(macros).

-include_lib("stdlib/include/assert.hrl").

-define(MACRO(), object).

argument(?MACRO()) -> ok.

match() ->
    ?assertMatch({ok, Num} when is_integer(Num), get_number()).

-define(IMPORTANT_PARENS(Expr), (Expr) * (Expr)).
-define(RECORD_NAME1, #foo).
-define(RECORD_NAME2, foo).

bar() ->
    ?IMPORTANT_PARENS(2 + 1),
    ?RECORD_NAME1{a = 1},
    #?RECORD_NAME2{a = 1}.

-define(NAME, name).

?NAME() -> ok.

-define(FALLBACK_CLAUSE(Name), Name(_) -> default_action()).

compute(X) when is_integer(X) -> ok;
?FALLBACK_CLAUSE(compute).
```

There are still edge cases that `erlfmt` cannot format, but it can at least preserve the exact string, without crashing, while formatting all the other functions, attributes and expressions in the file.

```erlang
-define(line, put(line, ?LINE),).

foo() ->
    ?line compute().
```

If you find macros where `erlfmt` breaks, please report them so we can look into it and fix the bug.

## Comments

Comments are another place where some Erlang formatters have problems. `erlfmt` forked the Erlang parser to make sure that it can preserve the positioning of comments.
`rebar3_format` preserves the content of all the comments we tested, including unicode in comments, but given the following input:

```erlang
-type foo() ::
    %% comment 1
    fun(
        (
            %% comment 2
            ...
        ) ->
            %% comment 3
            bar()
    ).

```

`rebar3_format` sometimes moves comments further out of their original scope to the point where they are floating:

```erlang
-type foo() :: fun((...) -> bar()). %% comment 1

%% comment 2

%% comment 3
```

`steamroller` crashes on the following:

```erlang
'try'() ->
    try Expr
    after
        'after'
        %% comment
    end.
```

Given the following input:

```erlang
comprehension() ->
    [
        [
            %% comment 1
            X
        ]
        || %% comment 2
           X <-
               %% comment 3
               [
                   %% comment 4
               ]
           %% comment 5
    ].
```

`steamroller` moves comments out of order:

```erlang
comprehension() ->
  [
    %% comment 2
    %% comment 3
    [
      %% comment 1
      X
    ] || X <- [
      %% comment 4
    ]
    %% comment 5
  ].
```

`erl_tidy` crashes on unicode and, when it crashes, removes the file. It also produces floating comments, as in the `rebar3_format` example above.

`erlfmt` moves trailing comments to the line above. Given the following input:

```erlang
%% Constants
-define(VERSION_CHECK_INTERVAL_MILLIS_DEFAULT, 10000). % Minimum interval between health checks
-define(MAX_WRITE_FAILURES, 3).
```

The trailing comment is moved to the line above, and the preceding comment is separated from it by an empty line to make space.

```erlang formatted comments
%% Constants

% Minimum interval between health checks
-define(VERSION_CHECK_INTERVAL_MILLIS_DEFAULT, 10000).
-define(MAX_WRITE_FAILURES, 3).
```

`erlfmt` does not reorder comments and does not crash.

## Configurable vs Opinionated

Other Erlang formatters have several configurable options, which can adjust the format.

`erlfmt` is very opinionated and has only a single option that changes the format: `print-width`, the maximum line length, which defaults to 100.
See the reasoning behind this decision [here](https://github.com/WhatsApp/erlfmt/blob/master/doc/FormattingDecisionDefaultWidth.md).

## Preserving Representation

During parsing with the default Erlang parser, some information is lost, but since `erlfmt` forked the parser, it can preserve the exact representation of strings, atoms and integers.
For example:

```erlang
1_000_000,

"\x61\x62\x63",

"\x61\x62\x63" "\x41\x42\x43",

[1 | [2 | [3 | []]]],

[1 | [2 | [3 | improper_list]]],

{undefined, 'undefined'}.
```

`rebar3_format` simplifies lists, atoms and strings.

```erlang
1_000_000,

"\x61\x62\x63",

"abcABC",

[1, 2, 3],

[1, 2, 3 | improper_list],

{undefined, undefined}.
```

`steamroller` preserves lists, but simplifies strings, atoms and numbers.

```erlang
1000000,
"abc",
"abc" "ABC",
[1 | [2 | [3 | []]]],
[1 | [2 | [3 | improper_list]]],
{undefined, undefined}.
```

`erl_tidy` simplifies all the examples.

```erlang
1000000,

"abc",

"abcABC",

[1, 2, 3],

[1, 2, 3 | improper_list],

{undefined, undefined}.
```
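
All of the spellings in the input example denote the same values at runtime, which is why a formatter that works from the default parser output cannot tell them apart. The following is a minimal sketch that checks this. The module name `repr_demo` and the function `same_values/0` are hypothetical, and the underscore-separated integer assumes OTP 23 or later.

```erlang
-module(repr_demo).
-export([same_values/0]).

%% Each match succeeds only if both spellings denote the same term.
same_values() ->
    %% Underscores in integer literals are purely visual (OTP 23 or later).
    true = 1_000_000 =:= 1000000,
    %% Hex escapes are just another way to write the same characters.
    true = "\x61\x62\x63" =:= "abc",
    %% Adjacent string literals are concatenated at compile time.
    true = "\x61\x62\x63" "\x41\x42\x43" =:= "abcABC",
    %% Cons cells and list syntax build the same lists.
    true = [1 | [2 | [3 | []]]] =:= [1, 2, 3],
    true = [1 | [2 | [3 | improper_list]]] =:= [1, 2, 3 | improper_list],
    %% Quoting an atom that does not need quotes changes nothing.
    true = {undefined, 'undefined'} =:= {undefined, undefined},
    ok.
```

Because the values are identical, preserving the original spelling is only possible if the parser keeps the source text, which is the reason given above for forking the parser.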

## Line Break Hints

`erlfmt` tries to fit as much as it can on a single line, but it respects and preserves user-introduced line break hints and empty lines in certain places.
For example, if a list contains a line break, the whole list is broken up.

```erlang formatted linehints
-export([
    %% public functions
    foo/1, foo/2,
    bar/1,

    %% testing helpers
    baz/5
]).

x() ->
    Foo = [
        short,
        list
    ].
```

`rebar3_format` loses all newlines and moves the comments out of the list, since the list can fit on a single line, completely losing the information about grouping.

```erlang
-export([foo/1, foo/2, bar/1, baz/5]). %% public functions

%% testing helpers

x() ->
    Foo = [short, list].
```

`steamroller` preserves the newlines in the export grouping, since there were comments that forced newlines to exist inside the list, but does not preserve newlines in the short list.

```erlang
-export(
  [
    %% public functions
    foo/1,
    foo/2,
    bar/1,
    %% testing helpers
    baz/5
  ]
).

x() -> Foo = [short, list].
```

`erl_tidy` keeps some newlines inside the export statement, we assume because it originally contained comments.

```erlang
-export([bar/1,
         baz/5,
         foo/1,
         foo/2]). %% public functions

%% testing helpers

x() -> Foo = [short, list].
```

If we reformat with `erl_tidy` again, these newlines are also removed because, as expected, the comments are no longer inside the export statement.

```erlang
-export([bar/1, baz/5, foo/1, foo/2]).

%% public functions

%% testing helpers

x() -> Foo = [short, list].
```

## Opt In/Out

`erlfmt` is the only Erlang formatter, as far as we know, that allows you to opt-in per file and opt-out per top-level expression.
Contributor

rebar3_format allows you to opt out from formatting, by adding -format ignore. in files, or by adding an ignore configuration option either in rebar.config or when running rebar3 format in the command line with a blob of the files you want not to format.
That applies regardless of the formatter you choose. If you happen to use erlfmt within rebar3_format, you can also use @format, of course.

Member Author

Some clarifications regarding rebar3_format.

At some point I think it would also be worth noting that wherever you mention rebar3_format in your text, you're actually referring to the default_formatter within rebar3_format. You can still use any other formatter with rebar3_format if you want to (including, of course, erlfmt).

Very good point. I have added a clarifying sentence to the top, right below the table.


Adding a `%% @format` comment to the top of a file and running `erlfmt` with the `--require-pragma` option results in only the files that have this comment being formatted.

Adding a `%% erlfmt-ignore` comment above a top-level expression skips over this single expression and continues to format the rest of the file.
[Here](https://github.com/WhatsApp/erlfmt/blob/master/doc/FormattingDecisionIgnore.md) you can see the reasoning behind including this option.
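
As an illustration of the two comments described above, here is a minimal sketch of a file that opts in with the pragma and opts a single function out. The module name `opt_demo` and its functions are hypothetical; only the `%% @format` and `%% erlfmt-ignore` comments come from the text above.

```erlang
%% @format
-module(opt_demo).
-export([identity_matrix/0, ok/0]).

%% The next function keeps its hand-aligned layout because of the comment
%% directly above it.
%% erlfmt-ignore
identity_matrix() ->
    [1, 0, 0,
     0, 1, 0,
     0, 0, 1].

%% This function has no ignore comment, so it is formatted as usual.
ok() -> ok.
```

With `--require-pragma`, a file without the first line would be left alone entirely, while this file is formatted except for `identity_matrix/0`.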

## Speed

It was really tough to come up with a way to compare speed, as all the other formatters crash on a large enough test bed.
We are sorry to say that this comparison is not fair.
We ran `erlfmt` on the OTP library, using a MacBook Pro (15-inch, 2019) with a 2.4 GHz 8-Core Intel Core i9 CPU.

```sh
$ time erlfmt -w otp/lib/*/{src,include}/*.{e,h}rl
________________________________________________________
Executed in 6.86 secs fish external
usr time 59.20 secs 1.17 millis 59.20 secs
sys time 7.85 secs 0.93 millis 7.85 secs

$ cat otp/lib/*/{src,include}/*.{e,h}rl | wc -l
1361202 (1.36M)
```
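
To put the two numbers above in relation, here is a small sketch that turns them into a throughput figure. The module name `speed_demo` and its functions are hypothetical; the inputs are the line count and wall-clock time from the run above.

```erlang
-module(speed_demo).
-export([lines_per_second/2, measured/0]).

%% Throughput as lines formatted per second of wall-clock time.
lines_per_second(Lines, Seconds) when Seconds > 0 ->
    Lines / Seconds.

%% Figures from the run above: 1361202 lines in 6.86 seconds,
%% which works out to a little over 198000 lines per second.
measured() ->
    lines_per_second(1361202, 6.86).
```

This is wall-clock throughput on one specific machine, so it should be read as a rough indication and not as a benchmark.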