Skip to content

clang-format sandbox #15226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

clang-format sandbox #15226

wants to merge 4 commits into from

Conversation

nikola-matic
Copy link
Collaborator

@nikola-matic nikola-matic commented Jun 28, 2024

This is a work in progress; each commit is to introduce a new rule in .clang-format, with said rule applied to both TypeChecker.cpp and TypeChecker.h for easier review. Suggestions are welcome.

chk_coding_style step should also be adapted to run clang-format on the pushed commit locally (in the context of the CI), and then compare changes on the remote, and error out if there's a difference. This is of course to be done once clang-format is applied to the entire codebase.

@nikola-matic nikola-matic changed the title Clang format sandbox clang-format sandbox Jun 28, 2024
Comment on lines +1844 to +1847
argArrayType->location() != DataLocation::Storage
|| ((resultArrayType->isPointer()
|| (argArrayType->isByteArrayOrString() && resultArrayType->isByteArrayOrString()))
&& resultArrayType->location() == DataLocation::Storage),
Copy link
Member

@clonker clonker Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks a bit funky but perhaps that's less a problem of the formatting but more an issue of it already being a quite complex expression

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this one is a big nope. The original was complex but perfectly readable.

The biggest problem here is that the new version does not put the closing parenthesis on a new line, at the right indentation level, it just sticks it at the end. Then it does something weird with indentation, using a mix of tabs and spaces. I'm not sure how to interpret it actually, since it does not seem to follow nesting.

IMO anything that mixes spaces and tabs for indents is a big regression compared to our current style. It will only look right with a specific tab width (which kinda defeats the purpose of using tabs in the first place). Our original style is also much simpler to write, since it never forces you to fiddle with spaces for perfect alignment. Next level is always just one more tab. I'd really prefer to preserve that property.

Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked through most of it. Honestly, I expected worse. I could probably live with most of it. There are some smaller things that would be nice adjust, but for now I focused on the bigger ones.

The thing that is a deal breaker for me is the weird alignment. The formatter insists on adding spaces after tabs to align things. Our current style is much simpler in that regard and only uses tabs. Now code will look worse with non-standard tab width, and will also be harder to write.

I'm also not really happy with how it handles closing parentheses in ifs and expressions.

Comment on lines +1844 to +1847
argArrayType->location() != DataLocation::Storage
|| ((resultArrayType->isPointer()
|| (argArrayType->isByteArrayOrString() && resultArrayType->isByteArrayOrString()))
&& resultArrayType->location() == DataLocation::Storage),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me this one is a big nope. The original was complex but perfectly readable.

The biggest problem here is that the new version does not put the closing parenthesis on a new line, at the right indentation level, it just sticks it at the end. Then it does something weird with indentation, using a mix of tabs and spaces. I'm not sure how to interpret it actually, since it does not seem to follow nesting.

IMO anything that mixes spaces and tabs for indents is a big regression compared to our current style. It will only look right with a specific tab width (which kinda defeats the purpose of using tabs in the first place). Our original style is also much simpler to write, since it never forces you to fiddle with spaces for perfect alignment. Next level is always just one more tab. I'd really prefer to preserve that property.

Comment on lines -2427 to +2306
auto [errorId, description] = [&]() -> std::tuple<ErrorId, std::string> {
std::string msg = isVariadic ?
"Need at least " +
toString(parameterTypes.size()) +
" arguments for " +
std::string(isStructConstructorCall ? "struct constructor" : "function call") +
", but provided only " +
toString(arguments.size()) +
"."
:
"Wrong argument count for " +
std::string(isStructConstructorCall ? "struct constructor" : "function call") +
": " +
toString(arguments.size()) +
" arguments given but " +
std::string(isVariadic ? "need at least " : "expected ") +
toString(parameterTypes.size()) +
".";
bool const isStructConstructorCall = functionCallKind == FunctionCallKind::StructConstructorCall;

auto [errorId, description] = [&]() -> std::tuple<ErrorId, std::string>
{
std::string msg = isVariadic
? "Need at least " + toString(parameterTypes.size()) + " arguments for "
+ std::string(isStructConstructorCall ? "struct constructor" : "function call")
+ ", but provided only " + toString(arguments.size()) + "."
: "Wrong argument count for "
+ std::string(isStructConstructorCall ? "struct constructor" : "function call")
+ ": " + toString(arguments.size()) + " arguments given but "
+ std::string(isVariadic ? "need at least " : "expected ")
+ toString(parameterTypes.size()) + ".";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another case, where you need to fiddle with spaces to align things the new way. I'm not even sure what it is trying to align things with (why aren't ? and : aligned with =)?

It shifts code to the right for no good reason which will make it taller and narrower - I guess in extreme cases the code could even get squished into a narrow column with one word per line and we could do nothing about it?

Finally, it will also be pretty annoying to write by hand and introduce more noise in diffs - you'll need to adjust the shift whenever the variable or type name changes.

Comment on lines -2241 to +2125
default:
msg += " Cannot use special function.";
case FunctionType::Kind::Internal:
msg += " Provided internal function.";
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that no one on the team is really a fan of our current switch nesting style so we could change it if the formatter allows for it.

Comment on lines -2469 to +2333
msg +
" This function requires a single bytes argument."
" If all your arguments are value types, you can use"
" abi.encode(...) to properly generate it."
msg
+ " This function requires a single bytes argument."
" If all your arguments are value types, you can use"
" abi.encode(...) to properly generate it."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ambivalent about where to put the operator (end of previous line or start of the new one), but the thing I don't like here is, again, the fact that putting it on a new line seems to introduce alignment with spaces.

Maybe keeping operators at the end would prevent this?

Comment on lines -50 to +53
TypeChecker(langutil::EVMVersion _evmVersion, langutil::ErrorReporter& _errorReporter):
m_evmVersion(_evmVersion),
m_errorReporter(_errorReporter)
{}
TypeChecker(langutil::EVMVersion _evmVersion, langutil::ErrorReporter& _errorReporter)
: m_evmVersion(_evmVersion), m_errorReporter(_errorReporter)
{
}
Copy link
Member

@cameel cameel Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would this look with more members? Here, again my concern is that by moving the colon to the new line we'll have to always align things with it using spaces.

Comment on lines -510 to +480
if (
auto const* functionType = dynamic_cast<FunctionType const*>(_variable.type());
functionType && functionType->kind() == FunctionType::Kind::External
)
m_errorReporter.typeError(3366_error, _variable.location(), "Immutable variables of external function type are not yet supported.");
m_errorReporter
.typeError(6377_error, _variable.location(), "Immutable variables cannot have a non-value type.");
if (auto const* functionType = dynamic_cast<FunctionType const*>(_variable.type());
functionType && functionType->kind() == FunctionType::Kind::External)
m_errorReporter.typeError(
3366_error, _variable.location(), "Immutable variables of external function type are not yet supported."
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that it's not putting the closing ) on a new line makes it much harder to see where the condition ends and the body of the function starts. IMO it's a big regression.

Copy link
Member

@ekpyron ekpyron Aug 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually looked into this a bit last night - and wrote a patch to clang-format adding a AlwaysBlockIndent mode (https://gist.github.com/ekpyron/07d3796f27527b41784546df20200fda) that'd behave at least more like we're used to here (not that I can be sure if it's done correctly and doesn't have unintended side-effects) - they do specifically exclude control flow constructs from the BlockIndent mechanism for some reason.

But even with that patch, it still doesn't break after the if ( parentheses (for for or while it works), with ContinuationIndentWidth: 4 - with ContinuationIndentWidth: 2 it seems to work (with the patch) - which made me realize: the rationale for our coding style there is rather weak :-).

With a continuation indentation width of 4 spaces (or a 4-space tab), something like

if (veryLongStatement...)

being transformed to

if (
    veryLongStatement...
)

actually doesn't help in the first instance, since veryLongStatement still starts at column 4, so this just adds line breaks without shortening the lines :-). I.e. that's the reason why clang-format is hard to convince to do this - it'll only transform to an improved situation apparently and I haven't found where that's checked (their code is kinda horrible to read without knowing it there - I tried to add some mustBreak thing at the opening if-parenthesis in case the rest of the line will need any breaks, but so far failed getting that to work).

Still I actually find

if (auto const* functionType = dynamic_cast<FunctionType const*>(_variable.type());
        functionType && functionType->kind() == FunctionType::Kind::External
)

weirder than

if (
    auto const* functionType = dynamic_cast<FunctionType const*>(_variable.type());
    functionType && functionType->kind() == FunctionType::Kind::External
)

By the way: related:

llvm/llvm-project#79176
llvm/llvm-project#80123
llvm/llvm-project#77699
llvm/llvm-project#67738

Not sure if we should try to pursue this further.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Their tests are also hideous. If you feel like it'll be worthwhile, go ahead, but I'm honestly quite pessimistic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, even if we did end up with a PR to llvm, that'll still take time to merge (if it's even accepted), to get released, to get packaged... we could use a patched static clang-format in the interim, but yeah... I'd rather take this as evidence that it'd be much easier to try and find a style that's fine with us without this property for now - but you can't say I haven't tried to make this work :-).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then again, the first time we tried this years ago, the current BlockIndent didn't even exist - llvm/llvm-project#77699 did get merged earlier this year - it's probably not impossible to get this to work. But even if so, I'd rather say we use some other style now and change this part again, once this were to be supported rather than taking this as reason to delay this any further.

Copy link
Member

@ekpyron ekpyron Aug 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I did get it to work properly, though - wasn't that complicated in the end, the crucial part was just adding

+  if (Style.AlignAfterOpenBracket == FormatStyle::BAS_AlwaysBlockIndent &&
+      Previous.is(tok::l_paren) &&
+      (getLengthToMatchingParen(Previous, State.Stack) + State.Column - 1 >
+       getColumnLimit(State)))
+    return true;

to their "must-break" logic - i.e. just always break at parens if the line gets too long without considering anything else.

And it does get us significantly closer to the current style... BreakBinaryOperations: RespectPrecedence of unreleased clang 20 also helps a bit and I tweaked a few more options in:
https://github.com/ethereum/solidity/compare/patchedClangFormat

Might be interesting to at least see how llvm/clang reacts if we propose this as a patch - would need some more time to write some tests for it and such, though. They do understandably have rather strict guidelines against this stuff: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#adding-additional-style-options - but we might just qualify with this...

Copy link

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Jul 13, 2024
@nikola-matic nikola-matic removed the stale The issue/PR was marked as stale because it has been open for too long. label Jul 13, 2024
Copy link

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Jul 28, 2024
@nikola-matic nikola-matic removed the stale The issue/PR was marked as stale because it has been open for too long. label Jul 28, 2024
Copy link

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Aug 12, 2024
@nikola-matic nikola-matic removed the stale The issue/PR was marked as stale because it has been open for too long. label Aug 12, 2024
@ekpyron ekpyron added the 🟡 PR review label label Aug 16, 2024
Copy link

github-actions bot commented Sep 1, 2024

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Sep 1, 2024
Copy link

github-actions bot commented Sep 8, 2024

This pull request was closed due to a lack of activity for 7 days after it was stale.

@github-actions github-actions bot closed this Sep 8, 2024
@nikola-matic nikola-matic reopened this Sep 8, 2024
@github-actions github-actions bot removed closed-due-inactivity stale The issue/PR was marked as stale because it has been open for too long. labels Sep 9, 2024
Copy link

This pull request is stale because it has been open for 14 days with no activity.
It will be closed in 7 days unless the stale label is removed.

@github-actions github-actions bot added the stale The issue/PR was marked as stale because it has been open for too long. label Sep 24, 2024
Copy link

github-actions bot commented Oct 1, 2024

This pull request was closed due to a lack of activity for 7 days after it was stale.

@github-actions github-actions bot closed this Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-due-inactivity refactor stale The issue/PR was marked as stale because it has been open for too long. 🟡 PR review label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants