Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format-related string fixes and refactorings #4704

Merged
merged 7 commits into from
Jun 8, 2024
Merged

Conversation

asl
Copy link
Contributor

@asl asl commented Jun 3, 2024

  • Switch to abseil format routines in places where printf-format strings were used
  • Format strings are now checked statically, this uncovered several issues with them
  • Util::printf_format is often misused, error prone, redundant and therefore fully eliminated: one needs to use absl::StrFormat and then do explicit cstring conversion if necessary
  • Switched SourceCodeBuilder to use cord data structure from abseil
  • Some cstring-related refactoring in SourceCode in order to reduce cstrings for transient stuff

Fixes #4689

@asl asl added core Topics concerning the core segments of the compiler (frontend, midend, parser) breaking-change This change may break assumptions of compiler back ends. labels Jun 3, 2024
@asl asl requested review from vlstill, ChrisDodd and fruffy June 3, 2024 19:03
@asl
Copy link
Contributor Author

asl commented Jun 3, 2024

I can probably factor our removal of Utils::printf_format into a separate PR, so it will be a dedicated single entry in the changelog?

@fruffy
Copy link
Collaborator

fruffy commented Jun 3, 2024

How easy is it to switch from absl::StrFormat to std::format btw? Something worth keeping in mind is that with C++20 we may use more standard library functions instead.

@asl
Copy link
Contributor Author

asl commented Jun 3, 2024

How easy is it to switch from absl::StrFormat to std::format btw? Something worth keeping in mind is that with C++20 we may use more standard library functions instead.

Format strings are mess :( abseil uses POSIX (printf-style) format strings with few extensions (so we cannot easily switch from boost::format to it). std::format format strings are python'ish. So, they are all incompatible... Essentially, std::format with plain placeholders is much closer to abseil:StrSubsitute, however, latter does not support any formatting at all, just stringification.

But so far:

  • We do use printf-like formats inside parser (due to bison / yacc wanting them) and for SourceCodeBuilder
  • boost::format for error messages

@fruffy
Copy link
Collaborator

fruffy commented Jun 3, 2024

How easy is it to switch from absl::StrFormat to std::format btw? Something worth keeping in mind is that with C++20 we may use more standard library functions instead.

Format strings are mess :( abseil uses POSIX (printf-style) format strings with few extensions (so we cannot easily switch from boost::format to it). std::format format strings are python'ish. So, they are all incompatible... Essentially, std::format with plain placeholders is much closer to abseil:StrSubsitute, however, latter does not support any formatting at all, just stringification.

But so far:

* We do use printf-like formats inside parser (due to bison / yacc wanting them) and for `SourceCodeBuilder`

* boost::format for error messages

But switching from absl::StrFormat to std::format would be easier? As long as we are assured at that we could remove Util::printf_format. Otherwise, we might preserve it to preserve some API compatibility between the different formatting back ends.

@asl
Copy link
Contributor Author

asl commented Jun 3, 2024

But switching from absl::StrFormat to std::format would be easier?

I am afraid we'd need to keep printf-formatting for quite some time. Putting backends aside that use this, printf-formatting is used in parser that is autogenerated by Bison and is not ready for anything else :)

@fruffy
Copy link
Collaborator

fruffy commented Jun 3, 2024

I am afraid we'd need to keep printf-formatting for quite some time. Putting backends aside that use this, printf-formatting is used in parser that is autogenerated by Bison and is not ready for anything else :)

Oh I am not concerned about printf in particular. More about switching from custom formatters (boost, absl) to the stdlib.

@asl
Copy link
Contributor Author

asl commented Jun 4, 2024

Oh I am not concerned about printf in particular. More about switching from custom formatters (boost, absl) to the stdlib.

I think it is easier to switch from abseil as it is "semantical" formatting by default. E.g. one specifies "%d" and we'd just need to replace it with {:d} for std::format with some regexp or so. And everything more complicated could be resolved manually. Boost is positional by default, so one cannot just replace stuff with placeholders. Or the latter must be more complicated to be positional as well.

@asl asl force-pushed the str-format-improve branch 2 times, most recently from 25ec93a to 778b0c6 Compare June 6, 2024 23:03
Copy link
Collaborator

@fruffy fruffy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, we should replace absl::StrFormat with std::format once we have it available.

@@ -218,6 +218,8 @@ class ErrorReporter {
Util::SourcePosition position = sources->getCurrentPosition();
position--;

// Unfortunately, we cannot go with statically checked format string
// here as it would require some changes to yyerror
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, well, there is some logic there, but essentially the format string is chosen programmatically. And we need it constexpr here.

@asl asl force-pushed the str-format-improve branch from 778b0c6 to c1a966d Compare June 7, 2024 21:56
@asl asl added this pull request to the merge queue Jun 7, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 8, 2024
@asl asl enabled auto-merge June 8, 2024 00:30
@asl asl added this pull request to the merge queue Jun 8, 2024
Merged via the queue into p4lang:main with commit 7124b4e Jun 8, 2024
16 of 17 checks passed
@asl asl deleted the str-format-improve branch June 8, 2024 02:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change This change may break assumptions of compiler back ends. core Topics concerning the core segments of the compiler (frontend, midend, parser)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix invalid format strings
2 participants