Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MC,ELF] Emit warning if a string constant contains newline char #98060

Merged
merged 17 commits into from
Jul 16, 2024

Conversation

chestnykh
Copy link
Contributor

GAS emits warning about newline in the string constant so make the same behaviour.

@llvmbot llvmbot added the mc Machine (object) code label Jul 8, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 8, 2024

@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-webassembly
@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-mc

Author: Dmitriy Chestnykh (chestnykh)

Changes

GAS emits warning about newline in the string constant so make the same behaviour.


Full diff: https://github.com/llvm/llvm-project/pull/98060.diff

2 Files Affected:

  • (modified) llvm/lib/MC/MCParser/AsmLexer.cpp (+6-1)
  • (added) llvm/test/MC/ELF/warn-newline-in-string-constant.s (+6)
diff --git a/llvm/lib/MC/MCParser/AsmLexer.cpp b/llvm/lib/MC/MCParser/AsmLexer.cpp
index e08404ae0ad92f..eda7aedd9f5d3d 100644
--- a/llvm/lib/MC/MCParser/AsmLexer.cpp
+++ b/llvm/lib/MC/MCParser/AsmLexer.cpp
@@ -18,6 +18,7 @@
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCParser/MCAsmLexer.h"
+#include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/SMLoc.h"
 #include "llvm/Support/SaveAndRestore.h"
@@ -646,13 +647,17 @@ AsmToken AsmLexer::LexQuote() {
     return AsmToken(AsmToken::String, StringRef(TokStart, CurPtr - TokStart));
   }
 
-  // TODO: does gas allow multiline string constants?
+  // gas doesn't allow multiline string constants
+  // and emits a warning if a string constant contains newline character.
   while (CurChar != '"') {
     if (CurChar == '\\') {
       // Allow \", etc.
       CurChar = getNextChar();
     }
 
+    if (CurChar == '\n')
+      outs() << "Warning: unterminated string; newline inserted\n";
+
     if (CurChar == EOF)
       return ReturnError(TokStart, "unterminated string constant");
 
diff --git a/llvm/test/MC/ELF/warn-newline-in-string-constant.s b/llvm/test/MC/ELF/warn-newline-in-string-constant.s
new file mode 100644
index 00000000000000..e126db30ee47a4
--- /dev/null
+++ b/llvm/test/MC/ELF/warn-newline-in-string-constant.s
@@ -0,0 +1,6 @@
+// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o %t | FileCheck %s
+
+.string "abcdefg
+12345678"
+
+// CHECK: Warning: unterminated string; newline inserted

Copy link

github-actions bot commented Jul 8, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@chestnykh chestnykh force-pushed the warn-newline-in-string-constant branch 2 times, most recently from 58dfd35 to af41795 Compare July 8, 2024 18:11
@chestnykh chestnykh changed the title Emit warning if a string constant contains newline char. [MC,ELF] Emit warning if a string constant contains newline char. Jul 8, 2024
Copy link
Contributor

@s-barannikov s-barannikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Printing text to stdout is not the correct way to report warnings, it should be done via source manager (and it is likely there are helper functions).
  2. AsmLexer is a lexer, not parser. It does not know the context the lexed string is used. In some contexts the warning may not be desired. The warning, if needed, should be emitted by a parser that knows full context, e.g. in AsmParser::parseDirectiveAscii.

@chestnykh
Copy link
Contributor Author

  1. Printing text to stdout is not the correct way to report warnings, it should be done via source manager (and it is likely there are helper functions).

    1. AsmLexer is a lexer, not parser. It does not know the context the lexed string is used. In some contexts the warning may not be desired. The warning, if needed, should be emitted by a parser that knows full context, e.g. in AsmParser::parseDirectiveAscii.

I reworked

llvm/lib/MC/MCParser/AsmLexer.cpp Outdated Show resolved Hide resolved
llvm/test/MC/ELF/warn-newline-in-string-constant.s Outdated Show resolved Hide resolved
llvm/lib/MC/MCParser/AsmParser.cpp Outdated Show resolved Hide resolved
@s-barannikov s-barannikov requested a review from MaskRay July 9, 2024 10:38
llvm/lib/MC/MCParser/AsmParser.cpp Outdated Show resolved Hide resolved
llvm/lib/MC/MCParser/DarwinAsmParser.cpp Outdated Show resolved Hide resolved
llvm/test/MC/ELF/warn-newline-in-string-constant.s Outdated Show resolved Hide resolved
llvm/lib/MC/MCParser/AsmParser.cpp Outdated Show resolved Hide resolved
llvm/test/MC/ELF/warn-newline-in-string-constant.s Outdated Show resolved Hide resolved
@s-barannikov s-barannikov self-requested a review July 9, 2024 14:21
@@ -3125,6 +3130,7 @@ bool AsmParser::parseDirectiveAscii(StringRef IDVal, bool ZeroTerminated) {
do {
if (parseEscapedString(Data))
return true;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert unneeded change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


"

// CHECK-WARN: warn-newline-in-escaped-string.s:3:21: warning: unterminated string; newline inserted
Copy link
Member

@MaskRay MaskRay Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of string/string/string check/check/check, consider check/string check/string check/string that new tests use more.

You can use the following to test relative line numbers, so that inserting lines in the future would not require test update. We often omit the filename. When the filename is useful, use FileCheck %s -DFILE=%s

[[#@LINE+1]]:21: warning: 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, reworked

// CHECK-WARN: [[#@LINE-8]]:28: warning: unterminated string; newline inserted
// CHECK-WARN: .asciz "another test string
// CHECK-WARN: [[#@LINE-9]]:1: warning: unterminated string; newline inserted
// CHECK-WARN: ^
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can delete ^ checks. With column numbers on the above line, ^ check isn't useful.

Use -NEXT whenever applicable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// CHECK-WARN: [[#@LINE-2]]:20: warning: unterminated string; newline inserted

.ascii "test\nstring\xFF\n\n\xFF"
// CHECK-WARN-NOT: [[#@LINE-1]]{{.*}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use FileCheck --implicit-check-not=warning: and remove these -NOT: patterns

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@chestnykh chestnykh requested a review from MaskRay July 15, 2024 04:16
@@ -0,0 +1,50 @@
// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s 2>&1 -o /dev/null | FileCheck -DFILE=%s --strict-whitespace %s --implicit-check-not=valid1_string --implicit-check-not=valid2_string --implicit-check-not=valid3_string --check-prefix=CHECK-WARN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use x86_64 as the triple to emphasize that the behavior is for all ELF (actually other object file formats as well, but we don't bother testing), not just Linux.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I don't have write access so please merge this PR when it is ready

@s-barannikov s-barannikov self-requested a review July 15, 2024 07:26
Copy link
Contributor

@s-barannikov s-barannikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

GAS emits warning about newline in the string constant
so make the same behaviour.
Emit warning about newline characters in strings
for `.string`, '.ascii' and '.asciz' directives
like GAS.
Move the point of check into `parseEscapedString`.
Check if the `Warning()` returned true to stop compilation.
- Remove `WarnNewline` arg
- Emit warning to `\n` symbol unconditionally
- Revert unneeded change
- Refactor test to use string/check style and
[[#@line]] instead of hardcoded line numbers
- Use `CHECK-NEXT`
- Use `implicit-check-not` instead of `CHECK-WARN-NOT`
- Add one `--implicit-check-not` with the appropriate pattern
- Refactor checks
- Remove redundant `--check-prefix`
- Remove `-DFILE=`
@s-barannikov s-barannikov force-pushed the warn-newline-in-string-constant branch from c584425 to 2f517c3 Compare July 15, 2024 19:33
@@ -0,0 +1,55 @@
// RUN: llvm-mc -filetype=obj -triple x86_64 %s 2>&1 -o /dev/null \
// RUN: | FileCheck %s --implicit-check-not="{{[0-9]+:[0-9]+: warning: unterminated string}}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually just test --implicit-check-not=warning:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@MaskRay
Copy link
Member

MaskRay commented Jul 15, 2024

subject

[MC,ELF] Emit warning if a string constant contains newline char.

The convention is to omit the trailing period in the subject

@s-barannikov
Copy link
Contributor

I've rebased the branch to check if that would resolve CI failure (it looked unrelated).

@chestnykh chestnykh changed the title [MC,ELF] Emit warning if a string constant contains newline char. [MC,ELF] Emit warning if a string constant contains newline char Jul 16, 2024
@@ -0,0 +1,55 @@
// RUN: llvm-mc -filetype=obj -triple x86_64 %s 2>&1 -o /dev/null \
// RUN: | FileCheck %s --implicit-check-not=warning:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

continuation lines should be indented by 2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@s-barannikov s-barannikov merged commit bed625b into llvm:main Jul 16, 2024
7 checks passed
sgundapa pushed a commit to sgundapa/upstream_effort that referenced this pull request Jul 23, 2024
…m#98060)

GAS emits warning about newline in the string constant so make the same
behaviour.
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
)

Summary:
GAS emits warning about newline in the string constant so make the same
behaviour.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60250926
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants