(rust) quotes around a char emoji are not colored properly #3933

Closed
tomsarry opened this issue Nov 28, 2023 · 4 comments · Fixed by #4156
Labels
bug, help welcome, language

Comments

@tomsarry

Describe the issue
In Rust, the single quotes around an emoji char literal are not colored properly.
However, autodetect (which picks csharp) colors them correctly.

Which language seems to have the issue?
Rust

Are you using highlight or highlightAuto?
highlight

Sample Code to Reproduce
https://jsfiddle.net/cawyx173/

rust, coloring doesn't work
image

autodetect (csharp), works
image

Expected behavior
When using rust highlighting, single quotes around an emoji should have the same color as single quotes around any other character.

Additional context
Syntax highlighting works properly when double quotes are used around emojis.
The problem was seen in the Rust book.

@tomsarry added the bug, help welcome, and language labels Nov 28, 2023
@joshgoebel
Member

    {
      className: 'string',
      variants: [
        { begin: /b?r(#*)"(.|\n)*?"\1(?!#)/ },
        { begin: /b?'\\?(x\w{2}|u\w{4}|U\w{8}|.)'/ }
      ]
    },

I'm guessing . doesn't cover Emoji... I'd have to play around with this one...
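
One way to check this guess is a small experiment in plain JavaScript. It is only a sketch: the variable names are illustrative, the char-literal pattern is copied from the snippet above, and the `u`-flag variant is an experiment rather than the fix that was eventually merged. Without the `u` flag, `.` matches a single UTF-16 code unit, so a character stored as a surrogate pair cannot fit between the two quotes.

```javascript
// Char-literal variant copied from the grammar snippet (no `u` flag).
const charLiteral = /b?'\\?(x\w{2}|u\w{4}|U\w{8}|.)'/;

// Same pattern with the `u` flag, so `.` consumes a whole code point.
// Experiment only; not necessarily what the merged fix does.
const charLiteralUnicode = /b?'\\?(x\w{2}|u\w{4}|U\w{8}|.)'/u;

// Samples are written with escapes so the exact code points are unambiguous.
const samples = [
  "'a'",         // ASCII letter
  "'\u2728'",    // ✨ U+2728, one UTF-16 code unit
  "'\u{1F970}'", // 🥰 U+1F970, two UTF-16 code units
  "'\u{1D54A}'", // 𝕊 U+1D54A, two UTF-16 code units
];

const results = samples.map((s) => ({
  sample: s,
  codeUnits: s.length - 2, // UTF-16 code units between the quotes
  plain: charLiteral.test(s),
  unicode: charLiteralUnicode.test(s),
}));

console.table(results);
```

In this experiment the `plain` column is false exactly for the samples whose `codeUnits` value is 2, while the `unicode` column is true for all four.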

@tomsarry
Author

Would using |\p{Extended_Pictographic} be acceptable?
Quick checks seem to make it work.

https://regexr.com/7o2cc
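
For anyone trying this suggestion in JavaScript, note that `\p{...}` property escapes are only recognized when the regex carries the `u` (or `v`) flag; without it, `\p` is treated as a literal `p`. The sketch below uses hypothetical variable names and anchored patterns purely for illustration, and it also shows that a valid Rust char such as 𝕊 is not pictographic, so this alternative alone would not cover every char literal.

```javascript
// With the `u` flag, \p{Extended_Pictographic} is a Unicode property escape.
const withFlag = /^'\p{Extended_Pictographic}'$/u;

// Without the flag, the same source text matches the literal characters
// "p{Extended_Pictographic}" instead of any emoji.
const withoutFlag = /^'\p{Extended_Pictographic}'$/;

const sparkles = "'\u2728'";      // ✨ U+2728
const smiling = "'\u{1F970}'";    // 🥰 U+1F970
const doubleS = "'\u{1D54A}'";    // 𝕊 U+1D54A, a math letter, not an emoji

console.log(withFlag.test(sparkles));    // true
console.log(withFlag.test(smiling));     // true
console.log(withFlag.test(doubleS));     // false
console.log(withoutFlag.test(sparkles)); // false
```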

@tomsarry
Author

tomsarry commented Nov 29, 2023

I spoke too soon: the pattern above does not work either (I think ✨ is encoded as a single UTF-16 code unit; the problem seems to be with characters that need two code units).
After looking at the char implementation, the following are also valid but not matched by the expression above:

  • '\u{10ffff}'
  • '\u{FFFD}'
  • 🥰 (actually, most unicode emojis seem to fail)
  • '𝕊'
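
The cases listed above can be checked against a candidate pattern in a few lines of JavaScript. What follows is a hypothetical sketch, not the pattern merged for this issue: `rustCharSketch` is an invented name, it is anchored for testing only, and it accepts any single escaped character without validating that it is a legal Rust escape or that a `\u{...}` value is in range.

```javascript
// Original char-literal variant from the grammar, for comparison.
const original = /b?'\\?(x\w{2}|u\w{4}|U\w{8}|.)'/;

// Hypothetical sketch: with the `u` flag, accept either
//   - a backslash escape: \xNN, \u{1 to 6 hex digits}, or one escaped char
//   - any single code point other than a quote or backslash
const rustCharSketch =
  /^b?'(?:\\(?:x[0-9a-fA-F]{2}|u\{[0-9a-fA-F]{1,6}\}|.)|[^'\\])'$/u;

// Rust source text for each literal (backslashes doubled for JavaScript).
const literals = [
  "'\\u{10ffff}'",
  "'\\u{FFFD}'",
  "'\u{1F970}'", // 🥰
  "'\u{1D54A}'", // 𝕊
  "'a'",
  "'\\n'",
];

const report = literals.map((src) => ({
  src,
  original: original.test(src),
  sketch: rustCharSketch.test(src),
}));

console.table(report);
```

A real grammar rule would not be anchored like this and would also need to avoid confusing char literals with lifetimes such as `'a`, which this sketch does not attempt.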

I'm really not a regex expert, but I found the following matches for emojis / Unicode characters; this might be of some help:

@joshgoebel
Member

Want to see if my fix works?
