RFC: Raw Identifiers #2151
Add a raw identifier format `r#ident`, so crates written in future language epochs/versions can still use an older API that overlaps with new keywords.
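For readers unfamiliar with the proposal, here is a minimal sketch of what the `r#ident` form looks like in use. The module `old_api` and everything inside it are hypothetical stand-ins for an older crate whose names collide with keywords; they are not taken from the RFC text.

```rust
// Hypothetical older API whose item names collide with keywords.
mod old_api {
    // `match` is a keyword, so even the definition needs the raw form.
    pub fn r#match(needle: &str, haystack: &str) -> bool {
        haystack.contains(needle)
    }

    pub struct Request {
        // `type` is a keyword; written as a raw identifier it is an
        // ordinary field name.
        pub r#type: String,
    }
}

// Reads the keyword-named field through its raw identifier.
fn describe(req: &old_api::Request) -> String {
    format!("request of type {}", req.r#type)
}

fn main() {
    let req = old_api::Request { r#type: String::from("GET") };
    println!("{}", describe(&req));
    println!("{}", old_api::r#match("ell", "hello"));
}
```

The `r#` prefix appears at every use site as well as at the definition, which is the syntactic cost that much of the discussion below is about.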
Generally I'm in support of the RFC. However, I think that the feature should only be available through a whitelist, where it's actually useful. So only enable it for the newly introduced keywords like
In fact in the VLA RFC we were wondering how to get |
To clarify: this allows using as an identifier what would otherwise be a keyword, but does not change the set of characters allowed in identifiers, right? If so, that sounds fine. |
I prefer generality myself. I could see having a lint for "unnecessarily raw identifier", but I see no reason to forbid this.
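To make the "unnecessarily raw identifier" case concrete, here is a small sketch. It assumes the behaviour of the feature as it was later implemented in rustc, where `r#name` and `name` denote the same identifier whenever `name` is not a keyword. The function name `sum_with_raw` is made up for illustration.

```rust
// `total` is not a keyword, so `r#total` and `total` name the same binding.
fn sum_with_raw(values: &[i32]) -> i32 {
    let mut r#total = 0;
    for v in values {
        // Refers to the binding declared above as `r#total`.
        total += *v;
    }
    r#total
}

fn main() {
    println!("{}", sum_with_raw(&[1, 2, 3]));
}
```

Nothing here needs the raw form, which is exactly the situation a lint could flag without the language having to forbid it.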
Correct. Some of the discussed alternatives could allow extended characters, but that's not what I'm proposing. If some people do want extended characters, then we might want to choose a syntax that would allow that, even if we don't extend it initially. |
I dismissed the |
I like not extending the identifier alphabet here.
I worry that such a restriction would make it harder to write code that compiles on multiple compiler versions. I want to be able to update my code to avoid a new-epoch keyword while still being able to compile it with the current stable that doesn't know about that keyword yet. |
Epochs work differently. Any future compiler version will support the epoch of your code; that's what the epochs RFC guarantees. So if you say that your codebase uses the old epoch, you can freely use the identifier, and you are compatible with all future compilers. This will even be enforced in macros (macros will get epoch hygiene)! If you say that your codebase uses the new epoch, your crate can obviously only be compiled by compiler versions that support that epoch; this has nothing to do with the whitelist. But if you opt in to the new epoch, the whitelisted keywords will be available to you. The only thing that a whitelist will make harder is wanting to be able to "support" multiple epochs, but this isn't really a legitimate real-world case IMO, because your code will always be in exactly one epoch, as you must explicitly specify it (except for the 2015 epoch, which is the default). There is one use case where badly deployed whitelists would be an issue: when you are migrating code from one epoch to another, and you are not doing it by invoking rustfix (despite rustfix being required to work with almost all code), it would show up as an error. This use case can very easily be fixed though, simply by extending the whitelist in the old epoch as well. |
I agree it's rare, but I don't think it deserves to be blocking. I'd be tempted to use I do agree that a "unnecessary raw identifier" warning or clippy lint makes sense. |
It doesn't seem like that RFC has a lot of traction. Backslashes are intuitive as "escape" characters. I feel just Seeing a letter prefix like |
It's meant to seem more like raw strings, e.g. |
This RFC tries to solve a problem that doesn't exist and won't exist if epochs are done in a responsible way.
There is also a minor technical issue with raw identifiers - some logic in the compiler relies on keywords being unusable as item names. |
@petrochenkov's argument that standard library additions mean a similar amount of breakage has convinced me that this feature is not required. I think it's better to simply change the identifiers to not use keywords again, maybe forcing an API bump. |
You'd import this old API via |
AFAICS,
That's ok for free items, but you can't import associated items like methods this way. Maybe that can still use a UFCS form -- in the baseball example, you'd write If new keywords are always considered identifiers in the context of paths ( |
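The distinction between free items and associated items can be sketched as follows. This is only an illustration under stated assumptions: `old_api`, `Matcher`, and `attempt` are hypothetical names, and the raw-identifier syntax is the one proposed in this RFC rather than any of the alternatives discussed in this comment.

```rust
// Hypothetical older API with keyword-named items.
mod old_api {
    // A free function named after a word reserved in newer editions.
    pub fn r#try(x: i32) -> Option<i32> {
        if x >= 0 { Some(x) } else { None }
    }

    pub struct Matcher;

    impl Matcher {
        // A method named after a keyword.
        pub fn r#match(&self, other: &str) -> bool {
            other == "batter"
        }
    }
}

// A free item can be renamed once at the import site...
use old_api::r#try as attempt;

fn main() {
    // ...and used under the new name from then on.
    println!("{:?}", attempt(3));

    let m = old_api::Matcher;
    // A method cannot be imported with `use`, so the keyword name has to
    // appear at each call, either in method-call form or in path form.
    println!("{}", m.r#match("batter"));
    println!("{}", old_api::Matcher::r#match(&m, "pitcher"));
}
```

Renaming on import keeps the escape confined to one line for free items, while methods and fields have no such single place to rename them.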
We're only worried about I think perhaps the best solution might be prefixing each usage by an attribute I doubt struct fields would be too problematic in practice, but methods could maybe be renamed with local inherent
I suppose |
I could see limiting this to only reserved words, but limiting to only those reserved words which were introduced in an epoch seems unnecessary & potentially confusing for users who encounter this feature and don't know when each keyword was introduced. In general, we have taken a very free hand with the syntax and use lints, social conventions and rustfmt to keep everyone on the same page, and I don't see a reason to do things differently here. This seems like a straightforward solution to a basic problem to me. |
One more alternative: C# allows bare Unicode escapes as part of an identifier. (Very ugly, not recommending it, but still an alternative.)

    class Class1
    {
        static void M() {
            cl\u0061ss.st\u0061tic(true);
        }
    }

(This "feature" is probably inspired by Java, but you can't define a keyword-identifier like this in Java.) |
Not necessarily an alternative, but Dart uses |
OK, I noted Dart, but it looks like |
@cuviper not just that I think people also wonder whether to use them in macros 2.0 for escaping hygiene. |
@cuviper Hmm, so these are the official docs - but they don't mention |
This proposal reminds me of C/C++ trigraphs, which are on their way out with C++17. I'm sure that, like trigraphs, this feature will be used more by people who want to write confusing code than for its actually intended purpose... Also, I don't think that it will do any good if cargo and the rust compiler now switch to using Do you really have to modify the language and add a whole new way of referring to identifiers just because you are scared of implementing analysis of which identifiers are still free in |
Come on,
The RFC explicitly recommends using alternatives like
I don't see how This is just a means towards keeping Rust's overall compatibility goals. |
This doesn't remind me of trigraphs in the slightest. Those are there for character sets without symbols used by the language, or for people who cannot type them. I agree we don't have that need. Instead it reminds me of new { style="max-width: 66ex", @class = "textcontent" } It could tell them to use |
The RFC as proposed does not change which characters can be used in an identifier. It only allows having identifiers that would otherwise be keywords. I’m not for or against this proposal. I would be opposed to raw identifier allowing arbitrary characters. CSS does this, and it’s just nonsense. For example you can have a CSS custom property whose name is literally the ASCII space. Serde already has |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
I didn't get around to adding an alternative about just renaming/aliasing. Is that still wanted? |
Is the macro like syntax |
@burdges AFAIK macros don't work in ident positions, which is why |
Re: syntax, just go with Re: extending this feature to putting arbitrary Unicode in identifiers: don't. That's a subject for its own RFC and its own bikeshed. Be maximally conservative here. |
@est31 Though there do exist features that ought to be made syntactically ugly in order to discourage their use, this isn't one of them. There is nothing dangerous whatsoever about this feature, and nothing useful about it that risks overuse (or any use at all) except in unfortunate circumstances that will require users to use it. Any argument that people will deliberately use this to obfuscate their code is even more damning of |
Maybe this is just me, but if I had no idea Rust had a raw identifiers feature, and I saw I don't object to the notion that |
If this feature has to happen, I'd rather use |
Have you even read the epochs RFC? Code under the old epoch will always compile. If you switch epochs, this can already be seen by some as a semver-breaking change as most likely you are switching the minimum supported rustc version (it disrupts anyone stuck on an old compiler, so it is a breaking change!). So do it properly and just replace all the idents you have with proper new names for them. |
I think I still personally like both Aside from Also, we'll need to |
IMO we should be using |
Linking #1579 with respect to using |
The final comment period is now complete. |
If there isn't, can there be a page for why words are reserved, and why these reserved words cannot be contextual? |
Huzzah! The RFC is merged! Tracking issue: rust-lang/rust#48589 |
I don't know how I missed both the initial announcement of this and the FCP call, but I just have to share a perspective on the intuitiveness of As someone whose experience is more or less exclusively in imperative languages, whenever I see By contrast, |