-
Notifications
You must be signed in to change notification settings - Fork 13.3k
syntax: fixed ICEs and incorrect line nums when reporting Spans at the end of the file. #12670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Can you amend the title and commit message to be a little more specific in the first phrase? "converting Spans to stuff" makes it difficult to know what's being fixed. Also, cc @nick29581 |
Is this only ever an issue when a file ends without a newline? (It has been bothering me for a while that we require a newline, so thanks for looking into it). I am pretty tired and will take a proper look tomorrow. In the mean time, could we avoid all this by just adding a newline character to all filemaps that don't end with one? Or some similar but nicer hack? |
@alexcrichton: done. |
@nick29581: The specific issue that motivated this fix would have been fixed by adding a newline character to every filemap without one, and indeed this was my original idea for the fix too. Dismissed it as being a hack, however, as this will still leave span_to_* slightly broken in that they'll still treat Spans in a way different from what lexer/parser think and will always try to look up 1 byte further than needed when looking up line nums and char positions. This might cause some issue with unicode in the future (it doesn't seem to cause them now however, and i can't at the moment think of unicode example which will break with current code). My second idea for a nicer hack was to just subtract BytePos(1) from sp.hi when looking up positions (so that look up lands into the byte range of given Span instead of one byte after it), but that had problems with multi-byte characters and caused unreasonable test breakages. The current fix is the nicest i could come up with so far, and it isn't too bad in my opinion (obviously :) ). It doesn't change signatures of any existing public methods for codemap (most of changes are in private methods), only changes the behaviour of span_to_* (to look up up to sp.hi index, rather than at sp.hi index). It can also be considered a proper fix and not a hack as it makes span_to_* treat Spans the same way lexer and parser do, which can help preventing any issues that might arise from the disconnect between what lexer/parser think of spans and what codemap does. |
We discussed this a bit. You are quite right that your second idea won't work due to MBCs. I think we still prefer the first 'hack' to this solution though. I don't understand your point "this will still leave span_to_* slightly broken in that they'll still treat Spans in a way different from what lexer/parser think" - shouldn't we always have half-open intervals? In general, I think functions which deal with a 'point' shouldn't know about or deal with open vs closed ranges and all functions which deal with ranges should assume half-open ranges (unless I'm missing a use case). |
On 6 Mar 2014, at 5:17, Nick Cameron wrote:
Have now amended PR description and commit to implement the fix with |
assert_eq!(cp4, CharPos(15)); | ||
} | ||
|
||
#[test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add comments stating what these tests are testing please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. Have also removed two tests (from the 5 added by me earlier) - they've been redundant, basically testing what t6 does (i.e. multibyte chars).
I fear this will not work - the lexer will call |
When
So, unless i'm missing something, padding with newlines this way won't result in incorrect offsets for newlines. I've also checked that newline offsets are correct manually, just to be sure: https://gist.github.com/dmski/9418492 (which outputs https://gist.github.com/dmski/9420111 for me). |
So, I think this is OK then. BUT you must add a comment to Also, I think in the tests, you change the |
@nick29581 does this look good to you? I'd be willing to merge this as-is, but you have more experience in this area than I do. |
…e end of the file. CodeMap.span_to_* perform a lookup of a BytePos(sp.hi), which lands into the next filemap if the last byte of range denoted by Span is also the last byte of the filemap, which results in ICEs or incorrect error reports. Example: ```` pub fn main() { let mut num = 3; let refe = &mut num; *refe = 5; println!("{}", num); }```` (note the empty line in the beginning and the absence of newline at the end) The above would have caused ICE when trying to report where "refe" borrow ends. The above without an empty line in the beginning would have reported borrow end to be the first line. Most probably, this is also responsible for (at least some occurrences of) issue rust-lang#8256. The issue is fixed by always adding a newline at the end of non-empty filemaps in case there isn't a new line there already.
Have now added a warning on FileMap.next_line. On 8 Mar 2014, at 3:40, Nick Cameron wrote:
|
r+ @dmski aha! I had missed that you did not add a newline to empty filemaps. I agree that that is the right thing to do. Thanks for iterating over the patch with me, I appreciate the effort and I'm glad this is fixed. @alexcrichton could you give this your magic r+ please? I fear my r+ is not enough for Bors. |
CodeMap.span_to_* perform a lookup of a BytePos(sp.hi), which lands into the next filemap if the last byte of range denoted by Span is also the last byte of the filemap, which results in ICEs or incorrect error reports. Example: ```` pub fn main() { let mut num = 3; let refe = &mut num; *refe = 5; println!("{}", num); }```` (note the empty line in the beginning and the absence of newline at the end) The above would have caused ICE when trying to report where "refe" borrow ends. The above without an empty line in the beginning would have reported borrow end to be the first line. Most probably, this is also responsible for (at least some occurrences of) issue #8256. The issue is fixed by always adding a newline at the end of non-empty filemaps in case there isn't a new line there already.
fix: Fix attribute macros on assoc items being discarded with disabled proc macros Fixes rust-lang/rust-analyzer#12669
…is uncertain (rust-lang#13889) fixes rust-lang#12670 Continuation of rust-lang#12688. r? @Jarcho if you don't mind? changelog: [`manual_unwrap_or_default`] fix wrong suggestions when condition type is uncertain
CodeMap.span_to_* perform a lookup of a BytePos(sp.hi), which lands into the next filemap if the last byte of range denoted by Span is also the last byte of the filemap, which results in ICEs or incorrect error reports.
(note the empty line in the beginning and the absence of newline at the end)
The above would have caused ICE when trying to report where "refe" borrow ends.
The above without an empty line in the beginning would have reported borrow end to be the first line.
Most probably, this is also responsible for (at least some occurrences of) issue #8256.
The issue is fixed by always adding a newline at the end of non-empty filemaps in case there isn't a new line there already.