-
Notifications
You must be signed in to change notification settings - Fork 16
Update parser as per latest changes in nom #132
base: master
Are you sure you want to change the base?
Conversation
|
Thanks for the PR! Do you have an idea why tests are failing? |
|
I will take a look. I did |
source/tokens.rs
Outdated
| { | ||
| match (0..self.slice.len()).find(|b| predicate(self.slice[*b])) { | ||
| Some(0) => Err(Err::Error(Context::Code(*self, e))), | ||
| Some(i) => Ok((Span { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Hywan So I found one of the many unknown reasons why the tests are failing.
One problem is that the new Span is not passing the correct offset, line, column values - after the split. I am looking for examples where, after the slice is converted into utf8, the values are computed using a function or something. There are tabs and newslines in the input, it should return the correct offset and line after converting it into utf8. Do you know any such example?
For instance:
fn case_first_with_whitespace() {
named!(hello<Span, Span>, tag!(b"hello"));
named!(test1<Span, Span>, first!(tag!(b"hello")));
named!(test2<Span, Span>, first!(hello));
let input = Span::new(b" \nhello\t\r");
let output = Ok((Span::new_at(b"\t\r", 8, 2, 6), Span::new_at(b"hello", 3, 2, 1)));
assert_eq!(test1(input), output);
assert_eq!(test2(input), output);
}Fails with out output:
Diff < left / right > :
Ok(
(
Span {
< offset: 0,
< line: 1,
< column: 1,
> offset: 8,
> line: 2,
> column: 6,
slice: [
9,
13
]
},
Span {
< offset: 0,
< line: 1,
> offset: 3,
> line: 2,
column: 1,
slice: [
104,
101,
108,
108,
111
]
}
)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When computing the new Span, you should update the offset to include i (from find). Same for line and column.
You might want to check
Lines 1180 to 1222 in 077aa7a
| fn slice(&self, range: $range) -> Self { | |
| let next_slice = &self.slice[range]; | |
| if next_slice == self.slice { | |
| return *self; | |
| } | |
| let next_offset = self.slice.offset(next_slice); | |
| if next_offset == 0 { | |
| return Span { | |
| offset: self.offset, | |
| line : self.line, | |
| column: self.column, | |
| slice : next_slice | |
| }; | |
| } | |
| let consumed = &self.slice[..next_offset]; | |
| let number_of_newlines = bytecount::count(consumed, b'\n') as u32; | |
| let next_column = | |
| if number_of_newlines == 0 { | |
| self.column + next_offset as u32 | |
| } else { | |
| match memchr::memrchr(b'\n', consumed) { | |
| Some(last_newline_position) => { | |
| (next_offset - last_newline_position) as u32 | |
| }, | |
| None => { | |
| unreachable!(); | |
| } | |
| } | |
| }; | |
| Span { | |
| offset: self.offset + next_offset, | |
| line : self.line + number_of_newlines, | |
| column: next_column, | |
| slice : next_slice | |
| } | |
| } |
Ok((self.slice(i..), self.slice(..i))Because slice will return a Span with appropriated values set.
|
I looked into the remaining failing tests. One of them is The test output: Do you think that they need to be implemented differently inside the parser library? @Hywan |
fn case_exclude_empty_set() {
named!(
test,
exclude!(
is_a!("abcdef"),
alt!(
tag!("abc")
| tag!("ace")
)
)
);
assert_eq!(test(&b"fedabc"[..]), Ok((&b""[..], &b"fedabc"[..])));
}
|
|
Please, give me some days to look at this. I don't time right now, hope to get free time in few days :-). Thanks for your patience! |
|
Meanwhile I can do more investigation. #132 (comment) - I am wrong in this comment. It is alright there. The But I am confused here because in every iteration it is |
|
Working on this right now. |
|
@Hywan Did you check this? |
|
Yes, and I have similar issue. I've paused my patch for some weeks because of other projects. I'm planning to switch back to Tagua VM very soon. |
No description provided.