-
Notifications
You must be signed in to change notification settings - Fork 15
Update parser as per latest changes in nom #132
base: master
Are you sure you want to change the base?
Conversation
Thanks for the PR! Do you have an idea why tests are failing? |
I will take a look. I did |
source/tokens.rs
Outdated
{ | ||
match (0..self.slice.len()).find(|b| predicate(self.slice[*b])) { | ||
Some(0) => Err(Err::Error(Context::Code(*self, e))), | ||
Some(i) => Ok((Span { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Hywan So I found one of the many unknown reasons why the tests are failing.
One problem is that the new Span
is not passing the correct offset
, line
, column
values - after the split. I am looking for examples where, after the slice
is converted into utf8
, the values are computed using a function or something. There are tabs and newslines in the input, it should return the correct offset
and line
after converting it into utf8
. Do you know any such example?
For instance:
fn case_first_with_whitespace() {
named!(hello<Span, Span>, tag!(b"hello"));
named!(test1<Span, Span>, first!(tag!(b"hello")));
named!(test2<Span, Span>, first!(hello));
let input = Span::new(b" \nhello\t\r");
let output = Ok((Span::new_at(b"\t\r", 8, 2, 6), Span::new_at(b"hello", 3, 2, 1)));
assert_eq!(test1(input), output);
assert_eq!(test2(input), output);
}
Fails with out output:
Diff < left / right > :
Ok(
(
Span {
< offset: 0,
< line: 1,
< column: 1,
> offset: 8,
> line: 2,
> column: 6,
slice: [
9,
13
]
},
Span {
< offset: 0,
< line: 1,
> offset: 3,
> line: 2,
column: 1,
slice: [
104,
101,
108,
108,
111
]
}
)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When computing the new Span
, you should update the offset
to include i
(from find
). Same for line
and column
.
You might want to check
Lines 1180 to 1222 in 077aa7a
fn slice(&self, range: $range) -> Self { | |
let next_slice = &self.slice[range]; | |
if next_slice == self.slice { | |
return *self; | |
} | |
let next_offset = self.slice.offset(next_slice); | |
if next_offset == 0 { | |
return Span { | |
offset: self.offset, | |
line : self.line, | |
column: self.column, | |
slice : next_slice | |
}; | |
} | |
let consumed = &self.slice[..next_offset]; | |
let number_of_newlines = bytecount::count(consumed, b'\n') as u32; | |
let next_column = | |
if number_of_newlines == 0 { | |
self.column + next_offset as u32 | |
} else { | |
match memchr::memrchr(b'\n', consumed) { | |
Some(last_newline_position) => { | |
(next_offset - last_newline_position) as u32 | |
}, | |
None => { | |
unreachable!(); | |
} | |
} | |
}; | |
Span { | |
offset: self.offset + next_offset, | |
line : self.line + number_of_newlines, | |
column: next_column, | |
slice : next_slice | |
} | |
} |
Ok((self.slice(i..), self.slice(..i))
Because slice
will return a Span
with appropriated values set.
I looked into the remaining failing tests. One of them is The test output:
Do you think that they need to be implemented differently inside the parser library? @Hywan |
fn case_exclude_empty_set() {
named!(
test,
exclude!(
is_a!("abcdef"),
alt!(
tag!("abc")
| tag!("ace")
)
)
);
assert_eq!(test(&b"fedabc"[..]), Ok((&b""[..], &b"fedabc"[..])));
}
|
Please, give me some days to look at this. I don't time right now, hope to get free time in few days :-). Thanks for your patience! |
Meanwhile I can do more investigation. #132 (comment) - I am wrong in this comment. It is alright there. The But I am confused here because in every iteration it is |
Working on this right now. |
@Hywan Did you check this? |
Yes, and I have similar issue. I've paused my patch for some weeks because of other projects. I'm planning to switch back to Tagua VM very soon. |
No description provided.