Skip to content
This repository has been archived by the owner on Dec 18, 2019. It is now read-only.

Update parser as per latest changes in nom #132

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

subhojit777
Copy link

No description provided.

@Hywan
Copy link
Member

Hywan commented Sep 21, 2018

Thanks for the PR!

Do you have an idea why tests are failing?

@subhojit777
Copy link
Author

I will take a look. I did cargo build, and I came to know to know that InputTake and InputTakeAtPosition are not implemented. I will look into the tests.

source/tokens.rs Outdated
{
match (0..self.slice.len()).find(|b| predicate(self.slice[*b])) {
Some(0) => Err(Err::Error(Context::Code(*self, e))),
Some(i) => Ok((Span {
Copy link
Author

@subhojit777 subhojit777 Sep 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Hywan So I found one of the many unknown reasons why the tests are failing.

One problem is that the new Span is not passing the correct offset, line, column values - after the split. I am looking for examples where, after the slice is converted into utf8, the values are computed using a function or something. There are tabs and newslines in the input, it should return the correct offset and line after converting it into utf8. Do you know any such example?

For instance:

    fn case_first_with_whitespace() {
        named!(hello<Span, Span>, tag!(b"hello"));
        named!(test1<Span, Span>, first!(tag!(b"hello")));
        named!(test2<Span, Span>, first!(hello));

        let input  = Span::new(b"  \nhello\t\r");
        let output = Ok((Span::new_at(b"\t\r", 8, 2, 6), Span::new_at(b"hello", 3, 2, 1)));

        assert_eq!(test1(input), output);
        assert_eq!(test2(input), output);
    }

Fails with out output:

Diff < left / right > :                                                                                                                                                     
 Ok(                                                                                                                                                                        
     (                                                                                                                                                                      
         Span {                                                                                                                                                             
<            offset: 0,                                                                                                                                                     
<            line: 1,                                                                                                                                                       
<            column: 1,                                                                                                                                                     
>            offset: 8,                                                                                                                                                     
>            line: 2,                                                                                                                                                       
>            column: 6,                                                                                                                                                     
             slice: [                                                                                                                                                       
                 9,                                                                                                                                                         
                 13                                                                                                                                                         
             ]                                                                                                                                                              
         },                                                                                                                                                                 
         Span {                                                                                                                                                             
<            offset: 0,                                                                                                                                                     
<            line: 1,                                                                                                                                                       
>            offset: 3,                                                                                                                                                     
>            line: 2,                                                                                                                                                       
             column: 1,                                                                                                                                                     
             slice: [                                                                                                                                                       
                 104,                                                                                                                                                       
                 101,                                                                                                                                                       
                 108,                                                                                                                                                       
                 108,                                                                                                                                                       
                 111                                                                                                                                                        
             ]                                                                                                                                                              
         }                                                                                                                                                                  
     )                                                                                                                                                                      
 )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When computing the new Span, you should update the offset to include i (from find). Same for line and column.

You might want to check

parser/source/tokens.rs

Lines 1180 to 1222 in 077aa7a

fn slice(&self, range: $range) -> Self {
let next_slice = &self.slice[range];
if next_slice == self.slice {
return *self;
}
let next_offset = self.slice.offset(next_slice);
if next_offset == 0 {
return Span {
offset: self.offset,
line : self.line,
column: self.column,
slice : next_slice
};
}
let consumed = &self.slice[..next_offset];
let number_of_newlines = bytecount::count(consumed, b'\n') as u32;
let next_column =
if number_of_newlines == 0 {
self.column + next_offset as u32
} else {
match memchr::memrchr(b'\n', consumed) {
Some(last_newline_position) => {
(next_offset - last_newline_position) as u32
},
None => {
unreachable!();
}
}
};
Span {
offset: self.offset + next_offset,
line : self.line + number_of_newlines,
column: next_column,
slice : next_slice
}
}
to see an example of that logic. One way to do that would be (I imagine):

Ok((self.slice(i..), self.slice(..i))

Because slice will return a Span with appropriated values set.

@subhojit777
Copy link
Author

I looked into the remaining failing tests. One of them is case_exclude_empty_set(). I found that it is failing because it uses the InputTakeAtPosition and FindToken implementations of &[u8] which is provided by the nom framework.

The test output:

Diff < left / right > :
 
<Err(
<    Incomplete(
<        Size(
<            1
<        )
>Ok(
>    (
>        [],
>        [
>            102,
>            101,
>            100,
>            97,
>            98,
>            99
>        ]
     )
 )

Do you think that they need to be implemented differently inside the parser library? @Hywan

@subhojit777
Copy link
Author

    fn case_exclude_empty_set() {
        named!(
            test,
            exclude!(
                is_a!("abcdef"),
                alt!(
                    tag!("abc")
                  | tag!("ace")
                )
            )
        );

        assert_eq!(test(&b"fedabc"[..]), Ok((&b""[..], &b"fedabc"[..])));
    }

is_a!("abcdef") uses find_token() which returns bool. But split_at_position1() does Some() pattern match-ing. This is weird. I think I am missing something here. Any idea?

@Hywan
Copy link
Member

Hywan commented Oct 9, 2018

Please, give me some days to look at this. I don't time right now, hope to get free time in few days :-). Thanks for your patience!

@subhojit777
Copy link
Author

Meanwhile I can do more investigation.

#132 (comment) - I am wrong in this comment. It is alright there. The find() inside split_at_position1() expects the Fn to return a bool

But I am confused here because in every iteration it is true, still it is executing the None arm.

@Hywan
Copy link
Member

Hywan commented Nov 14, 2018

Working on this right now.

@subhojit777
Copy link
Author

@Hywan Did you check this?

@Hywan
Copy link
Member

Hywan commented Dec 5, 2018

Yes, and I have similar issue. I've paused my patch for some weeks because of other projects. I'm planning to switch back to Tagua VM very soon.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants