multiple spaces in input without -tokenize #1

Open
GoogleCodeExporter opened this issue Mar 25, 2015 · 0 comments

What steps will reproduce the problem?

Put two or more consecutive spaces between words in the input.

example: echo "a  b" | java -jar berkeleyParser.jar -gr eng_sm5.gr
java.lang.StringIndexOutOfBoundsException: String index out of range: 0
    at java.lang.String.charAt(String.java:687)
    at edu.berkeley.nlp.PCFGLA.SophisticatedLexicon.getSignature(Unknown Source)
    at edu.berkeley.nlp.PCFGLA.SophisticatedLexicon.getCachedSignature(Unknown Source)
    at edu.berkeley.nlp.PCFGLA.SophisticatedLexicon.score(Unknown Source)
    at edu.berkeley.nlp.PCFGLA.CoarseToFineMaxRuleParser.initializeChart(Unknown Source)
    at edu.berkeley.nlp.PCFGLA.CoarseToFineMaxRuleParser.doPreParses(Unknown Source)
    at edu.berkeley.nlp.PCFGLA.CoarseToFineMaxRuleParser.getBestConstrainedParse(Unknown Source)
    at edu.berkeley.nlp.PCFGLA.CoarseToFineMaxRuleParser.getBestConstrainedParse(Unknown Source)
    at edu.berkeley.nlp.PCFGLA.BerkeleyParser.main(BerkeleyParser.java:190)


If there is only one space, one obtains a parse tree.
echo "a b" | java -jar berkeleyParser2.jar -gr eng_sm5.gr 
( (NP (DT a) (X (SYM b))) )

If you run the parser with tokenization (-tokenize), it works fine.
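
The trace ends in String.charAt with index 0, which is what happens when charAt(0) is called on an empty string. That would be consistent with the untokenized input being split on a single space, so that "a  b" yields an empty word between the two spaces. I have not checked the parser source, so the sketch below is only an illustration under that assumption; the class and method names (WhitespaceSplitDemo, naiveSplit, safeSplit) are made up for this example and are not part of the parser.

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Assumed behaviour (not confirmed from the parser source):
    // splitting on exactly one space keeps an empty token between two spaces.
    static String[] naiveSplit(String line) {
        return line.split(" ");
    }

    // Possible workaround: trim, then split on runs of whitespace,
    // so no empty token can reach a later charAt(0) call.
    static String[] safeSplit(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(naiveSplit("a  b"))); // prints [a, , b]
        System.out.println(Arrays.toString(safeSplit("a  b")));  // prints [a, b]
    }
}
```

If the cause is indeed an empty token, collapsing repeated whitespace before the words reach the lexicon would avoid the exception without requiring -tokenize.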

Suggestion: track the line number of the input and print it along with the
stack trace. This would make debugging easier.
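
A rough sketch of what this suggestion could look like, using java.io.LineNumberReader from the standard library. Everything here is hypothetical: LineTrackingDemo, run and fakeParse are names invented for the example, and fakeParse is only a stand-in that fails on an empty word the same way the trace above does; it is not the parser's real code.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class LineTrackingDemo {
    // Hypothetical stand-in for the real parse call: charAt(0) throws
    // StringIndexOutOfBoundsException when a word is empty.
    static String fakeParse(String line) {
        StringBuilder firstLetters = new StringBuilder();
        for (String word : line.split(" ")) {
            firstLetters.append(word.charAt(0));
        }
        return firstLetters.toString();
    }

    // Reads the input line by line; on failure, records the 1-based
    // input line number together with the exception instead of aborting.
    static List<String> run(Reader input, Function<String, String> parse) {
        List<String> out = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                try {
                    out.add(parse.apply(line));
                } catch (RuntimeException e) {
                    out.add("ERROR at input line " + reader.getLineNumber() + ": " + e);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        run(new InputStreamReader(System.in), LineTrackingDemo::fakeParse)
            .forEach(System.out::println);
    }
}
```

Reporting the line number and continuing with the next sentence would also keep one bad line from stopping a long batch run, though whether the parser should continue or abort is a separate design decision.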

Original issue reported on code.google.com by benoit.f...@gmail.com on 11 Feb 2009 at 9:15
