
error while running make_datafile.py #16

Open
97yogitha opened this issue Oct 28, 2017 · 11 comments

Comments

@97yogitha

97yogitha commented Oct 28, 2017

@abisee this is the error that I get when I run the command `python make_datafiles.py cnn/stories dailymail/stories`:

Preparing to tokenize cnn/stories to cnn_stories_tokenized...
Making list of files to tokenize...
Tokenizing 92579 files in cnn/stories and saving in cnn_stories_tokenized...
Exception in thread "main" java.io.IOException: Stream closed
	at java.io.BufferedWriter.ensureOpen(BufferedWriter.java:116)
	at java.io.BufferedWriter.write(BufferedWriter.java:221)
	at java.io.Writer.write(Writer.java:157)
	at edu.stanford.nlp.process.PTBTokenizer.tokReader(PTBTokenizer.java:505)
	at edu.stanford.nlp.process.PTBTokenizer.tok(PTBTokenizer.java:450)
	at edu.stanford.nlp.process.PTBTokenizer.main(PTBTokenizer.java:813)
Stanford CoreNLP Tokenizer has finished.
Traceback (most recent call last):
  File "make_datafiles.py", line 235, in <module>
    tokenize_stories(cnn_stories_dir, cnn_tokenized_stories_dir)
  File "make_datafiles.py", line 86, in tokenize_stories
    raise Exception("The tokenized stories directory %s contains %i files, but it should contain the same number as %s (which has %i files). Was there an error during tokenization?" % (tokenized_stories_dir, num_tokenized, stories_dir, num_orig))
Exception: The tokenized stories directory cnn_stories_tokenized contains 1 files, but it should contain the same number as cnn/stories (which has 92579 files). Was there an error during tokenization?
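(For context: the exception at the end is just a file-count sanity check that runs after the Java tokenizer exits, so the real failure is the earlier `java.io.IOException`. A rough sketch of what that check looks like, with approximate names, assuming it simply compares directory listings:)

```python
import os

def check_num_stories(stories_dir, tokenized_stories_dir):
    # Compare file counts before and after tokenization; a mismatch
    # means the tokenizer died partway through (as in the stack trace above).
    num_orig = len(os.listdir(stories_dir))
    num_tokenized = len(os.listdir(tokenized_stories_dir))
    if num_orig != num_tokenized:
        raise Exception(
            "The tokenized stories directory %s contains %i files, but it "
            "should contain the same number as %s (which has %i files). "
            "Was there an error during tokenization?"
            % (tokenized_stories_dir, num_tokenized, stories_dir, num_orig))
```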
@JafferWilson

Please let me know: are you using stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0.jar, or the 2017 one? This error mostly occurs when you are not using stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0.jar. Please check.

@ibarrien

ibarrien commented Oct 30, 2017 via email

@JafferWilson

I have already created the processed files; you can use those without any issue. Here is the link: https://github.com/JafferWilson/Process-Data-of-CNN-DailyMail
Use Python 2.7.

@97yogitha
Author

@JafferWilson Yes, I am using stanford-corenlp-full-2017-09-0/stanford-corenlp-3.8.0.jar. I will use the processed file.

@JafferWilson

@97yogitha No, do not use the 2017 one; use the 2016 version mentioned in the README file of the repository.

@IreneZihuiLi

@JafferWilson Thanks for the help. I used 3.7.0 from https://stanfordnlp.github.io/CoreNLP/history.html and it worked.

@Neuqmiao

Neuqmiao commented Dec 7, 2017

Thanks very much. I encountered this problem today with the newest version, 3.8.0; after I switched to 3.7.0, it worked.

@JafferWilson

Could someone please close this issue?

@Sharathnasa

@JafferWilson Could you help with running the neural network on our own data? How do we generate .bin files for our own articles?

I have a clear idea about tokenization, but what about the URL mapping? How is it done?
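(On the URL mapping: in the original CNN/Daily Mail pipeline, each story in the url_lists splits is matched to its file by the SHA-1 hex digest of the story's source URL. A minimal sketch of that hashing step, assuming the same scheme as make_datafiles.py; for your own articles you can skip the URL lists entirely and just enumerate your story filenames:)

```python
import hashlib

def hashhex(s):
    # SHA-1 hex digest of a URL string; the original pipeline uses this
    # digest as the tokenized story's filename.
    h = hashlib.sha1()
    h.update(s.encode("utf-8"))
    return h.hexdigest()
```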

@dondon2475848

Hi @Sharathnasa
You can clone the repository below:
https://github.com/dondon2475848/make_datafiles_for_pgn
Then run:

python make_datafiles.py  ./stories  ./output

It processes your test data into the binary format.
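(If you want to sanity-check the resulting .bin files: as far as I understand this pipeline, each record is a length-prefixed serialized tf.Example, i.e. an 8-byte native-endian length followed by that many payload bytes. A hedged sketch of a framing reader that needs no TensorFlow, just to confirm the files are well-formed:)

```python
import struct

def read_bin_records(path):
    # Yield each raw serialized tf.Example payload from a .bin file.
    # Framing assumed: 8-byte length ("q"), then that many payload bytes.
    with open(path, "rb") as f:
        while True:
            len_bytes = f.read(8)
            if not len_bytes:
                break  # clean end of file
            str_len = struct.unpack("q", len_bytes)[0]
            yield struct.unpack("%ds" % str_len, f.read(str_len))[0]
```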

@ARNABKUMARPAN

Check the subprocess.call(command) invocation: set the classpath with os.environ["CLASSPATH"] = 'stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0.jar', then run again.
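(A sketch of that fix, assuming the 3.7.0 jar sits in the working directory and `mapping.txt` is the tokenizer's file list; the command mirrors how the script drives PTBTokenizer, but build it separately so you can inspect it before spawning Java:)

```python
import os
import subprocess

def build_tokenizer_command(mapping_file):
    # Pin CLASSPATH to the 3.7.0 jar so PTBTokenizer resolves from it,
    # not from a newer CoreNLP release on the path.
    os.environ["CLASSPATH"] = (
        "stanford-corenlp-full-2016-10-31/stanford-corenlp-3.7.0.jar")
    return ["java", "edu.stanford.nlp.process.PTBTokenizer",
            "-ioFileList", "-preserveLines", mapping_file]

# Then run it exactly as the script does:
# subprocess.call(build_tokenizer_command("mapping.txt"))
```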
