A question: there are some parts I don't understand #2

Open
Thoye opened this issue Jun 11, 2019 · 6 comments

Comments

@Thoye

Thoye commented Jun 11, 2019

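extract.cpp refers to word2vec.txt, bags_train.txt, and bags_test.txt. Why aren't these files in the dataset? Also, if I want to switch to a different dataset, do I have to convert it into the same format as train.txt? The code doesn't seem to include a step that produces the train.txt format. Thanks for your help!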

@ZhixiuYe
Owner

First unzip the NYT_data.zip file: unzip NYT_data/NYT_data.zip

@Thoye
Author

Thoye commented Jun 11, 2019

Yes, I did unzip it. But when I ran bash preprocess.sh myself, it failed with the errors below. Could you take a look? Thanks!
Init Begin.
wordTotal= 114042
Word dimension= 50
preprocess.sh: line 2: 12694 Segmentation fault (core dumped) ./extract
bags_train.txt
Traceback (most recent call last):
  File "data2pkl.py", line 83, in <module>
    data2pickle('bags_train.txt','train_temp.pkl',1)
  File "data2pkl.py", line 75, in data2pickle
    data = readData(input, mode)
  File "data2pkl.py", line 20, in readData
    f = codecs.open(filename, 'r')
  File "/home/mrc/anaconda3/envs/env3/lib/python3.6/codecs.py", line 895, in open
    file = builtins.open(filename, mode, buffering)
FileNotFoundError: [Errno 2] No such file or directory: 'bags_train.txt'
load test and train raw data...
Traceback (most recent call last):
  File "pickledata.py", line 236, in <module>
    testData = pickle.load(open('test_temp.pkl', 'rb'), encoding='utf-8')
FileNotFoundError: [Errno 2] No such file or directory: 'test_temp.pkl'
rm: cannot remove 'temp': No such file or directory

@Thoye
Author

Thoye commented Jun 11, 2019

Line 267 of extract.cpp has fout.open(("word2id.txt"),ios::out); but there is no word2id.txt file either.

@ZhixiuYe
Owner

  1. Make sure the file structure looks like this:
Intra-Bag-and-Inter-Bag-Attentions
|-- figure
    |-- CNNmethods.pdf
    |-- PCNNmethods.pdf
|-- model
    |-- embedding.py
    |-- model_bagatt.py
    |-- pcnn.py
|-- NYT_data
    |-- relation2id.txt
    |-- test.txt
    |-- train.txt
    |-- vec.bin
|-- preprocess
    |-- data2pkl.py
    |-- extract.cpp
    |-- pickledata.py
    |-- preprocess.sh
|-- plot.py
|-- README.md
|-- train.py
  2. Run bash preprocess.sh from inside the preprocess folder:
    cd preprocess; bash preprocess.sh; cd ..
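
Before running the script, the layout above can be checked programmatically. The following is a minimal Python sketch and is not part of the repository. The names EXPECTED_FILES and missing_files are hypothetical, and the list covers only the data and preprocessing files shown in the tree.

```python
from pathlib import Path

# Files taken from the tree above, relative to the repository root.
# EXPECTED_FILES and missing_files are illustrative names, not repository code.
EXPECTED_FILES = [
    "NYT_data/relation2id.txt",
    "NYT_data/test.txt",
    "NYT_data/train.txt",
    "NYT_data/vec.bin",
    "preprocess/data2pkl.py",
    "preprocess/extract.cpp",
    "preprocess/pickledata.py",
    "preprocess/preprocess.sh",
]


def missing_files(repo_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling missing_files(".") from the repository root returns an empty list when everything is in place. Any path it does return is a file to restore, for example by unzipping NYT_data.zip, before running preprocess.sh.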

@WoodPecker1111

You can run extract.cpp on its own to get those pkl files, and then run data2pkl and pkl2data manually.

@Fengzhu-xu

You can run extract.cpp on its own to get those pkl files, and then run data2pkl and pkl2data manually.

How do I run it on its own, and with what?
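
One possible way to run the steps one at a time is sketched below in Python, stopping at the first failure. In the log earlier in the thread, the segmentation fault in ./extract was followed by FileNotFoundError messages from the later scripts, so stopping early keeps the real error visible. The step list is an assumption: the g++ command line and the order of the scripts are inferred from that log and from the file tree (which lists pickledata.py, not pkl2data), not read from preprocess.sh. The names STEPS and run_steps are hypothetical.

```python
import subprocess
import sys

# Hypothetical step list. The compile command and the script order are
# assumptions inferred from the error log, not copied from preprocess.sh.
STEPS = [
    ["g++", "extract.cpp", "-o", "extract"],
    ["./extract"],
    [sys.executable, "data2pkl.py"],
    [sys.executable, "pickledata.py"],
]


def run_steps(steps, cwd="."):
    """Run each command in order and stop at the first one that fails.

    Returns the failing command, or None if every step exited with status 0.
    """
    for cmd in steps:
        print("Running:", " ".join(cmd))
        try:
            result = subprocess.run(cmd, cwd=cwd)
        except OSError as exc:
            # The program (or the working directory) could not be found.
            print("Could not start step:", exc)
            return cmd
        if result.returncode != 0:
            # A crash such as a segmentation fault also gives a nonzero code.
            print("Step failed with exit code", result.returncode)
            return cmd
    return None
```

From the repository root, run_steps(STEPS, cwd="preprocess") would then report which step failed first, so that a crash in ./extract can be investigated before the Python scripts are run.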
