How could I reproduce the result for SQuAD 1.1? #10

alphaf52 · 2019-10-25T05:38:31Z

Hi,

Thanks for your good work. I would like to reproduce the SQuAD 1.1 result (Table 1 in the paper), but I am running into trouble.

First, I downloaded the pretrained model from "gs://denspi/v1-0/model" and then tried to evaluate it on dev-v1.1 using: "python run_piqa.py --do_predict --output_dir tmp --do_load --load_dir model --predict_file dev-v1.1.json --do_eval --gt_file dev-v1.1.json --metadata_dir bert"

The predicted answers seem to be random spans, resulting in metrics like: {"exact_match": 0.47303689687795647, "f1": 4.43806570152543}. An EM of 0.47% means something is badly wrong.
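To put that number in perspective, here is a small sketch that converts the reported EM back into a raw count. It assumes the score is on a 0-100 percentage scale (as in the official SQuAD evaluation script) and that the dev set is the standard SQuAD 1.1 one with 10,570 questions; the helper name `em_to_count` is only illustrative.

```python
# Assumption: standard SQuAD 1.1 dev set size.
DEV_QUESTIONS = 10570
# Value copied from the metrics reported above (percentage scale assumed).
reported_em = 0.47303689687795647


def em_to_count(em_percent, total):
    """Convert an exact-match percentage into a number of exactly matched questions."""
    return round(em_percent / 100.0 * total)


exact_hits = em_to_count(reported_em, DEV_QUESTIONS)
print(exact_hits, "of", DEV_QUESTIONS, "questions matched exactly")
```

Under those assumptions the score corresponds to only a few dozen exact matches across the whole dev set, which is consistent with the predictions being essentially random spans.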

I wonder whether I did it correctly.

Also, since I cannot get the pretrained model to work, if I want to train a model myself to reproduce the result, is it enough to run just the first step in the training section (i.e. "python run_piqa.py --train_batch_size 12 --do_train --freeze_word_emb --save_dir $SAVE1_DIR")?

Thanks, and I hope to get your advice.

mittalpatel · 2019-11-19T12:46:51Z

Hey @alphaf52, did you find a solution for this? We are still facing the same issue.

jhyuklee · 2019-11-19T12:50:59Z

Hi, I think the problem is that you forgot to pass --parallel. The model was trained with DataParallel, so you have to pass that option for the model to load properly. Please try this and let me know.
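For background on why this flag can matter: in PyTorch, wrapping a model in `torch.nn.DataParallel` stores the original model under a `module` attribute, so a checkpoint saved from the wrapped model has every parameter key prefixed with `module.`. If such a checkpoint is loaded into an unwrapped model without strict key checking, no keys match and the weights stay at their initial values, which would look like random predictions. The sketch below illustrates the key mismatch with plain dictionaries standing in for state dicts. The parameter names and the `strip_module_prefix` helper are hypothetical, and whether `run_piqa.py` loads checkpoints non-strictly is an assumption, not something confirmed in this thread.

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with the DataParallel key prefix removed."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def matching_keys(checkpoint, model_keys):
    """List the checkpoint keys that the model would actually pick up."""
    return sorted(set(checkpoint) & set(model_keys))


# Hypothetical parameter names, standing in for a checkpoint saved under DataParallel.
checkpoint = {"module.encoder.weight": [0.1, 0.2], "module.encoder.bias": [0.0]}
# Keys an unwrapped (non-DataParallel) model would expect.
model_keys = ["encoder.weight", "encoder.bias"]

print(matching_keys(checkpoint, model_keys))                       # -> []
print(matching_keys(strip_module_prefix(checkpoint), model_keys))  # -> ['encoder.bias', 'encoder.weight']
```

Passing --parallel avoids the mismatch from the other side: the model is wrapped before loading, so its keys carry the same prefix as the checkpoint.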

mittalpatel · 2019-11-20T05:20:03Z

Thanks a lot @jhyuklee, this seems to be working! We passed --parallel while creating the vectors and it is giving proper answers now. We are doing some further testing and will confirm soon.

Thanks once again for the hint. It really helped!

@alphaf52 You may try this solution.

yucoian · 2019-12-27T07:22:07Z

Our group can't reproduce the SQuAD 1.1 result (Table 1 in the paper) from scratch either! The README does not explain how to do it.
Please help ... @mittalpatel @jhyuklee @eunsol @mbforbes

mittalpatel · 2019-12-30T12:13:51Z

@yucoian At what point are you facing the problem? We were able to do it by following the steps given in the README.

yucoian · 2019-12-31T02:24:15Z

@mittalpatel Thank you very much! In "SQuAD v1.1 Experiments (Section 6.1)", we cannot reproduce the "DENSPI (dense only, with Coherency scalar)" model. Could you please tell us how to adapt your released code to reproduce its result? To be specific, after adding the coherency scalar to DENSPI, we cannot reproduce the reported result.

jhyuklee mentioned this issue Nov 19, 2019

Issues in setting up demo for SQuAD 1.1 data #8

Closed
