Add How to Reproduce the Result in README #2
base: master
Conversation
@gentaiscool: I think we need to change the load_model function and clean up the unused model checkpoints from our experiments. We can use the Indobenchmark models on HF as the replacement. Can you help clean up that part?
@atnanahidiw, later you can follow the checkpoint argument based on the finalized models that we have, so it will be cleaner and easier to follow.
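As a rough sketch of what resolving that checkpoint argument could look like (the function name `select_model` and the dict layout are assumptions for illustration; the checkpoint names are the indobenchmark models listed later in this thread):

```python
# Hypothetical sketch: resolving a model-checkpoint argument against a
# finalized model list, as train.yaml entries might look once loaded
# into Python dicts. Names and layout are assumptions, not the repo's API.
FINALIZED_MODELS = [
    {"model_checkpoint": "indobenchmark/indobert-base-p1", "num_layers": 12},
    {"model_checkpoint": "indobenchmark/indobert-base-p2", "num_layers": 12},
    {"model_checkpoint": "indobenchmark/indobert-large-p1", "num_layers": 24},
    {"model_checkpoint": "indobenchmark/indobert-large-p2", "num_layers": 24},
    {"model_checkpoint": "indobenchmark/indobert-lite-base-p1", "num_layers": 12},
    {"model_checkpoint": "indobenchmark/indobert-lite-base-p2", "num_layers": 12},
    {"model_checkpoint": "indobenchmark/indobert-lite-large-p1", "num_layers": 24},
    {"model_checkpoint": "indobenchmark/indobert-lite-large-p2", "num_layers": 24},
]

def select_model(checkpoint: str) -> dict:
    """Return the config entry for a checkpoint, or raise if it is not finalized."""
    for entry in FINALIZED_MODELS:
        if entry["model_checkpoint"] == checkpoint:
            return entry
    raise ValueError(f"unknown checkpoint: {checkpoint}")
```

With a lookup like this, any checkpoint outside the finalized list fails fast instead of silently loading a stale local model.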
I will clean the code and add the documentation this week, including the CONTRIBUTING page. Let's finish the testing, and then we merge this pull request.
I added the CONTRIBUTING page.
@atnanahidiw we just merged the master branch with a new PR #7. Would you mind checking whether this PR has any conflicts?
Force-pushed from c312f35 to 6a149a4
Hi @gentaiscool, I just resolved all of the conflicts, sorry for the late reply 🙏
And thanks also for the CONTRIBUTING page.
Yes, we should, probably in a new PR?
@atnanahidiw: We have some updates regarding the model list. Can you help remove the unnecessary models and add the IndoBERT models? I wrote a comment on each model that can be removed, along with the list of IndoBERT models.
# lower:
# num_layers:
# # albert-base-uncased-96000
This model can be removed
# num_layers:
# - 12
# # albert-base-uncased-96000-spm
This model can be removed
# - 12
# # albert-base-uncased-96000-spm
# - model_checkpoint: albert-base-uncased-96000-spm
This model can be removed
# num_layers:
# - 12
# # albert-base-uncased-112500-spm
This model can be removed
# - 12
# scratch
- model_checkpoint: scratch
This model can be removed
- 24
# babert-bpe-mlm-large-uncased-1100k
- model_checkpoint: babert-bpe-mlm-large-uncased-1100k
This model can be removed
- 24
# babert-bpe-mlm-uncased-128-dup10-5
- model_checkpoint: babert-bpe-mlm-uncased-128-dup10-5
This model can be removed
	python3 scripts/reproducer.py term-extraction-airy 15 $(BATCH_SIZE) $(HYPERPARAMETER)
	python3 scripts/reproducer.py pos-prosa 15 $(BATCH_SIZE) $(HYPERPARAMETER)

reproduce_all_1:
We can remove reproduce_all_*; it is already covered in reproduce and reproduce_all.
run_non_pretrained_no_special_token:
	python3 scripts/reproducer_non_pretrained.py $(DATASET) $(EARLY_STOP) $(BATCH_SIZE)

run_non_pretrained_no_special_token_all:
There are 8 tasks here; can you please help add the other 4, similar to the list in reproduce_all?
- model_checkpoint: babert-bpe-mlm-uncased-128-dup10-5
  lower: True
  num_layers:
  - 12
Can you help add the 8 IndoBERT models to this file? The model checkpoints and num_layers would be as follows:
- indobenchmark/indobert-base-p1 | 12 layers
- indobenchmark/indobert-base-p2 | 12 layers
- indobenchmark/indobert-large-p1 | 24 layers
- indobenchmark/indobert-large-p2 | 24 layers
- indobenchmark/indobert-lite-base-p1 | 12 layers
- indobenchmark/indobert-lite-base-p2 | 12 layers
- indobenchmark/indobert-lite-large-p1 | 24 layers
- indobenchmark/indobert-lite-large-p2 | 24 layers
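The list above could be sketched as train.yaml entries, following the existing model_checkpoint/lower/num_layers layout in this file (the `lower: True` value is an assumption, mirrored from the babert entry above):

```yaml
# IndoBERT models (indobenchmark on HF) -- sketch, not the merged config
- model_checkpoint: indobenchmark/indobert-base-p1
  lower: True
  num_layers:
  - 12
- model_checkpoint: indobenchmark/indobert-base-p2
  lower: True
  num_layers:
  - 12
- model_checkpoint: indobenchmark/indobert-large-p1
  lower: True
  num_layers:
  - 24
- model_checkpoint: indobenchmark/indobert-large-p2
  lower: True
  num_layers:
  - 24
- model_checkpoint: indobenchmark/indobert-lite-base-p1
  lower: True
  num_layers:
  - 12
- model_checkpoint: indobenchmark/indobert-lite-base-p2
  lower: True
  num_layers:
  - 12
- model_checkpoint: indobenchmark/indobert-lite-large-p1
  lower: True
  num_layers:
  - 24
- model_checkpoint: indobenchmark/indobert-lite-large-p2
  lower: True
  num_layers:
  - 24
```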
Yeah I agree, this one can be a new PR. Thank you 😀
Refactor:

- run_single_task.sh
- run_all_tasks.sh
- run_non_pretrained_no_special_token.sh

by:

- scripts/config/model/train.yaml to easily define the model used
- scripts/reproducer.py and Makefile for easy ops

Note: only tested this by printing the command string, sorry 🙏