Rerun Pre-Doc2Vec for first phase using the entire corpus #8
@ljgarcia and @rohitharavinder: I ran the code modified by Rohitha (according to this issue) and uploaded the new results at the end of this sheet. Note that so far I only have results for the first 6 hyperparameter sets.
Now I should explain how I got my first set of results from almost the same code:
def save_doc2vec_embeddings(model: Doc2Vec, pmids: List[int], output_path: str, param_iteration: int):
and also the following changes under `if __name__ == "__main__":`
Today I went through the new modified code and it turns out that the script
@ljgarcia and @rohitharavinder: Today at the meeting Leyla suggested a great way to check the two groups of code; let's say compact-code (yielding similar results as script And I did this test:
Please consider the following brief report, @ljgarcia, @rohitharavinder, @endri16-lab:
I reviewed my previously generated results and discovered that all the P@5 scores entered for the three-classes-precision case had been replaced with the P@5 score of the first hyperparameter set. I may have accidentally stretched the first cell of the first column over the rest of the column in the spreadsheet. The values in the spreadsheet have now been corrected!
However, I reproduced the results using the new code, which follows the style of Suhasini's training code. Specifically, it places the test phase of Suhasini's training code inside a for-loop that iterates over the different hyperparameter sets. Additionally, in the function `generate_embeddings` of `utilities.py`, I replaced `model.infer_vector(doc)` with `model.docvecs[str(pmid)]`, since `model.docvecs` provides access to the pre-learned document vectors from the training corpus, while `model.infer_vector` generates vector representations for new, unseen documents. You may find the code I used and the documentation here.

I uploaded the new results to the spreadsheet after the results produced by Leyla. As you may see, my reproduced results are pretty similar to my previous results.
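
To make the lookup-versus-inference distinction concrete, here is a minimal sketch of a helper that prefers the stored training vector and only falls back to inference for unseen documents. The function name `trained_or_inferred_vector` and its return shape are hypothetical and are not part of `utilities.py`; it also assumes documents were tagged with `str(pmid)` during training. As far as I know, gensim 4 exposes the trained vectors as `model.dv` while gensim 3.x uses `model.docvecs`, so the sketch checks both.

```python
from typing import Any, Optional, Sequence, Tuple


def trained_or_inferred_vector(
    model: Any, pmid: int, tokens: Optional[Sequence[str]] = None
) -> Tuple[Any, str]:
    """Return (vector, source) for a document.

    source is "lookup" when the vector was stored during training and
    "inferred" when it had to be computed with model.infer_vector.
    """
    # Assumption: gensim >= 4 names the trained document vectors `dv`,
    # gensim 3.x names them `docvecs`.
    doc_vectors = getattr(model, "dv", None)
    if doc_vectors is None:
        doc_vectors = model.docvecs

    # Assumption: training documents were tagged with str(pmid).
    tag = str(pmid)
    if tag in doc_vectors:
        return doc_vectors[tag], "lookup"

    if tokens is None:
        raise KeyError(f"PMID {pmid} is not in the training corpus and no tokens were given")

    # Unseen document: inference runs extra training steps on the tokens,
    # so the result generally differs from any stored vector.
    return model.infer_vector(list(tokens)), "inferred"
```

Logging the returned source for every PMID would show at a glance whether a given run evaluated stored vectors or inferred ones, which may help when comparing results produced by different versions of the code.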
I have no idea why my results are different from Leyla's and Endri's results. Is it possible that I used a different JSON file, and hence different hyperparameters, from the one you used? I used the same JSON file that I used for the Word2Vec models, with `sg` replaced by `dm`.
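
One way to rule out a mismatch in the JSON file would be for each of us to record a fingerprint of every hyperparameter set next to the results. The sketch below is only an illustration: the function names are hypothetical, and the example assumes each set is a flat JSON object of Doc2Vec keyword arguments.

```python
import hashlib
import json
from typing import Any, Dict, Tuple


def hyperparameter_fingerprint(params: Dict[str, Any]) -> str:
    """Short, key-order-independent hash of one hyperparameter set."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def diff_hyperparameters(
    mine: Dict[str, Any], theirs: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Map each differing key to (my value, their value); None means missing."""
    keys = sorted(set(mine) | set(theirs))
    return {k: (mine.get(k), theirs.get(k)) for k in keys if mine.get(k) != theirs.get(k)}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    mine = {"dm": 1, "vector_size": 200, "window": 5}
    theirs = {"dm": 0, "vector_size": 200, "window": 5}
    print(hyperparameter_fingerprint(mine), hyperparameter_fingerprint(theirs))
    print(diff_hyperparameters(mine, theirs))
```

If the fingerprints match for all sets, the JSON file can be excluded as the cause and the difference has to come from the code or the data.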