Read through the code and give questions #6
Training: For the training section in src/gtp/train_whole_genome.py, I have a couple of questions. First, in the step where you initialize the variables that accumulate the Root Mean Square Error (RMSE) for the current epoch, I see that you use Pearson's correlation coefficient to measure accuracy, and I'd like to ask how you deal with the possibility that outliers in the data influence this metric. Are we pre-filtering the genetic data before training the model, or do we keep all the genes in training to preserve the integrity of the data?

Evaluation: In the evaluation section, I noticed that three functions are used: get_shapley_sampling_attr, get_guided_gradcam_attr, and get_saliency_attr. Based on my understanding, get_shapley_sampling_attr uses the Shapley Value Sampling method to calculate the impact of each input feature on the model; it iterates through each batch in the DataLoader, calculates the Shapley value of each feature, and sums them to get the total impact. The get_guided_gradcam_attr function uses the Guided Grad-CAM method to calculate the model's sensitivity to each pixel in the input data; it iterates through each batch in the DataLoader, calculates the gradient information for each pixel, and sums it to get the total impact. The get_saliency_attr function uses the Saliency method to evaluate pixel impact as well. My question is: since get_guided_gradcam_attr and get_saliency_attr are both used to evaluate pixels, do they evaluate along different dimensions? Do their results show some kind of linear correlation?

Process:
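For context on the outlier question: Pearson's correlation is sensitive to extreme values, while rank-based alternatives such as Spearman's correlation are much less so. Below is a minimal sketch using simulated data (not anything from this repository) that makes the difference visible:

```python
# Sketch with simulated data (not the repository's code): a single outlier
# shifts Pearson's correlation far more than the rank-based Spearman correlation.
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)
y_true = rng.normal(size=200)                       # simulated phenotype values
y_pred = y_true + rng.normal(scale=0.3, size=200)   # simulated model predictions

print(f"clean data:  Pearson={pearsonr(y_true, y_pred)[0]:.3f}, "
      f"Spearman={spearmanr(y_true, y_pred)[0]:.3f}")

y_pred_outlier = y_pred.copy()
y_pred_outlier[0] = 50.0                            # inject one extreme prediction
print(f"one outlier: Pearson={pearsonr(y_true, y_pred_outlier)[0]:.3f}, "
      f"Spearman={spearmanr(y_true, y_pred_outlier)[0]:.3f}")
```

Whether pre-filtering is appropriate is a separate modelling decision; the sketch only shows why the question matters when Pearson's coefficient is the reported accuracy.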
[training]: A question I have about the training code concerns model saving. When does the model save on loss, and when would it save on Pearson correlation? Also, do you think it would be helpful to learn about the structure of the SoyBeanNet model and how it works?

[evaluation]: The whole idea of the evaluation code is something I am not experienced with, so I would like to know whether my interpretation of the code is correct. The "get_attribution_points" function uses occlusion to compute attributions, meaning that parts of the input data are masked to see how the model's output changes, which reveals the most important parts of the input. The "get_shapley_sampling_attr" function uses Shapley value sampling; I am unfamiliar with this and would like to know how the sampling works. The Guided Grad-CAM method is used in the functions "get_guided_gradcam_attr" and "get_guided_gradcam_attr_test". What is the difference between these two functions? The "get_saliency_attr" function computes attributions using the saliency method; again, I am not very familiar with how this method works. Also, what does the attribution graph look like?

[preprocessing]: I do not have too many questions, but am I right to assume that the code in run_pipeline.ipynb processes phenotype data while run_pipeline.ipynb processes genotype data?
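To make the attribution questions in both comments more concrete, here is a self-contained sketch using Captum on a toy 1D-CNN regressor. The ToyGenomeNet model, input shapes, window sizes, and sample counts are assumptions and not the repository's SoyBeanNet or evaluation code; the last lines show one way to check whether two attribution maps are linearly correlated.

```python
# Self-contained sketch (assumed toy model, not the repository's code): comparing
# Captum attribution methods on a small 1D-CNN genotype -> phenotype regressor.
import torch
import torch.nn as nn
import numpy as np
from captum.attr import Occlusion, ShapleyValueSampling, GuidedGradCam, Saliency


class ToyGenomeNet(nn.Module):
    """Stand-in for a SoyBeanNet-like model: one-hot genotype channels -> scalar phenotype."""

    def __init__(self, n_channels=4, n_loci=16):
        super().__init__()
        self.conv = nn.Conv1d(n_channels, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.fc = nn.Linear(8, 1)

    def forward(self, x):
        h = self.pool(self.relu(self.conv(x))).flatten(1)
        return self.fc(h).squeeze(-1)  # shape (N,), so no target index is needed


model = ToyGenomeNet().eval()
x = torch.randn(2, 4, 16)  # (batch, one-hot channels, loci) -- toy data

# Occlusion: slide a mask over the input and record how the output changes.
occ = Occlusion(model).attribute(x, sliding_window_shapes=(4, 3), strides=(4, 1))

# Shapley Value Sampling: average marginal contribution of each input element
# over random feature orderings (n_samples permutations of the features).
svs = ShapleyValueSampling(model).attribute(x, n_samples=5)

# Guided Grad-CAM: guided-backprop gradients weighted by Grad-CAM on a conv layer.
ggc = GuidedGradCam(model, model.conv).attribute(x)

# Saliency: plain gradient of the output with respect to each input element.
sal = Saliency(model).attribute(x, abs=False)

# Are two pixel/locus-level maps linearly related? Check a Pearson correlation
# between the flattened attribution vectors.
r = np.corrcoef(ggc.detach().flatten().numpy(),
                sal.detach().flatten().numpy())[0, 1]
print("Pearson r between Guided Grad-CAM and Saliency attributions:", round(r, 3))
```

In this setup, Occlusion and Shapley Value Sampling are perturbation-based (they re-run the model with features masked or reordered), while Saliency and Guided Grad-CAM are gradient-based; that is the main dimension along which the four methods differ, and correlating the flattened maps, as above, is one simple way to quantify how much they agree.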
Great questions!
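On the model-saving question above: a common pattern is to keep two checkpoints, one for the best validation loss and one for the best validation Pearson correlation, since the two do not always improve at the same epoch. A minimal sketch with assumed names and file paths (best dict, checkpoint_best_loss.pt, checkpoint_best_pearson.pt), not the repository's code:

```python
# Sketch of a common checkpointing pattern (assumed names and paths, not the
# repository's code): track the best validation loss and the best validation
# Pearson correlation separately, saving a checkpoint when either improves.
import torch
from scipy.stats import pearsonr


def maybe_save(model, epoch, val_loss, val_preds, val_targets, best):
    """Update `best` in place and save a checkpoint whenever a metric improves."""
    r, _ = pearsonr(val_preds, val_targets)
    if val_loss < best["loss"]:        # save on validation-loss improvement
        best["loss"] = val_loss
        torch.save(model.state_dict(), "checkpoint_best_loss.pt")
    if r > best["pearson"]:            # save on Pearson-correlation improvement
        best["pearson"] = r
        torch.save(model.state_dict(), "checkpoint_best_pearson.pt")
    print(f"epoch {epoch}: val_loss={val_loss:.4f}, pearson_r={r:.3f}")


# Usage inside a training loop (assumed variable names):
# best = {"loss": float("inf"), "pearson": -1.0}
# for epoch in range(num_epochs):
#     ... train, then validate to get val_loss, val_preds, val_targets ...
#     maybe_save(model, epoch, val_loss, val_preds, val_targets, best)
```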
Related issues:
- #7: src/gtp/train_whole_genome.py, read through and try to follow the logic
- #8: notebooks/run_pipeline.ipynb, read through and try to follow the logic