Run PLSR plot #101
@JacksonLChin This is the code I'm using to run the PLSR plots. Let me know if you find any glaring issues! I based this file mostly on your D8 file.
 covid_acc = score(
-    labels.loc[meta_data.loc[:, "patient_category"] == "COVID-19"],
-    probabilities.loc[meta_data.loc[:, "patient_category"] == "COVID-19"],
+    labels.loc[meta_data.loc[:, "patient_category"] == "COVID-19"].to_numpy().astype(int),
+    probabilities.loc[meta_data.loc[:, "patient_category"] == "COVID-19"].to_numpy(),
 )
To avoid typing headaches later, let's use `to_numpy()` for either all of the `probabilities` variables or none of them.
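A minimal sketch of what the consistent version could look like, assuming `labels` and `probabilities` are pandas Series that share an index with `meta_data`. The toy data and the `score` stand-in (plain accuracy at a 0.5 threshold) are made up for illustration and are not the project's actual function:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real frames come from the project's loaders.
meta_data = pd.DataFrame(
    {"patient_category": ["COVID-19", "Non-COVID", "COVID-19", "COVID-19"]}
)
labels = pd.Series([1, 0, 0, 1])
probabilities = pd.Series([0.9, 0.2, 0.6, 0.7])


def score(y_true: np.ndarray, y_prob: np.ndarray) -> float:
    """Stand-in for the project's `score`: accuracy at a 0.5 threshold."""
    return float(np.mean((y_prob >= 0.5).astype(int) == y_true))


# Build the boolean mask once, then convert both inputs the same way
# so every downstream variable is a plain NumPy array.
is_covid = (meta_data["patient_category"] == "COVID-19").to_numpy()
covid_labels = labels.loc[is_covid].to_numpy().astype(int)
covid_probs = probabilities.loc[is_covid].to_numpy()

covid_acc = score(covid_labels, covid_probs)
```

Building the mask once also keeps the COVID and non-COVID calls from drifting apart if the column name or category string changes later.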
@andrewram4287 Looks good! Just one minor change for type handling.
Other thing to note--not sure if this is the same plot you showed in lab meeting yesterday, but the
Okay, that makes sense. I can look into what the prediction is if we actually combine all the patient samples. Yes, the code creates the same plot I showed in lab meeting... So that means accuracy does decrease with more PLSR components...
Yup! We may want to swap to the 1-component model for each then. As Dr. Meyer suggested yesterday, I think we could move to a single scores plot with the COVID and non-COVID PLSR components as the x and y axes for interpretation.
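A rough sketch of how the coordinates for that single scores plot might be computed, assuming each model is fit on its own patient subset and every sample is then projected onto both components. The data, the `is_covid` split, and the helper names are hypothetical. The first PLS component for a single outcome is computed by hand with NumPy here (centered data, no scaling); the actual pipeline may well use a library such as scikit-learn's `PLSRegression`, whose default scaling would give different numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 40 patients x 5 factors, a binary outcome, a COVID flag.
y = np.tile([0.0, 1.0], 20)
X = rng.normal(size=(40, 5)) + np.outer(y, [1.0, 0.5, 0.0, 0.0, 0.0])
is_covid = np.arange(40) < 20


def first_pls_weights(X_fit, y_fit):
    """Return (column means, unit weight vector) of a one-component PLS model."""
    mean = X_fit.mean(axis=0)
    # The first PLS weight vector is proportional to X_centered^T y_centered.
    w = (X_fit - mean).T @ (y_fit - y_fit.mean())
    return mean, w / np.linalg.norm(w)


def project(X_new, mean, w):
    """Score samples on a fitted component."""
    return (X_new - mean) @ w


covid_mean, covid_w = first_pls_weights(X[is_covid], y[is_covid])
non_mean, non_w = first_pls_weights(X[~is_covid], y[~is_covid])

# Column 0: COVID-model score (x axis); column 1: non-COVID-model score (y axis).
scores = np.column_stack(
    [project(X, covid_mean, covid_w), project(X, non_mean, non_w)]
)
```

The two columns of `scores` can then be passed to a scatter plot, colored by patient category or outcome.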
I find it weird, though, that this was not the case when you ran the PLSR model before...
In the past, we've found the non-COVID model worked best at one component, and the COVID one worked slightly better at two. I think this is mostly in line with what we've seen previously--I'm guessing that the change to how we normalize factors prior to PLSR changed this a little bit.
@JacksonLChin |
@andrewram4287 Cool! Thanks for looking into this. It's interesting that 1-component seems better for accuracy while 2 is better for AUC-ROC--given that they're comparable, though, I think we should continue with the 1-component.
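For anyone wondering how accuracy and AUC-ROC can disagree: accuracy depends on where predictions fall relative to a threshold, while AUC-ROC depends only on how they are ranked. Below is a toy illustration with made-up probabilities (not results from this PR), assuming a 0.5 threshold. Both metrics are written out in NumPy to show the definitions; in practice scikit-learn's `accuracy_score` and `roc_auc_score` would be the usual choice:

```python
import numpy as np


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples classified correctly at the given threshold."""
    return float(np.mean((y_prob >= threshold).astype(int) == y_true))


def auc_roc(y_true, y_prob):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


y_true = np.array([1, 1, 1, 0, 0, 0])

# Hypothetical outputs from two models on the same six samples.
probs_a = np.array([0.90, 0.80, 0.40, 0.60, 0.30, 0.20])  # one ranking error
probs_b = np.array([0.45, 0.40, 0.35, 0.30, 0.25, 0.20])  # perfect ranking, all < 0.5

acc_a, auc_a = accuracy(y_true, probs_a), auc_roc(y_true, probs_a)
acc_b, auc_b = accuracy(y_true, probs_b), auc_roc(y_true, probs_b)
```

Here `probs_a` wins on accuracy and `probs_b` wins on AUC-ROC, so when the two metrics are close, picking the simpler model is a reasonable tie-breaker.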