-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve network_plot and correlate for 1- and 2-column data.frames #122
Improve network_plot and correlate for 1- and 2-column data.frames #122
Conversation
As a side effect, correlate() now works with numeric vectors and one-column dfs. Close tidymodels#120. Close tidymodels#121.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for this PR @antoine-sachet complete with test cases! 🚀 It's clear that not many people have tried out the vector interface and found that problem.
I'll be honest that I'm not entirely convinced that results for the 1 or 2 term case are helpful or meaningful 😜 but I read what you wrote about your use case and returning nicer results is certainly not unreasonable.
I have just a couple of changes here to make.
points <- if (ncol(rdf) == 1) { | ||
# 1 var: a single central point | ||
matrix(c(0, 0), ncol = 2, dimnames = list(colnames(rdf))) | ||
} else if (ncol(rdf) == 2) { | ||
# 2 vars: 2 opposing points | ||
matrix(c(0, -0.1, 0, 0.1), ncol = 2, dimnames = list(colnames(rdf))) | ||
} else { | ||
# More than 2 vars: multidimensional scaling to obtain x and y coordinates for points. | ||
suppressWarnings(stats::cmdscale(distance, k = 2)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of assigning to an if()
statement, can you use switch()
here?
https://adv-r.hadley.nz/control-flow.html#switch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea but not feasible here as there is no way to set a default value when switching on a numeric variable (which is actually really surprising!).
For example, the following does not work as it will set points to NULL when ncol(rdf) > 3
:
points <- switch(
ncol(rdf),
# 1 var: a single central point
matrix(c(0, 0), ncol = 2, dimnames = list(colnames(rdf))),
# 2 vars: 2 opposing points
matrix(c(0, -0.1, 0, 0.1), ncol = 2, dimnames = list(colnames(rdf))),
# More than 2 vars: multidimensional scaling to obtain x and y coordinates for points.
suppressWarnings(stats::cmdscale(distance, k = 2))
)
You can set a default when switching on a character variable but I am not sure that's what you meant. I find having to cast ncol(rdf)
to character to make it work a bit weird.
points <- switch(
as.character(ncol(rdf)),
# 1 var: a single central point
"1" = matrix(c(0, 0), ncol = 2, dimnames = list(colnames(rdf))),
# 2 vars: 2 opposing points
"2" = matrix(c(0, -0.1, 0, 0.1), ncol = 2, dimnames = list(colnames(rdf))),
# More than 2 vars: multidimensional scaling to obtain x and y coordinates for points.
suppressWarnings(stats::cmdscale(distance, k = 2))
)
Please let me know if I've missed something but I think we can stick with the if/else otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right, I forgot that there is no default value allowed for numeric expressions. 👍
Also closes #115 |
I don't think it closes #115. I agree with this comment that a true fix would involve either a) changing The underlying problem with #115 is that it wraps More detailed examples of #115 still being an issue after this PR below this line: Two examples that I believe still give incorrect behavior:
The correlations shown are not accurate for the names of the rows/columns that they are in. The correct row names are seen here:
|
Hi, I've made the edits requested except replacing the if/else by a switch (because it's not as good a fit as it seems, see details in the discussion above). Regarding the usefulness of this PR... Yes you're right it is mostly pointless 😅 ! This is actually why I felt obliged to explain why on earth I was concerned about the 1 and 2 term cases. Now that it's implemented, I do feel it is a more satisfying behaviour to plot something that makes sense rather than fail. @thisisdaryn Agreed, I was not trying to address the use of a data.frame as |
Thanks for the reminders on what happens when |
Thank you so much for your work on this @antoine-sachet! 🚀 |
This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue. |
This PR closes #118 #119 #120 and #121. This is what now works:
Created on 2020-10-28 by the reprex package (v0.3.0)
All of those were previously failing with an error.
For context, I worked on this because I have a shiny app that lets users select columns out of a data.frame and that plots their correlations in a network plot. I wanted the plots to work even when only 1 or 2 columns were selected.
I thought this would be a straightforward change but it turns out fixing a bug uncovered a couple of others, so I ended working later than I thought :)
If I may, I just wanted to give some feedback as an outside contributor new to the code: I think you ought to refactor the code a bit and move the functions in more natural places. I had to
grep
my way through the code because the file names were not matching function names at all. In particular,cor_df.R
contains manyfoo.cor_df
S3 methods that could be in their ownfoo.R
file along with their generic function. For example, the functionfashion
has S3 methods spread across 3 generically named files when afashion.R
file would be more standard. To be clear, I think the package is great and this is meant as my 2ct of constructive criticism to foster collaboration :)I love that
network_plot
function! Keep up the great work 👍