Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entropy #27

Closed
wants to merge 11 commits into from
Closed

Entropy #27

wants to merge 11 commits into from

Conversation

andrewGhazi
Copy link
Collaborator

Function to run outlier detection by an input cluster variable clusterVar. Currently this requires unique column names to ensure that the result is returned with the columns in the same order.

@lgeistlinger
Copy link
Contributor

Thanks @andrewGhazi. A couple of points

  • I am confused that this pull request is named "Entropy" as this seems to rather be functionality for "Outlier detection"
  • The latest commit doesn't pass R CMD check and fails due to missing dependencies. As we are preparing for submission to Bioconductor, going forward I'd like pull requests to always pass R CMD check.
  • I was wondering whether it is necessary to have a separate function calculateOutlierScoreByCluster? My preference would be for the general calculateOutlierScore to have a group.var argument that is NULL by default and will then compute outlier scores across the dataset. Or if group.var is a categorical variable in the colData, where cluster membership is just one (particularly useful) example, the outlier scores would then be computed by group.
  • I would like us to separate calculations and plotting in general and for the calculateOutlierScore function in particular. I think the plotting via scater::PlotReducedDim here can just go into the vignette, but if you feel strongly about having it as a function, then let's put it in a function plotOutlierScore that consumes the output of calculateOutlierScore.
  • Please no piping (|>) in package code (my personal preference)

@andrewGhazi
Copy link
Collaborator Author

andrewGhazi commented Apr 17, 2024

  • Oops I think it's called entropy by default because I was committing from the same branch that I implemented the entropy functionality from.
  • Not sure why R CMD check is failing here, none of those dependencies are new and some aren't even used in this function...
  • group.var argument makes sense, I'll switch it around.
  • plotOutlierScore seems fine, I'll remove the plotting behavior and add it there. I had forgotten how thin the plot wrapper was. Yeah, just demonstrating with scater::plotReducedDim is better, no need for a special function.
  • Okay, I'll remove the pipes.

@lgeistlinger
Copy link
Contributor

lgeistlinger commented Apr 17, 2024

Not sure why R CMD check is failing here, none of those dependencies are new and some aren't even used in this function...

It seems your entropy branch doesn't have the latest updates from Nitesh and me? That is because the DESCRIPTION file looks like the old one on the entropy branch which is causing these NAMESPACE vs DESCRIPTION file issues. I think you need to pull these changes first, before pushing new stuff on top of it.

@lgeistlinger
Copy link
Contributor

I guess this can be closed @andrewGhazi?

@andrewGhazi andrewGhazi deleted the entropy branch April 19, 2024 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants