Support configurable summary() #624
Comments
Skimr is pretty column-oriented and you're asking for something row-oriented. That said I think that
I think we can go a bit further. The most useful place for this would be in the summary, i.e. Lines 12 to 14 in 22dfec2
I think the implementation depends on how far we should push this.
I was thinking the same thing, i.e. should we make it customizable, because this might be the first of many requests to add things. I do think that for our user scenario of "someone gives you a data set and you're trying to understand it" it might be very useful. If there are a lot of duplicates, it might be smart to store it in a way that reflects that.
@michaelquinn32 if we are fixing issues on summary, we could think about this one.
This is a little more than the current updates to the
What I was thinking is that eventually, when we have a more flexible summary, that would really allow a user to do this.
Could put this on the roadmap too. Right now, the issue is that we generate all of the summary components as skimr attributes, which we then extract in the summary function. For a 3.0, we could extend
So we could require a summary function to produce
Which should give a value that is pretty similar to what we currently generate. You could even think of a summary interface that is similar to skimr, basically using `sfl`s.
The last part is set as a function argument, since counting column types is something we currently do on the
What do you think?
I just reread this and yes, I really think that an sfl for summary would be the way to go.
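For illustration only, here is a rough sketch of what a list-of-functions summary could look like in base R. This is not skimr's API: `summary_funs` and `summarize_data()` are hypothetical names, and the real design might well build on `sfl()` instead. The only idea it shows is that each summary component is a named function that takes the whole data frame.

```r
# Hypothetical sketch, not skimr's actual interface.
# Each summary component is a named function that receives the whole data frame.
summary_funs <- list(
  n_rows = nrow,
  n_cols = ncol,
  n_duplicate_rows = function(data) sum(duplicated(data))
)

# Hypothetical helper: apply every summary function to the data and
# return a named list of results, one entry per component.
summarize_data <- function(data, funs = summary_funs) {
  lapply(funs, function(f) f(data))
}

example <- data.frame(x = c(1, 1, 2), y = c("a", "a", "b"))
str(summarize_data(example))
```

A user could then add or drop components by editing the list, which is roughly the kind of flexibility discussed above.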
Because I am fairly incompetent, I seem to keep introducing duplicate rows into my data frames. I was wondering if, in the initial data summary bit of the output of skim(), "duplicate rows" might be a useful additional metric.
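In the meantime, a minimal base R sketch of the metric being asked for, assuming "duplicate rows" means rows that exactly repeat an earlier row across all columns. The helper name `count_duplicate_rows()` is made up for this example and is not part of skimr.

```r
# Count rows that are exact repeats of an earlier row (all columns compared).
# count_duplicate_rows() is a hypothetical helper, not a skimr function.
count_duplicate_rows <- function(data) {
  sum(duplicated(data))
}

df <- data.frame(
  id = c(1, 2, 2, 3, 3, 3),
  value = c("a", "b", "b", "c", "c", "c")
)

count_duplicate_rows(df)
#> [1] 3
```

Note that `duplicated()` flags only the second and later copies, so the first occurrence of each repeated row is not counted.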