Blank plot with grid lines shown #32

Open
MilesMcBain opened this issue Oct 3, 2016 · 10 comments

@MilesMcBain

MilesMcBain commented Oct 3, 2016

I've encountered a couple of moderately sized dataframes (40 000 x 130) where vis_miss and vis_dat fail. A plain white background with grey grid lines is all that is plotted:

[Screenshot: visdat_bug]

Dataset for this plot: https://drive.google.com/file/d/0B7688WPR38x2N2tiSk9HYWhVRTg/view?usp=sharing

@njtierney
Collaborator

Ooo. Weird. I'll check that out soon.

@njtierney
Collaborator

Hmm, looks like it is working now? Does this reproduce on your machine?

# system.time(loan <- read.csv("~/Downloads/CUSTOMER_LOAN.csv"))

loan <- readr::read_csv("~/Downloads/CUSTOMER_LOAN.csv")
#> Parsed with column specification:
#> cols(
#>   .default = col_character(),
#>   id = col_integer(),
#>   member_id = col_integer(),
#>   loan_amnt = col_integer(),
#>   funded_amnt = col_integer(),
#>   funded_amnt_inv = col_double(),
#>   int_rate = col_double(),
#>   installment = col_double(),
#>   annual_inc = col_double(),
#>   dti = col_double()
#> )
#> See spec(...) for full column specifications.

dim(loan)
#> [1] 42456    25

visdat::vis_miss(loan)

visdat::vis_dat(loan)

@njtierney
Collaborator

I'm assuming this is fine now, let me know if there are any problems @MilesMcBain !

@Phu2

Phu2 commented Mar 19, 2018

I also have a blank plot with this dataframe: https://drive.google.com/open?id=1IfMHz2ElCklgXjBEZVoeUUmOo6106YMr
[Screenshot: Bildschirmfoto vom 2018-03-20 00-36-46]

Just tried with another file: https://drive.google.com/open?id=1L11znzqmmXhkB7OsERI3lgDhMBXuw0vV and it works fine!

@Phu2

Phu2 commented Mar 20, 2018

I updated the visdat package from 0.1.0 to 0.2.2.9200 a few hours ago and selected only 2 variables (marc_153_a_ss, marc_084_a_ss) from the file linked above. visdat manages to plot up to ~32765 rows, but it fails with a blank plot when I try tibbles with more rows. @njtierney can you help?

@njtierney
Collaborator

njtierney commented Mar 20, 2018

Hi there @Phu2

For the moment it seems that this bug is related to the processor speed and memory of the computer, so it is hard to pin down the problem in general and fix it.

Future approaches with plotting for visdat (see #65 and #59) will hopefully help with this, but this probably won't be in a release for at least the next 6 weeks.

In the interim, I would recommend downsampling your data using something like

library(visdat)
library(dplyr)
data %>%
  sample_n(size = 1000) %>%
  vis_dat()

to take a random sample of 1000 rows and plot it,

or look at the first 1000 rows like so:

data %>%
  slice(1:1000) %>%
  vis_dat()
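
A variation on the sampling workaround, as a rough sketch only: sample rows at random but keep them in their original order, so that any missingness pattern tied to row position is still visible in the plot. The helper name `sample_rows_in_order` and the toy data frame are hypothetical and not part of visdat or dplyr; the code uses base R only.

```r
# Hypothetical helper (not part of visdat): draw a random sample of rows
# but return them in their original order.
sample_rows_in_order <- function(df, size = 1000, seed = 1) {
  # Nothing to do if the data already fits within the requested size
  if (nrow(df) <= size) return(df)
  set.seed(seed)  # note: this resets the global RNG state
  keep <- sort(sample.int(nrow(df), size))
  df[keep, , drop = FALSE]
}

# Toy data standing in for `data` (an assumption for illustration)
toy <- data.frame(
  x = c(NA, seq_len(49999)),
  y = rep(c("a", NA), 25000)
)

small <- sample_rows_in_order(toy, size = 1000)

# The reduced data frame can then be passed on as before, e.g.
# visdat::vis_dat(small)
```

Fixing the seed makes the sample repeatable between runs, which helps when comparing plots across machines.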

@njtierney njtierney reopened this Mar 20, 2018
@Phu2

Phu2 commented Mar 20, 2018

Thanks! I have a lot of datasets with more than 50000 rows for which I would like to plot the missing values, so downsampling is not the right approach for me. I'll give visdat a try on another machine.

@njtierney
Collaborator

Sorry I can't be more help!

If you are interested in exploring missing data, you can also look at naniar - which has more dedicated functions for exploring missing data.
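
Where downsampling is not acceptable, one possible approach (a sketch under assumptions, not something visdat or naniar is confirmed to do) is to summarise missingness over all rows in consecutive blocks, so that every row contributes but far fewer cells need to be drawn. The helper `missing_by_block` and the toy data are hypothetical; only base R is used.

```r
# Hypothetical helper (not part of visdat or naniar): proportion of
# missing values per column within consecutive blocks of rows.
missing_by_block <- function(df, n_blocks = 100) {
  n <- nrow(df)
  n_blocks <- min(n_blocks, n)
  # Assign each row to one of n_blocks consecutive blocks
  block <- ceiling(seq_len(n) * n_blocks / n)
  # is.na() on a data frame returns a logical matrix (rows x columns)
  na_counts <- rowsum(is.na(df) * 1, group = block)
  # Divide each block's counts by the number of rows in that block
  na_counts / as.vector(table(block))
}

# Toy data standing in for a large data frame (an assumption):
# the first 500 values of x are missing, y is complete.
toy <- data.frame(
  x = c(rep(NA, 500), seq_len(49500)),
  y = seq_len(50000)
)

prop_missing <- missing_by_block(toy, n_blocks = 100)
dim(prop_missing)
```

The result is a small numeric matrix (blocks by columns) with values between 0 and 1, which could then be drawn as a heatmap with whatever plotting tool is at hand.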

@Phu2

Phu2 commented Mar 20, 2018

No problem, I will have a look at naniar. Thank you for your work!

@njtierney njtierney added this to the V0.2.0 milestone Jun 5, 2018
@njtierney njtierney removed the V0.5.0 label Jun 6, 2018
@njtierney
Collaborator

I'm not sure of a solution for this, so I am going to move it to another milestone and then close it after that milestone is achieved (around August).


3 participants