augment epidist_db()'s print output to orient new users #302

Closed
papsti opened this issue May 16, 2024 · 4 comments

papsti commented May 16, 2024

when getting oriented to the epidist database using epidist_db(), it isn't immediately obvious which diseases, pathogens, and epi distributions are available for selection (i.e., for input into the first three arguments of epidist_db()).

currently, executing epidist_db() gives a helpful print message summarising the contents of the database like:

Returning 122 results that match the criteria (99 are parameterised).  
Use subset to filter by entry variables or single_epidist to return a single entry.
To retrieve the citation for each use the 'get_citation' function
List of <epidist> objects
  Number of entries in library: 122
  Number of studies in library: 47
  Number of diseases: 23
  Number of delay distributions: 112
  Number of offspring distributions: 10

there are two ways i can see augmenting this statement to help new users get started.

  1. additionally print labels available for input into epidist_db() for the disease, pathogen and epi_dist fields. depending on the number of unique labels available, this could add too much visual clutter every time the object is printed.

  2. add a line to the print statement above pointing to the URL for a vignette online exploring this database (suggested in Database vignette #296 but for a different purpose)


PaulC91 commented May 16, 2024

This was also my first thought when using the package for the first time and I can see we are not alone (#277).

I've added the list of diseases to the print method in PaulC91@1ed7046 but perhaps this makes it too long and a function to print available disease, pathogen and epi_dist options would be a better UX @joshwlambert ?

db <- epiparameter::epidist_db()
#> Returning 122 results that match the criteria (99 are parameterised). 
#> Use subset to filter by entry variables or single_epidist to return a single entry. 
#> To retrieve the citation for each use the 'get_citation' function
db
#> List of <epidist> objects
#>   Number of entries in library: 122
#>   Number of studies in library: 47
#>   Number of diseases: 23
#>   Number of delay distributions: 112
#>   Number of offspring distributions: 10
#> 
#>   Diseases available:
#>   Adenovirus
#>   Chikungunya
#>   COVID-19
#>   Dengue
#>   Ebola Virus Disease
#>   Hantavirus Pulmonary Syndrome
#>   Human Coronavirus
#>   Influenza
#>   Japanese Encephalitis
#>   Marburg Virus Disease
#>   Measles
#>   MERS
#>   Mpox
#>   Parainfluenza
#>   Pneumonic Plague
#>   Rhinovirus
#>   Rift Valley Fever
#>   RSV
#>   SARS
#>   Smallpox
#>   West Nile Fever
#>   Yellow Fever
#>   Zika Virus Disease

Created on 2024-05-16 with reprex v2.0.2

@jfunction

Pretty much the same thought here - I wondered how I would know to look for COVID-19 instead of COVID19 or some other name.

Maybe something with tab completion would be nice, maybe with attributes on the epidist_db object which comes back?

db <- epidist_db()
db$diseases$<tab>

and/or something like Paul recommended:

db <- epidist_db()
dis <- pull_diseases(db)
pat <- pull_pathogen(db)
# Then print these or use tab completion to repeat the call, subsetting with:
dbCov <- epidist_db(disease=dis$<tab>)

this would allow you to filter on the options available within the subset you pass in, which I think is rather nice.
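A minimal sketch of the helper idea above, in base R only. Everything here is an assumption for illustration: `pull_field()`, `pull_diseases()` and `pull_pathogen()` are hypothetical and not part of epiparameter, and the mock entries are plain lists standing in for `<epidist>` objects. Returning a named list is one way to make `$` completion offer the available labels.

```r
# Mock entries standing in for <epidist> objects (assumption: each entry
# exposes `disease`, `pathogen` and `epi_dist` as single strings).
entries <- list(
  list(disease = "COVID-19", pathogen = "SARS-CoV-2",
       epi_dist = "incubation period"),
  list(disease = "Ebola Virus Disease", pathogen = "Ebola Virus",
       epi_dist = "serial interval"),
  list(disease = "COVID-19", pathogen = "SARS-CoV-2",
       epi_dist = "serial interval")
)

# Hypothetical helper: collect the unique values of one field and return
# them as a named list, so an IDE can complete the names after `$`.
pull_field <- function(db, field) {
  values <- sort(unique(vapply(db, function(x) x[[field]], character(1))))
  # make.names() turns labels such as "COVID-19" into valid R names
  stats::setNames(as.list(values), make.names(values, unique = TRUE))
}

pull_diseases <- function(db) pull_field(db, "disease")
pull_pathogen <- function(db) pull_field(db, "pathogen")

dis <- pull_diseases(entries)
names(dis)    # syntactic names offered by completion
dis$COVID.19  # the original label, "COVID-19"

# Use the pulled label to subset the mock database
covid <- Filter(function(x) x$disease == dis$COVID.19, entries)
length(covid)
```

Because the helper works on whatever list it is given, calling it on an already filtered subset would list only the labels present in that subset.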

@joshwlambert
Member

Thank you for all the suggestions! I've taken them on board and updated the printing for the output of epidist_db().

Some specific points that have been incorporated from your comments:

  • The diseases and epidemiological parameters that are returned from epidist_db() are now listed in the header of the printout. In the example below the diseases and epi parameters are printed on a single line, but in an IDE (e.g. RStudio) these will be shown over multiple lines (depending on how many diseases are returned).

  • A link has been added to the online database in the footer of the printout. This should help if users want to navigate the database online before coming to R to handle the data.

Some additional details: the header states how many elements are returned by epidist_db(), and the footer shows how many more are not shown in the preview. The footer also gives a hint to use print(n = ...) to show more or fewer elements in the preview, and to use the parameter_tbl() function to see the results in a tabular format.

library(epiparameter)
ed = epidist_db()
#> Returning 122 results that match the criteria (99 are parameterised). 
#> Use subset to filter by entry variables or single_epidist to return a single entry. 
#> To retrieve the citation for each use the 'get_citation' function

ed
#> # List of <epidist> objects 
#> # A list:  122 elements
#> 
#> Number of diseases: 23
#> ❯ Adenovirus ❯ Chikungunya ❯ COVID-19 ❯ Dengue ❯ Ebola Virus Disease ❯ Hantavirus Pulmonary Syndrome ❯ Human Coronavirus ❯ Influenza ❯ Japanese Encephalitis ❯ Marburg Virus Disease ❯ Measles ❯ MERS ❯ Mpox ❯ Parainfluenza ❯ Pneumonic Plague ❯ Rhinovirus ❯ Rift Valley Fever ❯ RSV ❯ SARS ❯ Smallpox ❯ West Nile Fever ❯ Yellow Fever ❯ Zika Virus Disease
#> 
#> 
#> Number of epi distributions: 12
#> ❯ generation time ❯ hospitalisation to death ❯ hospitalisation to discharge ❯ incubation period ❯ notification to death ❯ notification to discharge ❯ offspring distribution ❯ onset to death ❯ onset to discharge ❯ onset to hospitalisation ❯ onset to ventilation ❯ serial interval
#> 
#> 
#> [[1]]
#> Disease: Adenovirus
#> Pathogen: Adenovirus
#> Epi Distribution: incubation period
#> Study: Lessler J, Reich N, Brookmeyer R, Perl T, Nelson K, Cummings D (2009).
#> "Incubation periods of acute respiratory viral infections: a systematic
#> review." _The Lancet Infectious Diseases_.
#> doi:10.1016/S1473-3099(09)70069-6
#> <https://doi.org/10.1016/S1473-3099%2809%2970069-6>.
#> Distribution: lnorm
#> Parameters:
#>   meanlog: 1.247
#>   sdlog: 0.975
#> 
#> [[2]]
#> Disease: Human Coronavirus
#> Pathogen: Human_Cov
#> Epi Distribution: incubation period
#> Study: Lessler J, Reich N, Brookmeyer R, Perl T, Nelson K, Cummings D (2009).
#> "Incubation periods of acute respiratory viral infections: a systematic
#> review." _The Lancet Infectious Diseases_.
#> doi:10.1016/S1473-3099(09)70069-7
#> <https://doi.org/10.1016/S1473-3099%2809%2970069-7>.
#> Distribution: lnorm
#> Parameters:
#>   meanlog: 0.742
#>   sdlog: 0.918
#> 
#> [[3]]
#> Disease: SARS
#> Pathogen: SARS-Cov-1
#> Epi Distribution: incubation period
#> Study: Lessler J, Reich N, Brookmeyer R, Perl T, Nelson K, Cummings D (2009).
#> "Incubation periods of acute respiratory viral infections: a systematic
#> review." _The Lancet Infectious Diseases_.
#> doi:10.1016/S1473-3099(09)70069-8
#> <https://doi.org/10.1016/S1473-3099%2809%2970069-8>.
#> Distribution: lnorm
#> Parameters:
#>   meanlog: 0.660
#>   sdlog: 1.205
#> 
#> # ℹ 119 more elements
#> # ℹ Use `print(n = ...)` to see more elements.
#> # ℹ Use `parameter_tbl()` to see a summary table of the parameters.
#> Explore database online at
#> <https://epiverse-trace.github.io/epiparameter/dev/articles/database.html>

Created on 2024-06-03 with reprex v2.1.0
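For readers curious how a header, truncated preview and footer like the one above can be produced, here is a rough base R sketch. It is not the code from the package: the class name `mock_epidist_list`, the constructor `new_mock_db()` and the entry fields are assumptions made for illustration, and the layout is simplified.

```r
# Hypothetical S3 class wrapping a plain list of mock entries.
new_mock_db <- function(entries) {
  structure(entries, class = "mock_epidist_list")
}

# Print method: header with counts and labels, first `n` entries,
# then a footer saying how many entries were left out.
print.mock_epidist_list <- function(x, n = 3, ...) {
  diseases <- sort(unique(vapply(x, function(e) e$disease, character(1))))
  cat("# A list: ", length(x), " elements\n", sep = "")
  cat("Number of diseases: ", length(diseases), "\n", sep = "")
  cat(paste(">", diseases, collapse = " "), "\n\n", sep = "")

  shown <- min(n, length(x))
  for (i in seq_len(shown)) {
    cat("[[", i, "]]\n", sep = "")
    cat("Disease: ", x[[i]]$disease, "\n", sep = "")
    cat("Epi Distribution: ", x[[i]]$epi_dist, "\n\n", sep = "")
  }

  remaining <- length(x) - shown
  if (remaining > 0) {
    cat("# ", remaining, " more elements\n", sep = "")
    cat("# Use `print(n = ...)` to see more elements.\n")
  }
  invisible(x)
}

db <- new_mock_db(list(
  list(disease = "Adenovirus", epi_dist = "incubation period"),
  list(disease = "SARS", epi_dist = "incubation period"),
  list(disease = "Mpox", epi_dist = "serial interval"),
  list(disease = "Measles", epi_dist = "incubation period")
))

print(db, n = 2)  # previews two entries and reports the rest in the footer
```

Keeping the full label list in the header and capping only the per-entry preview means the available options stay visible however small `n` is.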

This new functionality is implemented in PR #326. Please feel free to provide feedback; I will leave the PR open until the end of the week.

I will tackle separately the suggestions that have not been addressed by this PR.

@joshwlambert
Member

Moving this issue to Done in the v0.2.0 project as I don't plan to make any more changes with respect to these discussions before the next release.

I'm leaving the issue open because, although the bulk of the issue is addressed by PR #326, there are some good points raised about enabling autocomplete to list diseases, or pulling the diseases and pathogens, which can be implemented in an upcoming version.

@joshwlambert joshwlambert moved this from In Progress to Done in {epiparameter} v0.2.0 Jun 27, 2024
@joshwlambert joshwlambert closed this as completed by moving to Done in {epiparameter} v0.2.0 Jun 27, 2024