dbreport

dbreport is a R-package that let’s you create automated reports of database tables or R-objects (data.table, data.frame, tibble), by summarising table meta data and counting distinct entries for all (selected) columns. Therefore you can keep track of your ever growing database tables in your project and avoid constantly running queries such as the one below, checking which distinct entries your columns have (at least, this was my motivation for this package 😄).

SELECT var1, count(var1) AS n
FROM tab1
GROUP BY var1
ORDER BY n DESC

For now PostgreSQL, MySQL and SQLite databases are supported. Adding methods for additional database libraries are welcomed!

Installation

remotes::install_github('andschar/dbreport') # not yet on CRAN

Quickstart

We load dbreport and connect to our database (con).

require(dbreport)
con = DBI::dbConnect(drv = RPostgres::Postgres(),
                     dbname = 'myname',
                     host = 'host',
                     port = 'port',
                     user = 'user',
                     password = 'mysecretpassword')

Now we can create a report of any table therein. Let’s just do this for the information_schema.tables table which is created by default for any PostgreSQL or MySQL database. In doing so, we choose to provide an output directory (output_dir =), to provide a super meaningful title (report_title =) and to add some introductory text (report_text =) to the report. As our output format (output_format =) we choose 'html_document'. We could also take other rmarkdown formats, such as 'pdf_document', 'github_document', 'html_vignette' (for a very lightweight output) etc.

dbreport(con = con,
         schema = 'information_schema',
         tbl = 'tables',
         output_dir = '/home/user/project1',
         report_title = 'Why not create an information_schem.tables report?!',
         report_text = 'Introductory text.', # you can alsp provide a file
         output_format = 'html_document')

Done! The report together with a .csv file summarising the individual columns is created in the output directory and we can examine it.

More options

Specific columns

We can also tweak our report a little more and choose only specific columns (column =) to be included in the report and create treemap plot instead of the lollipop plot (plot_type =). For this we access the publicly available Read-Only MySQL RNA-sequence database and run dbreport() only on the ncbi_id and species column of the taxonomy table. In doing so we also output the report to a .pdf (output_format =).

con = DBI::dbConnect(drv = RMySQL::MySQL(),
                     host = 'mysql-rfam-public.ebi.ac.uk',
                     user = 'rfamro',
                     port = 4497,
                     dbname = 'Rfam')

dbreport(con,
         schema = 'Rfam',
         tbl = 'taxonomy',
         column = c('ncbi_id', 'species'),
         plot_type = 'treemap',
         output_dir = file.path(tempdir(), 'test_mysql'),
         output_format = 'pdf_document')

By default dbreport() closes the connection to a database and you would have to reopen it for every new report you want to create. You can turn off this behavior by setting exit = FALSE. This can be especially useful if you want to create reports for multiple tables automatically, for instance by using mapply(). After having created the reports however, you should run DBI::dbDisconnect(con) to close the open database connections.

mapply(dbreport::dbreport,
       tbl = c('table1', 'table2'), # vector of table names
       MoreArgs = list(con = con,
                       schema = 'schema1',
                       output_dir = tempdir(),
                       output_format = 'html_vignette',
                       exit = FALSE)
DBI::dbDisconnect(con)

Column entries

We could as well check how often the entry versiocolor occurs (case is ignored) in the Species column of the iris table by specifying the entry = parameter.

dbreport(con = iris,
         entry = 'versicolor',
         output_dir = file.path(tempdir(), 'report_iris'),
         output_format = 'html_vignette',
         report_title = 'Iris report')

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
R		R
docs		docs
inst/rmd		inst/rmd
man		man
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
NAMESPACE		NAMESPACE
README.Rmd		README.Rmd
README.md		README.md
_pkgdown.yml		_pkgdown.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

dbreport

Installation

Quickstart

More options

Specific columns

Column entries

About

Releases

Packages

Languages

License

andschar/dbreport

Folders and files

Latest commit

History

Repository files navigation

dbreport

Installation

Quickstart

More options

Specific columns

Column entries

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages