Companion Data for the book The Data Preparation Journey: Finding Your Way With R
The book is published with CRC Press as part of The Data Science Series.
The web edition of the current version of the book is available here
To download and install the {dpjr} package, you will need the {remotes} package:
install.packages("remotes")
remotes::install_github("monkmanmh/dpjr")
Once you have {dpjr} installed, load it using the library() function:
library(dpjr)
There are two groups of datasets in the package: pre-rendered tables and raw files.
The convenience function dpjr_data() generates the path to the raw data file, independent of the specific location on the user’s computer.
For example, to read the CSV file “mpg.csv”:
df_mpg <- read.csv(dpjr::dpjr_data("mpg.csv"))
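Because dpjr_data() returns a path rather than the data itself, it can help to inspect that path before reading the file. The following is a minimal sketch, assuming {dpjr} is installed and that "mpg.csv" is one of the raw files, as in the example above. The object names (dpjr_available, mpg_path, df_mpg) are illustrative and not part of the package.

```r
# Only run the example when {dpjr} is installed
dpjr_available <- requireNamespace("dpjr", quietly = TRUE)

if (dpjr_available) {
  # Build the path to the raw file; the result depends on where
  # the package is installed on this computer
  mpg_path <- dpjr::dpjr_data("mpg.csv")
  print(mpg_path)

  # Confirm the file is there before reading it
  if (file.exists(mpg_path)) {
    df_mpg <- read.csv(mpg_path)
    # Show the column names and types of the imported data
    str(df_mpg)
  }
}
```

Storing the path in its own object also makes it easy to pass the same file to a different reader function.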
An alternative to this approach is to access files using the system.file() function.
Example:
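# Path to the package's "extdata" folder
system.file("extdata", package = "dpjr")
# Path to a single raw file in that folder
system.file("extdata", "mpg.csv", package = "dpjr")
# Read the file from that path
read.csv(system.file("extdata", "mpg.csv", package = "dpjr"))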
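The folder path returned by system.file() can also be combined with base R's list.files() to see which raw files are installed. This is a sketch under the assumption that the raw files sit directly in the package's "extdata" folder, as the calls above suggest. list_raw_files() is a hypothetical helper written for this illustration and is not part of {dpjr}.

```r
# Hypothetical helper: list the files in an installed package's
# "extdata" folder. Not part of {dpjr}.
list_raw_files <- function(pkg) {
  extdata_dir <- system.file("extdata", package = pkg)

  # system.file() returns "" when the package or folder is not found
  if (!nzchar(extdata_dir)) {
    return(character(0))
  }

  list.files(extdata_dir)
}

# File names of the raw data files shipped with {dpjr}
# (an empty character vector if the package is not installed)
list_raw_files("dpjr")
```

Any file name returned this way can be passed to dpjr_data() or to system.file("extdata", ..., package = "dpjr") to build the full path.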
A list of the raw data files can be found in the vignette “Data list”.
The data files in this package that were created by Martin Monkman are licensed under the
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Canada License.
Data files sourced from elsewhere are licensed under a variety of open licenses; see the “Data licenses” vignette for details.
Updated 2024-05-25