---
output: github_document
---
```{r setup, include = FALSE}
auth_success <- tryCatch(
  googledrive:::drive_auth_docs(),
  googledrive_auth_internal_error = function(e) e
)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  error = TRUE,
  purl = googledrive::drive_has_token(),
  eval = googledrive::drive_has_token()
)
```
```{r eval = !googledrive::drive_has_token(), echo = FALSE, comment = NA}
googledrive:::drive_bullets(c(
  "Code chunks will not be evaluated, because:",
  strsplit(auth_success$message, split = "\n")[[1]]
))
googledrive::drive_deauth()
```
# googledrive
<!-- badges: start -->
[![CRAN status](https://www.r-pkg.org/badges/version/googledrive)](https://CRAN.R-project.org/package=googledrive)
[![R-CMD-check](https://github.com/tidyverse/googledrive/workflows/R-CMD-check/badge.svg)](https://github.com/tidyverse/googledrive/actions)
[![Codecov test coverage](https://codecov.io/gh/tidyverse/googledrive/branch/main/graph/badge.svg)](https://codecov.io/gh/tidyverse/googledrive?branch=main)
<!-- badges: end -->
## Overview
googledrive allows you to interact with files on Google Drive from R.
## Installation
Install from CRAN:
```{r, eval = FALSE}
install.packages("googledrive")
```
## Usage
### Load googledrive
```{r}
library("googledrive")
```
```{r drive-setup, eval = FALSE, include = FALSE}
# This chunk contains code to set up all the files necessary for this document
# to make sense, starting from a blank slate. Likewise, it contains code to
# delete those same files.
#
# It is meant to be run occasionally, interactively, by a human maintainer.
# So far, I can't think of anywhere better to put this.
#
# The visible, executable chunks may also create files, in which case the
# necessary clean up code shall also be visible and executable.
CLEAN <- SETUP <- FALSE
examples <- drive_examples_remote()
builtin <- c(
  system.file("DESCRIPTION"),
  R.home("doc/html/Rlogo.svg"),
  R.home("doc/BioC_mirrors.csv"),
  R.home("doc/THANKS")
)
if (isTRUE(SETUP)) {
  purrr::map2(
    examples$id, examples$name,
    ~ drive_cp(as_id(.x), name = .y)
  )
  purrr::map(builtin, ~ drive_upload(.x))
  drive_mkdir("abc")
  abc_def <- drive_mkdir("abc/def")
  drive_upload(
    system.file("NEWS.md", package = "googledrive"),
    path = abc_def,
    name = "googledrive-NEWS.md"
  )
  THANKS <- drive_get("THANKS")
  r_logo <- drive_get("r_logo.jpg")
  x <- list(THANKS, r_logo)
  purrr::map(x, ~ drive_update(.x, starred = TRUE))
  purrr::map(x, ~ drive_share(.x, role = "reader", type = "anyone"))
}
if (isTRUE(CLEAN)) {
  drive_rm(examples)
  drive_rm(basename(builtin))
  drive_rm("abc/")
  drive_rm(drive_find("index-chicken"))
}
```
### Package conventions
* Most functions begin with the prefix `drive_`. Auto-completion is your friend.
* The goal is to allow Drive access that feels similar to Unix file system utilities, e.g., `find`, `ls`, `mv`, `cp`, `mkdir`, and `rm`.
* The metadata for one or more Drive files is held in a `dribble`, a "Drive tibble". This is a data frame with one row per file. A dribble is returned (and accepted) by almost every function in googledrive. Design goals:
  - Give humans what they want: the file name
  - Track what the API wants: the file ID
  - Hold on to all the other metadata sent back by the API
* googledrive is "pipe-friendly" and, in fact, re-exports `%>%`, but does not require its use.
### Quick demo
Here's how to list up to `n_max` of the files you see in [My Drive](https://drive.google.com). You can expect to be sent to your browser here, to authenticate yourself and authorize the googledrive package to deal on your behalf with Google Drive.
```{r}
drive_find(n_max = 30)
```
You can narrow the query by specifying a `pattern` to match file names against, or by specifying a file type: the `type` argument understands MIME types, file extensions, and a few human-friendly keywords.
```{r eval = FALSE}
drive_find(pattern = "chicken")
drive_find(type = "spreadsheet") ## Google Sheets!
drive_find(type = "csv") ## MIME type = "text/csv"
drive_find(type = "application/pdf") ## MIME type = "application/pdf"
```
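To see how a `type` value is interpreted before sending a query, the lookup can be tried locally. This is a minimal sketch that assumes `drive_mime_type()`, exported by googledrive, resolves extensions and keywords the same way the `type` argument does; it needs no Drive access.

```{r, eval = FALSE}
library(googledrive)

# A file extension resolves to a conventional MIME type.
drive_mime_type("csv")

# A human-friendly keyword resolves to a native Google MIME type.
drive_mime_type("spreadsheet")

# A full MIME type is recognized and returned.
drive_mime_type("application/pdf")
```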
Alternatively, you can refine the search using the `q` query parameter. Accepted search clauses can be found in the [Google Drive API documentation](https://developers.google.com/drive/v3/web/search-parameters). For example, to see all files that you've starred and that are readable by "anyone with a link", do this:
```{r}
(files <- drive_find(q = c("starred = true", "visibility = 'anyoneWithLink'")))
```
You generally want to store the result of a googledrive call, as we do with `files` above. `files` is a dribble with info on several files and can be used as the input for downstream calls. It can also be manipulated as a regular data frame at any point.
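As a rough illustration of treating a dribble as a data frame, the sketch below filters rows by file name with base R subsetting. `keep_by_name()` is a hypothetical helper written for this example, not part of googledrive; it works on any data frame that has a `name` column.

```{r, eval = FALSE}
# Hypothetical helper: keep the rows of a dribble (or any data frame with a
# `name` column) whose file name matches a regular expression.
keep_by_name <- function(d, pattern) {
  d[grepl(pattern, d$name), ]
}

# Usage with the `files` dribble from above (needs Drive access):
# logos <- keep_by_name(files, "logo")
# drive_reveal(logos, "permissions")
```

Because only rows are dropped, the file ids and remaining metadata travel along with the result, so it can be handed to downstream googledrive calls.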
#### Identify files
`drive_find()` searches by file properties, but you can also identify files by name (path, really) or by Drive file id using `drive_get()`.
```{r}
(x <- drive_get("~/abc/def/googledrive-NEWS.md"))
```
`as_id()` can be used to convert various inputs into a marked vector of file ids. It works on file ids (for obvious reasons!), various forms of Drive URLs, and `dribble`s.
```{r}
x$id
# let's retrieve same file by id (also a great way to force-refresh metadata)
drive_get(x$id)
drive_get(as_id(x))
```
In general, googledrive functions that operate on files allow you to specify the file(s) by name/path, file id, or in a `dribble`. If it's ambiguous, use `as_id()` to mark a character vector as holding Drive file ids as opposed to file paths. This function can also extract file ids from various URLs.
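The following sketch shows `as_id()` extracting a file id from a URL. The URL and the id inside it are made up for illustration, so the commented `drive_get()` calls would not find a real file.

```{r, eval = FALSE}
library(googledrive)

# Hypothetical Drive URL; the file id is the segment after "/d/".
url <- "https://docs.google.com/spreadsheets/d/1aBcD_hypotheticalFileId-123/edit"
id <- as_id(url)
id

# A bare string is treated as a name/path unless it is marked as an id:
# drive_get("1aBcD_hypotheticalFileId-123")         # looked up by name/path
# drive_get(as_id("1aBcD_hypotheticalFileId-123"))  # looked up by id
```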
#### Upload files
We can upload any file type.
```{r}
(chicken <- drive_upload(
  drive_example_local("chicken.csv"),
  "index-chicken.csv"
))
```
Notice that the file was uploaded as `text/csv`. Since this was a `.csv` document and we didn't specify the type, googledrive guessed the MIME type. We can overrule this by using the `type` parameter to upload it as a Google Sheet instead. Let's delete this file first.
```{r}
# example of using a dribble as input
drive_rm(chicken)
# example of piping a local file path into drive_upload()
chicken_sheet <- drive_example_local("chicken.csv") %>%
  drive_upload(
    name = "index-chicken-sheet",
    type = "spreadsheet"
  )
```
Much better!
#### Share files
To allow other people to access your file, you need to change the sharing permissions. You can check the sharing status by running `drive_reveal(..., "permissions")`, which adds a logical column `shared` and parks more detailed metadata in a `permissions_resource` variable.
```{r}
chicken_sheet %>%
  drive_reveal("permissions")
```
Here's how to grant anyone with the link permission to view this data set.
```{r}
(chicken_sheet <- chicken_sheet %>%
  drive_share(role = "reader", type = "anyone"))
```
This comes up so often, there's even a convenience wrapper, `drive_share_anyone()`.
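A minimal sketch of the wrapper in use, assuming it takes the file as its first argument just as `drive_share()` does. `share_with_anyone()` is a hypothetical helper defined only for this example.

```{r, eval = FALSE}
library(googledrive)

# Hypothetical helper: make a file readable by anyone with the link, then
# return the dribble with the sharing columns added.
share_with_anyone <- function(file) {
  file <- drive_share_anyone(file)
  drive_reveal(file, "permissions")
}

# Usage (needs Drive access):
# chicken_sheet <- share_with_anyone(chicken_sheet)
```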
#### Publish files
Versions of Google Documents, Sheets, and Presentations can be published online. You can check your publication status by running `drive_reveal(..., "published")`, which adds a logical column `published` and parks more detailed metadata in a `revision_resource` variable.
```{r}
chicken_sheet %>%
  drive_reveal("published")
```
By default, `drive_publish()` will publish your most recent version.
```{r}
(chicken_sheet <- drive_publish(chicken_sheet))
```
#### Download files
##### Google files
We can download files from Google Drive. Native Google file types (such as Google Documents, Google Sheets, Google Slides, etc.) need to be exported to some conventional file type. There are reasonable defaults or you can specify this explicitly via `type` or implicitly via the file extension in `path`. For example, if I would like to download the "index-chicken-sheet" Google Sheet as a `.csv`, I could run the following.
```{r}
drive_download("index-chicken-sheet", type = "csv")
```
Alternatively, I could specify type via the `path` parameter.
```{r}
drive_download(
  "index-chicken-sheet",
  path = "index-chicken-sheet.csv",
  overwrite = TRUE
)
```
Notice that in the example above I specified `overwrite = TRUE` in order to overwrite the local csv file saved previously.
Finally, you could just allow export to the default type. In the case of Google Sheets, this is an Excel workbook:
```{r}
drive_download("index-chicken-sheet")
```
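The export options above can be combined with a temporary file when the goal is to get the data into R rather than to keep a local copy. This is a sketch under that assumption; `read_sheet_via_csv()` is a hypothetical helper, not a googledrive function.

```{r, eval = FALSE}
library(googledrive)

# Hypothetical helper: export a Google Sheet as csv into a temporary file and
# read it into a data frame. The ".csv" extension in `path` selects the type.
read_sheet_via_csv <- function(file) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp), add = TRUE)  # remove the temporary file afterwards
  drive_download(file, path = tmp)
  read.csv(tmp)
}

# Usage (needs Drive access):
# chicken_df <- read_sheet_via_csv("index-chicken-sheet")
```

Writing to a fresh `tempfile()` path avoids the need for `overwrite = TRUE` and leaves nothing to clean up in the working directory.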
##### All other files
Downloading files that are *not* native Google files is even simpler: it requires no conversion or type information.
```{r}
# download it and prove we got it
drive_download("chicken.txt")
readLines("chicken.txt") %>% head()
```
#### Clean up
```{r}
file.remove(c(
  "index-chicken-sheet.csv", "index-chicken-sheet.xlsx", "chicken.txt"
))
drive_find("index-chicken") %>% drive_rm()
```
## Privacy
[Privacy policy](https://www.tidyverse.org/google_privacy_policy)