-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: replace {crul} with {httr2} #157
base: main
Are you sure you want to change the base?
Conversation
also need to use https://httr2.r-lib.org/reference/req_throttle.html |
Yeah, watch out for https://github.com/ropensci/opencage/blob/main/R/zzz.R and Lines 122 to 126 in 9b57eef
Though I am sure you've seen that. |
R/oc_config.R
Outdated
@@ -91,7 +91,6 @@ | |||
#' @export | |||
oc_config <- | |||
function(key = Sys.getenv("OPENCAGE_KEY"), | |||
rate_sec = getOption("oc_rate_sec", default = 1L), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe needs actual deprecation 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather leave it in here (so all possible opencage session level options can be set with oc_config()
) and just set options("oc_rate_sec" = rate_sec)
, like no_record
and show_key
.
Maybe check whether rate_sec
is numeric beforehand.
), | ||
endpoint = "json" | ||
) | ||
), | ||
"HttpResponse" | ||
"httr2_response" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these tests are actually failing with "HTTP 400 Bad Request". I'm not sure why as the same oc_forward()
query works. I see the first test in this file, that doesn't fail, uses a key for always getting 200. I'm a bit lost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All queries/tests in test-oc_get.R
return HTTP 400 errors, also on main, because the query parameter placename
ought to be q
here 🙈. Well, the first one passes, because it uses the key_200
which always returns a HTTP 200 response, no matter what nonsense you throw at the API.
I don't know whether this has always been the case or whether there has been a related change in the OpenCage API. Regardless, it never surfaced, because all queries only test whether a HttpResponse
object is returned by oc_get()
, and the status wasn't checked until later. But the status check is now as part of oc_get()
so, the error now surfaces here already.
I am not sure about the usefulness of the Namibia and vector countrycode tests here. I think that these should be tested with oc_forward()
if they aren't already?
Here, I am more interested in whether memoisation and rate limiting works as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted in another comment, I wouldn't test memoisation nor rate limiting as we outsource those.
What we should test is as you noted, whether the configuration function can configure the rate. BUT since in req_throttle()
I call getOption("oc_rate_sec", default = 1L)
, it only means retrieving the option before and after calling the configuration function.
Would you agree?
@@ -5,8 +5,8 @@ test_that("oc_clear_cache clears cache", { | |||
# until a memoise >v.1.1 is released, we need to run oc_get_memoise() twice to | |||
# have it really cache results | |||
# https://github.com/ropensci/opencage/pull/87#issuecomment-573573183 | |||
replicate(2, oc_get_memoise("https://httpbin.org/get")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure these edits were the right ones. However now one cannot "get" stuff from an URL that's not the OpenCage API, with the refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely sure either whether this actually tests what we want to.
Could we build oc_get()
in such a way that it accepts a httr2 request object and just performs the request (i.e. just wrapper around httr2::req_perform()
)? Then we could just build a simple query request here (I'd rather not hit the OpenCage API for this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am now wondering why we are testing memoization at all, given it's "outsourced" to {memoise}?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not convinced we need to test oc_clear_cache()
as it only wraps memoise that we trust?
By the way here's another caching method: https://deploy-preview-537--ropensci.netlify.app/blog/2023/04/21/ropensci-news-digest-april-2023/#caching-the-results-of-functions-of-your-r-package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, quite some comments. 🙈 I hope they are at least comprehensible.
R/oc_config.R
Outdated
@@ -91,7 +91,6 @@ | |||
#' @export | |||
oc_config <- | |||
function(key = Sys.getenv("OPENCAGE_KEY"), | |||
rate_sec = getOption("oc_rate_sec", default = 1L), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather leave it in here (so all possible opencage session level options can be set with oc_config()
) and just set options("oc_rate_sec" = rate_sec)
, like no_record
and show_key
.
Maybe check whether rate_sec
is numeric beforehand.
), | ||
endpoint = "json" | ||
) | ||
), | ||
"HttpResponse" | ||
"httr2_response" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All queries/tests in test-oc_get.R
return HTTP 400 errors, also on main, because the query parameter placename
ought to be q
here 🙈. Well, the first one passes, because it uses the key_200
which always returns a HTTP 200 response, no matter what nonsense you throw at the API.
I don't know whether this has always been the case or whether there has been a related change in the OpenCage API. Regardless, it never surfaced, because all queries only test whether a HttpResponse
object is returned by oc_get()
, and the status wasn't checked until later. But the status check is now as part of oc_get()
so, the error now surfaces here already.
I am not sure about the usefulness of the Namibia and vector countrycode tests here. I think that these should be tested with oc_forward()
if they aren't already?
Here, I am more interested in whether memoisation and rate limiting works as expected.
@@ -48,25 +48,6 @@ test_that("oc_config throws error with faulty OpenCage key", { | |||
) | |||
}) | |||
|
|||
# test rate_sec argument -------------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a little test whether oc_config()
can update the oc_rate_limit
option?
DESCRIPTION
Outdated
dplyr (>= 0.7.4), | ||
httr2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is v0.1.1 sufficient or do we use a feature from a more recent version?
https://httr2.r-lib.org/news/index.html
(Didn't realise that httr2 is so recent 😮)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a dependency on >= 0.2.0 because of r-lib/httr2#101
@@ -130,7 +130,7 @@ oc_process <- | |||
} | |||
|
|||
# build url | |||
oc_url <- oc_build_url( | |||
oc_url_parts <- oc_build_url( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current naming confused me a little, I am afraid.
I think there are two to three steps until the request is made
1a. format the query parameters, i.e. turn booleans into integers, delete unused parameters, concatenate bounds coordinates, etc. This is now done in oc_build_url()
except this does not yet build the url, so maybe oc_format_query_parameters()
and the resulting list is query_parameters
? I would move the endpoint
as a separate input to the next step/function.
1b. build the httr2 request object (or rather just its url
part). Basically what build_query_with_req()
does now (maybe just name it build_request()
?). The inputs are the query_parameters
and endpoint
, the output is query_req
.
2. Make the request with oc_get()
. Pretty much like it is now, but with request
or similar as an argument name. (Note to self: This has to be a separate step, so we can return the url_only
before).
🤔 À propos naming: Maybe I overdid it with the oc_
prefix for internal functions... 😉
build_query_with_req <- function(oc_url_parts) { | ||
initial_req <- httr2::request("https://api.opencagedata.com") | ||
|
||
req_with_url <- if (!is.null(oc_url_parts[["path"]])) { | ||
httr2::req_url_path_append(initial_req, oc_url_parts[["path"]]) | ||
} else { | ||
initial_req | ||
} | ||
|
||
query_req <- if (!is.null(oc_url_parts[["query"]])) { | ||
httr2::req_url_query(req_with_url, !!!oc_url_parts[["query"]]) | ||
} else { | ||
req_with_url | ||
} | ||
|
||
query_req |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we only use this within oc_process()
and tie it to the OpenCage API domain, I think the following should be just as safe. I.e. I cannot think of a scenario where we would make queries to api.opencagedata.com without an endpoint
or query_parameters
.
build_query_with_req <- function(oc_url_parts) { | |
initial_req <- httr2::request("https://api.opencagedata.com") | |
req_with_url <- if (!is.null(oc_url_parts[["path"]])) { | |
httr2::req_url_path_append(initial_req, oc_url_parts[["path"]]) | |
} else { | |
initial_req | |
} | |
query_req <- if (!is.null(oc_url_parts[["query"]])) { | |
httr2::req_url_query(req_with_url, !!!oc_url_parts[["query"]]) | |
} else { | |
req_with_url | |
} | |
query_req | |
build_request <- function(endpoint, query_parameters) { | |
httr2::request("https://api.opencagedata.com") %>% | |
httr2::req_url_path_append(endpoint) %>% | |
httr2::req_url_query(req_with_url, !!!query_parameters) |
But if we want to keep it generic (e.g. for the oc_get_memoise()
tests), we could instead do
build_query_with_req <- function(oc_url_parts) { | |
initial_req <- httr2::request("https://api.opencagedata.com") | |
req_with_url <- if (!is.null(oc_url_parts[["path"]])) { | |
httr2::req_url_path_append(initial_req, oc_url_parts[["path"]]) | |
} else { | |
initial_req | |
} | |
query_req <- if (!is.null(oc_url_parts[["query"]])) { | |
httr2::req_url_query(req_with_url, !!!oc_url_parts[["query"]]) | |
} else { | |
req_with_url | |
} | |
query_req | |
build_request <- function(base_url = "https://api.opencagedata.com", endpoint, query_parameters) { | |
query_req <- httr2::request(base_url) | |
query_req <- if (!is.null(endpoint)) { | |
httr2::req_url_path_append(query_req , endpoint) | |
} | |
query_req <- if (!is.null(query_parameters)) { | |
httr2::req_url_query(query_req, !!!query_parameters) | |
} | |
query_req |
jsonlite (>= 1.5), | ||
lifecycle, | ||
magrittr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might as well take the dependency on magrittr v2.0 then?
I.e. does it make sense to more generally bump all versions roughly to the one before the most recent necessary dependency (i.e. httr2) came out? Or is that a mistake? I wouldn't know what more ancient versions really mean practically?
magrittr, | |
magrittr (>= 2.0.0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a specific reason for depending on that magrittr version?
@@ -5,8 +5,8 @@ test_that("oc_clear_cache clears cache", { | |||
# until a memoise >v.1.1 is released, we need to run oc_get_memoise() twice to | |||
# have it really cache results | |||
# https://github.com/ropensci/opencage/pull/87#issuecomment-573573183 | |||
replicate(2, oc_get_memoise("https://httpbin.org/get")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely sure either whether this actually tests what we want to.
Could we build oc_get()
in such a way that it accepts a httr2 request object and just performs the request (i.e. just wrapper around httr2::req_perform()
)? Then we could just build a simple query request here (I'd rather not hit the OpenCage API for this).
@@ -5,8 +5,8 @@ test_that("oc_clear_cache clears cache", { | |||
# until a memoise >v.1.1 is released, we need to run oc_get_memoise() twice to | |||
# have it really cache results | |||
# https://github.com/ropensci/opencage/pull/87#issuecomment-573573183 | |||
replicate(2, oc_get_memoise("https://httpbin.org/get")) | |||
expect_true(memoise::has_cache(oc_get_memoise)("https://httpbin.org/get")) | |||
replicate(2, oc_get_memoise()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replicate(2, oc_get_memoise()) | |
oc_get_memoise() |
Co-authored-by: Daniel Possenriede <possenriede@gmail.com>
Oh: https://httr2.r-lib.org/reference/req_cache.html (not sure how to clear that cache) |
Fix #156