Error reading large datasets #138

Closed
jsmendozap opened this issue Feb 1, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@jsmendozap

When I use arc_read to read a full dataset, it sometimes fails (I don't know whether this is related to the dataset size).

For example, this code works:

url <- 'https://ags.esri.co/arcgis/rest/services/DatosAbiertos/PNN_2005/MapServer/0/'
data <- arc_read(url)

However, when I try with another dataset like this:

url <- 'https://ags.esri.co/arcgis/rest/services/DatosAbiertos/ELECCIONES_PRESIDENCIALES_2014_MPIO/MapServer/0'
data <- arc_read(url)

I get this error:

Error in do.call(rbind.data.frame, fts_raw[["attributes"]]) : 
  second argument must be a list
@JosiahParry
Collaborator

I think the issue is here:

arcgislayers/R/arc-select.R

Lines 221 to 225 in cc48616

has_error <- vapply(all_resps, function(x) inherits(x, "error"), logical(1))
# if (any(has_error)) {
# TODO: determine how to handle errors
# }

The responses from the feature services never come back with an HTTP 400 status code, so they never get an error class and we're not actually catching them. We need to handle errors here.
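As a rough illustration of the kind of check this implies, the sketch below looks inside each parsed response body for an `error` element rather than relying on the response class. It assumes the bodies have already been parsed from JSON into lists and that a failed query carries an `error` element with a `message`; `body_has_error()` and the mock bodies are hypothetical and are not part of arcgislayers.

```r
# Hypothetical helper: TRUE when a parsed response body carries an `error` element.
# Assumes the body was already parsed from JSON into a named list.
body_has_error <- function(body) {
  is.list(body) && !is.null(body[["error"]])
}

# Mock parsed bodies standing in for real responses (illustrative only)
all_bodies <- list(
  list(features = list(list(attributes = list(OBJECTID = 1L)))),
  list(error = list(code = 500L, message = "Error performing query operation"))
)

# Flag failed requests by inspecting the body, not the response class
has_error <- vapply(all_bodies, body_has_error, logical(1))

if (any(has_error)) {
  msgs <- vapply(
    all_bodies[has_error],
    function(b) b[["error"]][["message"]],
    character(1)
  )
  warning("Some requests failed: ", paste(unique(msgs), collapse = "; "), call. = FALSE)
}
```

With a check like this, failed pages could be dropped or retried before the attributes are row-bound, instead of surfacing later as an unrelated rbind error.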

@JosiahParry JosiahParry added need guidance Need input or guidance from 3rd party and removed needs review labels Feb 1, 2024
@JosiahParry
Collaborator

There are two issues here:

  1. We're not handling errors appropriately.
  2. The feature service returns a 500 error. This is outside of our control, but we should understand why it happens and figure out how to resolve it. Can we increase the timeout?

@JosiahParry
Collaborator

One solution might be to use smaller batch sizes.

@JosiahParry
Collaborator

> One solution might be to use smaller batch sizes

Confirmed that this is the approach taken by the Python API. It uses a backoff-style strategy, retrying with smaller and smaller batch sizes.
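A minimal sketch of what such a backoff could look like in R is shown below. It is not the Python API's implementation nor the one in arcgislayers: `fetch_with_backoff()`, the `fetch_page(offset, size)` callback, and the mock service are all hypothetical, and halving the page size is just one possible reduction rule.

```r
# Hypothetical helper: retry a page request with progressively smaller sizes.
# `fetch_page(offset, size)` is an assumed callback that returns a data frame
# on success and signals an error on failure.
fetch_with_backoff <- function(fetch_page, offset, page_size, min_size = 1L) {
  size <- page_size
  repeat {
    res <- tryCatch(fetch_page(offset, size), error = function(e) e)
    if (!inherits(res, "error")) {
      return(list(data = res, size = size))
    }
    if (size <= min_size) {
      stop("Request failed even at the minimum page size: ", conditionMessage(res))
    }
    # Halve the batch size, but never go below the minimum
    size <- max(min_size, size %/% 2L)
  }
}

# Mock service that fails whenever more than 250 rows are requested at once
mock_fetch <- function(offset, size) {
  if (size > 250L) stop("Error performing query operation")
  data.frame(OBJECTID = seq(offset + 1L, length.out = size))
}

page <- fetch_with_backoff(mock_fetch, offset = 0L, page_size = 1000L)
page$size
```

Returning the size that finally succeeded lets the caller reuse it for the remaining pages instead of rediscovering the limit on every request.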

@JosiahParry JosiahParry added enhancement New feature or request and removed need guidance Need input or guidance from 3rd party labels Feb 9, 2024
@JosiahParry
Collaborator

This isn't a fix to the problem, but it is a better solution than what we have presently. Part of the fix is to provide an informative warning.

url <- 'https://ags.esri.co/arcgis/rest/services/DatosAbiertos/ELECCIONES_PRESIDENCIALES_2014_MPIO/MapServer/0'
data <- arcgislayers::arc_read(url)
#> Warning in parse_esri_json(httr2::resp_body_string(x)): Status code: 500
#> Error: Error performing query operation
data
#> data frame with 0 columns and 0 rows

Created on 2024-02-12 with reprex v2.0.2

@JosiahParry
Collaborator

Currently have a working solution to this in #146.

library(arcgislayers)
url <- 'https://ags.esri.co/arcgis/rest/services/DatosAbiertos/ELECCIONES_PRESIDENCIALES_2014_MPIO/MapServer/0'
x <- arc_open(url)
res <- arc_select(x, n_max = 25, page_size = 5)

dplyr::glimpse(res)
#> Rows: 25
#> Columns: 13
#> $ OBJECTID         <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
#> $ DPTO_CCDGO       <chr> "94", "91", "95", "86", "95", "95", "97", "97", "97",…
#> $ MPIO_CCDGO       <chr> "887", "540", "001", "749", "015", "200", "161", "511…
#> $ DEPTO            <chr> "GUAINIA", "AMAZONAS", "GUAVIARE", "PUTUMAYO", "GUAVI…
#> $ MUNICIPIO        <chr> "PANA PANA", "PUERTO NARIÑO", "SAN JOSE DEL GUAVIARE"…
#> $ COD_MUNICI       <chr> "887", "540", "001", "749", "015", "200", "161", "511…
#> $ PRECIND_ID       <chr> "94887", "91540", "95001", "86749", "95015", "95200",…
#> $ WINPARTY         <chr> "198", "198", "644", "198", "198", "198", "198", "198…
#> $ TOTBALLOTS       <dbl> 117, 865, 5171, 1257, 1016, 622, 250, 63, 126, 23, 20…
#> $ Absten           <dbl> 65.90909, 37.61810, 35.13398, 43.83534, 32.91294, 21.…
#> $ Shape.STArea..   <dbl> 10177802914, 1573331975, 16495071944, 111122901, 1420…
#> $ Shape.STLength.. <dbl> 792909.82, 210619.33, 1466875.35, 51867.66, 769461.56…
#> $ geometry         <MULTIPOLYGON [m]> MULTIPOLYGON (((-7655890 26..., MULTIPOL…
plot(res$geometry)
