
Publish Hazard Local Effects Input Layers as OpenData #31

Open · 2 of 4 tasks
p-a-s-c-a-l opened this issue Mar 30, 2020 · 14 comments

@p-a-s-c-a-l commented Mar 30, 2020

Publish Hazard Local Effects Input Layers as OpenData on Zenodo and update the CKAN Catalogue datasets.

@p-a-s-c-a-l

At the moment, only 20 of the 544 cities have been calculated. Can you estimate how long it will take to calculate the remaining cities? @negroscuro

For the DMP, perhaps we should upload what we have now to Zenodo and make an update when all cities are available. I've downloaded the built_open_spaces and medium_urban_fabric Shapefiles from here, but the zip files are only a few KB in size; is that realistic?

BTW, can you please check whether the list and description of the local effects datasets in CKAN are still complete?

@negroscuro commented May 6, 2020

There is a discussion regarding data generation for cities at:
clarity-h2020/data-package#59

I am afraid I cannot give a feasible / reasonable estimate for that.
Before the new ESM20 data was added to the vegetation layer it would have taken around five weeks; now I really do not know, but in theory much more than that. I warned everyone about this more than a month ago. Note that the current 20 cities have not been updated with the new vegetation (ESM20) data.

@p-a-s-c-a-l commented May 6, 2020

OK. And what about my second question:

I've downloaded the built_open_spaces and medium_urban_fabric Shapefiles from here, but the zip files are only a few KB in size; is that realistic?

@negroscuro commented May 6, 2020

Sure, sorry.

Regarding the data size: no, it is not realistic. I just downloaded the agricultural areas for the whole of Europe:
http://services.clarity-h2020.eu:8080/geoserver/clarity/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=clarity%3Aagricultural_areas&outputFormat=SHAPE-ZIP
The compressed file is 235 MB. Note that maxFeatures=50 limits what you get from the server; that parameter has to be removed in order to get all the data for a specific layer in one request...
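For illustration, the only difference between the limited and the full download is that one parameter; a minimal sketch of the two requests (fetching with curl and the output file names are just examples):

    # Limited request: returns only the first 50 features of the layer
    curl "http://services.clarity-h2020.eu:8080/geoserver/clarity/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=clarity:agricultural_areas&maxFeatures=50&outputFormat=SHAPE-ZIP" -o agricultural_areas_sample.zip

    # Full download: omit maxFeatures to get the whole layer (the 235 MB file above)
    curl "http://services.clarity-h2020.eu:8080/geoserver/clarity/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=clarity:agricultural_areas&outputFormat=SHAPE-ZIP" -o agricultural_areas.zip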

I am trying to find out how to download a compressed GeoJSON file, which can be lighter than a Shapefile, in order to upload the data to Zenodo more easily...

Regarding CKAN: the data is still complete. I updated it a couple of months ago to add the most recently added layer, which is sports.

@negroscuro commented May 6, 2020

I managed to download JSON by using curl from an Ubuntu console:

curl -u admin:XXXXXXXX -XGET "http://services.clarity-h2020.eu:8080/geoserver/clarity/ows?service=WFS&version=1.1.0&request=GetFeature&typeName=clarity:agricultural_areas&maxFeatures=50&outputFormat=application/json" > agricultural_areas.json

It is only 176 KB (note that maxFeatures=50 is still set in that request).
Do you want me to provide the current layer contents by exporting every local effects layer in the GeoServer to that format, so we have something to upload to Zenodo?
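A sketch of what that bulk export could look like, assuming bash and the layer names mentioned in this thread (the layer list and the credentials are placeholders to adjust):

    #!/bin/bash
    # Hypothetical bulk export of local effects layers to gzipped GeoJSON.
    # The layer list is an assumption based on names mentioned in this issue;
    # replace it with the actual local effects layers in the GeoServer.
    BASE="http://services.clarity-h2020.eu:8080/geoserver/clarity/ows"
    for layer in agricultural_areas built_open_spaces medium_urban_fabric roads sports; do
      curl -u admin:XXXXXXXX \
        "${BASE}?service=WFS&version=1.1.0&request=GetFeature&typeName=clarity:${layer}&outputFormat=application/json" \
        | gzip > "${layer}.json.gz"
    done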

@negroscuro commented May 6, 2020

I checked a couple of the local effects datasets to see whether their WFS links are set to download a Shapefile ZIP, and they are, even without the maxFeatures=50 limitation, so I would say the data is correctly referenced from CKAN.
Of course, the URLs will have to be updated once the GeoServer migration is done.

The downside is turning such a link into a JSON download: I do not know whether that would work, since in my case the web browser tried to open the response directly and it produced a memory leak in the browser... that is why I tested with the curl command instead.

@negroscuro

In the last test, Reggio_di_Calabria, a city with just 2061 cells, used to take around 25 minutes; with ESM20, even after the parallel CPU query enhancement, it was taking around 32 hours, so I stopped it...

@negroscuro negroscuro removed their assignment May 8, 2020
@p-a-s-c-a-l p-a-s-c-a-l moved this from To do to In progress in T7.3 Data Management May 12, 2020
@p-a-s-c-a-l

TODO @p-a-s-c-a-l : Deposit data in Zenodo

I noticed there is a dataset missing in CKAN (I just created it), but its data reference still needs to be set in Zenodo; could you please handle that?
https://ckan.myclimateservice.eu/dataset/land-use-grid

Mortality is also missing from Zenodo:
https://ckan.myclimateservice.eu/dataset/mortality

Cities is missing as well:
https://ckan.myclimateservice.eu/dataset/cities
Basins:
https://ckan.myclimateservice.eu/dataset/basins
And Streams:
https://ckan.myclimateservice.eu/dataset/streams

@p-a-s-c-a-l

@DanielRodera

When trying to download e.g. roads.shp.zip, GeoServer responds with

502 Bad Gateway
nginx/1.17.6

@DanielRodera

Hi @p-a-s-c-a-l, as far as I can see you are trying to download the entire layer, which contains all the roads in Europe. GeoServer is not able to compress all the data for this layer in a reasonable time, which is why it is throwing the error. If you add the "&maxFeatures=" parameter to the request, it will work.
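For example (a sketch; the limit of 10000 is a hypothetical value, and the URL mirrors the agricultural_areas request earlier in this thread):

    curl "http://services.clarity-h2020.eu:8080/geoserver/clarity/ows?service=WFS&version=1.0.0&request=GetFeature&typeName=clarity:roads&maxFeatures=10000&outputFormat=SHAPE-ZIP" -o roads_sample.shp.zip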

@p-a-s-c-a-l

Hi @p-a-s-c-a-l, as far as I can see you are trying to download the entire layer, which contains all the roads in Europe. GeoServer is not able to compress all the data for this layer in a reasonable time, which is why it is throwing the error. If you add the "&maxFeatures=" parameter to the request, it will work.

Yes, but that's exactly what I need: the complete data.

@maesbri commented Aug 12, 2020

Hi @p-a-s-c-a-l, as far as I can see you are trying to download the entire layer, which contains all the roads in Europe. GeoServer is not able to compress all the data for this layer in a reasonable time, which is why it is throwing the error. If you add the "&maxFeatures=" parameter to the request, it will work.

Yes, but that's exactly what I need: the complete data.

I would then propose exporting that layer (and probably others) directly from the database into a compressed ZIP, publishing it on a web server (or maybe Zenodo), and using that as the link in the catalogue. GeoServer (and more specifically the WFS and WCS services) was never meant for downloading large amounts of data, as it is not an efficient means for that; for this purpose it is better to serve the data via FTP or HTTP as a ZIP.
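A minimal sketch of such a database export, assuming the layers live in PostGIS and using GDAL's ogr2ogr (host, database name, credentials, and table name are placeholders):

    # Hypothetical export of the roads layer from PostGIS to a zipped Shapefile.
    ogr2ogr -f "ESRI Shapefile" roads.shp \
      PG:"host=localhost dbname=clarity user=admin password=XXXXXXXX" roads
    zip roads.shp.zip roads.shp roads.shx roads.dbf roads.prj

The resulting ZIP could then be published on a plain HTTP server or Zenodo and linked from the CKAN catalogue.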

@p-a-s-c-a-l

Thanks @maesbri. We should do that as soon as all cities have been calculated.

According to the H2020 open data obligations we have to assure long-term preservation of the data, so the best option is to update the existing local effects datasets available on Zenodo.

@p-a-s-c-a-l

@DanielRodera Is the HC-LE calculation complete? Is the data available somewhere for download?
