Publish STAC meta data to production and fix data services networking (create NAT gateway) #101
Comments
Are we using the correct extension versions in build stac? Proj extension and raster ext versions: https://github.com/NASA-IMPACT/veda-data-airflow/blob/a47015ba2b327eb5d1f54958246cb6fb5b79ccb1/docker_tasks/build_stac/utils/stac.py#L12 |
Check if rio stac version is correct: https://github.com/NASA-IMPACT/veda-data-airflow/blob/a47015ba2b327eb5d1f54958246cb6fb5b79ccb1/docker_tasks/build_stac/requirements.txt#L7 Currently using 0.7.0 in airflow build stac |
I confirmed that we want to use rio-stac>=0.8.0 to get the correct version of the proj extension. We will probably also need a minor refactor of Airflow's build_stac/utils/stac.py to import the actual extension versions used by rio-stac, as shown in the rio-stac documentation for building multi-asset items. Currently the utils method declares the projection version manually, so there may be other slight modifications to how the STAC item is created. |
The rio-stac version and the corresponding STAC extension versions are now updated in this PR: NASA-IMPACT/veda-data-airflow#125 |
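A minimal sketch of the refactor idea, assuming the `PROJECTION_EXT_VERSION` and `RASTER_EXT_VERSION` constants that the rio-stac multi-asset example imports from `rio_stac.stac`. The fallback values and the `extension_urls` helper are hypothetical and exist only so the snippet runs without rio-stac installed; this is not the actual code in build_stac/utils/stac.py.

```python
# Prefer the extension versions that rio-stac itself uses, so the
# schema URLs cannot drift from the library that builds the item.
try:
    from rio_stac.stac import PROJECTION_EXT_VERSION, RASTER_EXT_VERSION
except ImportError:
    # Placeholder values (assumption) so the sketch still runs
    # when rio-stac is not installed.
    PROJECTION_EXT_VERSION = "v1.1.0"
    RASTER_EXT_VERSION = "v1.1.0"

EXTENSION_ROOT = "https://stac-extensions.github.io"


def extension_urls() -> list[str]:
    """Build the stac_extensions list from the imported versions."""
    return [
        f"{EXTENSION_ROOT}/projection/{PROJECTION_EXT_VERSION}/schema.json",
        f"{EXTENSION_ROOT}/raster/{RASTER_EXT_VERSION}/schema.json",
    ]
```

With this approach a rio-stac upgrade changes the declared schema URLs automatically instead of requiring a manual edit.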
Summary of huddle on April 29, 2024
We're currently blocked on this being implemented: https://jaas.gsfc.nasa.gov/servicedesk/customer/portal/2/GSD-3143 (creation of a NAT gateway). Proof that there's a networking issue: We also tested the workflow setting TODO
|
An additional service desk ticket was created on May 3rd, 2024 to update the network ACL rules to allow traffic for ephemeral port range and the ticket is currently in |
Ephemeral port range testing -
|
Now that we are unblocked, here are the notes from a backfill planning session with @botanical @smohiudd @ividito
The big backfill plan
We plan to use https://staging-stac.delta-backend.com/collections as our source of truth for the collections to publish to the VEDA instances running in MCP (we’ll do some test runs in mcp-test before moving to production).
Promote to production working definition
Our target is to promote all the data that are currently staged in the UAH-hosted staging instance of VEDA to the MCP-hosted test and production stacks. At a high level:
Detailed plan
Git:veda-data necessary changes
We will need to start thinking about a new release for upcoming changes to the ingestion DAGs. We discussed whether we should manage this in a new branch and whether we should move the discovery items into the veda-data-airflow project. For now we have decided to proceed with a slight change to the git:veda-data folder structure to accommodate a separate folder for each stage: the discovery-items configuration we currently have for staging data will move to /staging, and a new production/ folder will be created for inputs configured from production data.
Restructure folders
Update buckets and prefixes in discovery-items
Copy veda-data/staging/discovery-items to veda-data/production/discovery-items and
Actions:
Backfill track
observability & monitoring in MCP track |
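The "update buckets and prefixes" step could be sketched roughly as follows. The `retarget_discovery_item` helper, the `bucket` and `prefix` field names, and the example staging bucket and prefixes are all assumptions for illustration; only the `veda-data-store-production` bucket name comes from this issue.

```python
import copy


def retarget_discovery_item(item: dict, bucket: str, prefix_map: dict) -> dict:
    """Return a copy of a discovery-items config that points at a new bucket.

    prefix_map maps an old prefix root to its replacement; the first
    matching root is rewritten and the rest of the prefix is kept.
    """
    updated = copy.deepcopy(item)  # never mutate the staging config
    updated["bucket"] = bucket
    prefix = updated.get("prefix", "")
    for old_root, new_root in prefix_map.items():
        if prefix.startswith(old_root):
            updated["prefix"] = new_root + prefix[len(old_root):]
            break
    return updated


# Hypothetical staging config; field names and values are illustrative.
staging_item = {
    "collection": "nceo_africa_2017",
    "bucket": "example-staging-bucket",
    "prefix": "file-staging/nceo-africa-2017/",
}

production_item = retarget_discovery_item(
    staging_item,
    bucket="veda-data-store-production",
    prefix_map={"file-staging/": ""},
)
```

Returning a copy keeps the staging configuration untouched, so the same file can be used to produce both the staging and production folders.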
I started a new sheet to this working backfill google spreadsheet and loaded an inventory staging-collections.csv from the staging stac catalog that I generated in a notebook
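For reference, an inventory like this can be produced along the following lines. This is a sketch, not the notebook that was actually used: the `collections_to_csv` helper and the chosen columns are assumptions, and the input is the parsed JSON body of a STAC API `/collections` response (which lists collections under a `collections` key).

```python
import csv
import io


def collections_to_csv(payload: dict) -> str:
    """Turn a parsed /collections response into CSV text, sorted by id."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["collection_id", "title"])
    collections = sorted(payload.get("collections", []), key=lambda c: c["id"])
    for collection in collections:
        writer.writerow([collection["id"], collection.get("title", "")])
    return buffer.getvalue()


# Small hand-written payload standing in for the real staging response.
sample_payload = {
    "collections": [
        {"id": "no2-monthly", "title": "Example title"},
        {"id": "geoglam"},
    ]
}
inventory_csv = collections_to_csv(sample_payload)
```

The returned text can be written to a file such as staging-collections.csv and loaded into the spreadsheet.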
|
#121 PR to add new directory structure and update prefixes for production |
Potential collections to exclude are:
|
The following discoveries failed in mcp-test:
|
|
For posterity, the
{
"id": "AGB_map_2017v0m_COG",
"bbox": [
-18.273529509559307,
-35.054059016911935,
51.86423292864056,
37.73103856358817
],
"type": "Feature",
"links": [],
"assets": {
"cog_default": {
"href": "s3://nasa-maap-data-store/file-staging/nasa-map/nceo-africa-2017/AGB_map_2017v0m_COG.tif",
"type": "image/tiff; application=geotiff; profile=cloud-optimized",
"roles": [
"data",
"layer"
],
"title": "Default COG Layer",
"description": "Cloud optimized default layer to display on map",
"raster:bands": [
{
"scale": 1,
"nodata": "inf",
"offset": 0,
"sampling": "area",
"data_type": "uint16",
"histogram": {
"max": 429,
"min": 0,
"count": 11,
"buckets": [
405348,
44948,
18365,
6377,
3675,
3388,
3785,
9453,
13108,
1186
]
},
"statistics": {
"mean": 37.58407913145342,
"stddev": 81.36678677343947,
"maximum": 429,
"minimum": 0,
"valid_percent": 50.42436439336373
}
}
]
}
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-18.273529509559307,
-35.054059016911935
],
[
51.86423292864056,
-35.054059016911935
],
[
51.86423292864056,
37.73103856358817
],
[
-18.273529509559307,
37.73103856358817
],
[
-18.273529509559307,
-35.054059016911935
]
]
]
},
"collection": "nceo_africa_2017",
"properties": {
"proj:bbox": [
-18.273529509559307,
-35.054059016911935,
51.86423292864056,
37.73103856358817
],
"proj:epsg": 4326,
"proj:shape": [
81024,
78077
],
"end_datetime": "2017-12-31T23:59:59+00:00",
"proj:geometry": {
"type": "Polygon",
"coordinates": [
[
[
-18.273529509559307,
-35.054059016911935
],
[
51.86423292864056,
-35.054059016911935
],
[
51.86423292864056,
37.73103856358817
],
[
-18.273529509559307,
37.73103856358817
],
[
-18.273529509559307,
-35.054059016911935
]
]
]
},
"proj:transform": [
0.0008983152841195214,
0,
-18.273529509559307,
0,
-0.0008983152841195214,
37.73103856358817,
0,
0,
1
],
"start_datetime": "2017-01-01T00:00:00+00:00",
"datetime": null
},
"stac_version": "1.0.0",
"stac_extensions": [
"https://stac-extensions.github.io/projection/v1.0.0/schema.json",
"https://stac-extensions.github.io/raster/v1.1.0/schema.json"
]
} |
Here's a small first draft of an audit of the collections in veda-config datasets and the
collection_id='CMIP585-winter-median-pr' items_matched=0 src_items_matched=4 src_match=False!
collection_id='MO_NPP_npp_vgpm' items_matched=0 src_items_matched=12 src_match=False!
collection_id='bangladesh-landcover-2001-2020' items_matched=0 src_items_matched=2 src_match=False!
collection_id='campfire-lst-day-diff' items_matched=0 src_items_matched=1 src_match=False!
collection_id='campfire-nlcd' items_matched=1 src_items_matched=2 src_match=False!
collection_id='fldas-soil-moisture-anomalies' items_matched=0 src_items_matched=499 src_match=False!
collection_id='geoglam' items_matched=46 src_items_matched=47 src_match=False!
collection_id='hls-swir-falsecolor-composite' items_matched=0 src_items_matched=2 src_match=False!
collection_id='houston-lst-diff' items_matched=0 src_items_matched=1 src_match=False!
collection_id='houston-urbanization' items_matched=0 src_items_matched=1 src_match=False!
collection_id='lis-global-da-evap' items_matched=7062 src_items_matched=6849 src_match=False!
collection_id='lis-global-da-gpp' items_matched=7062 src_items_matched=6841 src_match=False!
collection_id='lis-global-da-gpp-trend' items_matched=0 src_items_matched=3 src_match=False!
collection_id='lis-global-da-gws' items_matched=2779 src_items_matched=6844 src_match=False!
collection_id='lis-global-da-streamflow' items_matched=0 src_items_matched=5998 src_match=False!
collection_id='lis-global-da-totalprecip' items_matched=6605 src_items_matched=7364 src_match=False!
collection_id='lis-global-da-tws' items_matched=7062 src_items_matched=6768 src_match=False!
collection_id='lis-global-da-tws-trend' items_matched=2 src_items_matched=3 src_match=False!
collection_id='lis-tws-anomaly' items_matched=6698 src_items_matched=7031 src_match=False!
collection_id='lis-tws-trend' items_matched=0 src_items_matched=1 src_match=False!
collection_id='mtbs-burn-severity' items_matched=1 src_items_matched=5 src_match=False!
collection_id='nceo_africa_2017' items_matched=0 src_items_matched=1 src_match=False!
collection_id='nightlights-hd-1band' items_matched=7 src_items_matched=6 src_match=False!
collection_id='nightlights-hd-monthly' items_matched=0 src_items_matched=1134 src_match=False!
collection_id='no2-monthly' items_matched=0 src_items_matched=93 src_match=False!
collection_id='no2-monthly-diff' items_matched=1 src_items_matched=105 src_match=False!
collection_id='snow-projections-diff-585' items_matched=0 src_items_matched=40 src_match=False!
collection_id='snow-projections-median-245' items_matched=0 src_items_matched=40 src_match=False!
collection_id='snow-projections-median-585' items_matched=0 src_items_matched=40 src_match=False!
collection_id='social-vulnerability-index-household' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-household-nopop' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-housing' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-housing-nopop' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-minority' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-minority-nopop' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-overall' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-overall-nopop' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-socioeconomic' items_matched=0 src_items_matched=5 src_match=False!
collection_id='social-vulnerability-index-socioeconomic-nopop' items_matched=0 src_items_matched=5 src_match=False!
collection_id='sport-lis-vsm0_100cm-percentile' items_matched=0 src_items_matched=2 src_match=False!
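A rough sketch of how a count comparison like the one above can be expressed. The `CollectionAudit` dataclass and `audit_counts` helper are hypothetical (the field names simply mirror the printed output), and the sample counts are two rows copied from the audit plus one made-up matching collection; how the counts are fetched from each catalog is left out.

```python
from dataclasses import dataclass


@dataclass
class CollectionAudit:
    collection_id: str
    items_matched: int       # item count in the target catalog
    src_items_matched: int   # item count in the source catalog
    src_match: bool          # True when both counts agree


def audit_counts(target_counts: dict, source_counts: dict) -> list:
    """Compare per-collection item counts, treating a missing target as 0."""
    results = []
    for collection_id in sorted(source_counts):
        src = source_counts[collection_id]
        tgt = target_counts.get(collection_id, 0)
        results.append(CollectionAudit(collection_id, tgt, src, tgt == src))
    return results


source_counts = {"geoglam": 47, "nceo_africa_2017": 1, "example-matching": 3}
target_counts = {"geoglam": 46, "example-matching": 3}

audit = audit_counts(target_counts, source_counts)
mismatched = [row for row in audit if not row.src_match]
for row in mismatched:
    print(row)
```

Filtering on `src_match` keeps the report focused on collections that still need a backfill or a re-run.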
|
#132 PR to restructure |
Would it be easy enough to rename this collection here before / as we publish the production catalog? |
What
All collections currently in staging should be published to the production instance.
provider and render metadata that is included in the veda-data repo.
veda-data-store-production bucket
summaries should be included in all collections
PI Objective
Objective 4: Publish production data
24.3 Objective 2: Publish STAC metadata into Production VEDA
Acceptance Criteria
production including provider and renders metadata and referencing the veda-data-store-production bucket
Tasks