CRS handling for nwis module #170

Open
ehinman opened this issue Oct 25, 2024 · 1 comment

Comments

@ehinman
Collaborator

ehinman commented Oct 25, 2024

It was noted by @aaraney, @lstanish-usgs, Mike Mahoney, @thodson-usgs and others that hard-coding a CRS for NWIS sites is not ideal. While NAD83 is the most common datum, it is not the only one used to document gage locations; for example: https://waterservices.usgs.gov/nwis/site/?format=rdb&sites=483554104034801&siteOutput=expanded. At the very least, we should warn users of this inconsistency. More comprehensively, we could use the datum provided for each site in the downloaded dataset to set the CRS before converting to a unified CRS such as WGS84.
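
One way the "warn, then fall back" idea could look is sketched below. It is only an illustration: `DATUM_TO_EPSG` and `resolve_epsg` are hypothetical names, not part of dataretrieval, and the list of datum codes is an assumption rather than the full set NWIS may return. The EPSG identifiers themselves are the standard geographic CRS codes for these datums.

```python
import math
import warnings

# Hypothetical lookup from NWIS datum codes to EPSG identifiers.
# The set of datum codes listed here is an assumption, not exhaustive.
DATUM_TO_EPSG = {
    "NAD83": "EPSG:4269",
    "NAD27": "EPSG:4267",
    "WGS84": "EPSG:4326",
}


def resolve_epsg(datum_cd, default="EPSG:4269"):
    """Return an EPSG string for a datum code, warning when falling back."""
    # Treat None, NaN, and blank strings as "datum not reported".
    missing = (
        datum_cd is None
        or (isinstance(datum_cd, float) and math.isnan(datum_cd))
        or str(datum_cd).strip() == ""
    )
    if missing:
        warnings.warn(f"No datum reported; assuming {default}", stacklevel=2)
        return default

    code = str(datum_cd).strip().upper()
    if code not in DATUM_TO_EPSG:
        warnings.warn(f"Unrecognized datum {code!r}; assuming {default}", stacklevel=2)
        return default
    return DATUM_TO_EPSG[code]
```

A caller could group sites by the resolved CRS, set that CRS on each group, and only then reproject everything to a common one. Whether NAD83 is the right default for sites with no reported datum is a design decision this sketch does not settle.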

@aaraney
Contributor

aaraney commented Oct 25, 2024

While NAD83 is the most common CRS projection, it is not the exclusive one used to document the location of all gages, example: https://waterservices.usgs.gov/nwis/site/?format=rdb&sites=483554104034801&siteOutput=expanded.

@ehinman, I think we all might have been a little overhasty.

TL;DR - dataretrieval.nwis uses the dec_lat_va and dec_long_va fields to construct a GeoDataFrame's geometry column. dec_coord_datum_cd is the CRS of dec_lat_va and dec_long_va, and dec_coord_datum_cd is only ever empty or NAD83.

Full story:

The waterservices API reports up to two latitude/longitude pairs: lat_va and long_va in degrees, minutes, seconds (DMS), and dec_lat_va and dec_long_va in decimal degrees. The CRS associated with each pair, if known, is given in coord_datum_cd for the DMS coordinates and in dec_coord_datum_cd for the decimal-degree coordinates.

#  lat_va          -- DMS latitude
#  long_va         -- DMS longitude
#  dec_lat_va      -- Decimal latitude
#  dec_long_va     -- Decimal longitude
#  ...
#  coord_datum_cd  -- Latitude-longitude datum            <---
#  dec_coord_datum_cd -- Decimal Latitude-longitude datum <---
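
To make the relationship between the two coordinate pairs concrete, here is a minimal sketch of converting a DMS value to decimal degrees. It assumes the DMS fields arrive as packed digit strings (DDMMSS for latitude, DDDMMSS for longitude), which should be checked against the service documentation. `dms_to_decimal` is a hypothetical helper, and the `negative` flag for western longitudes is likewise an assumption about how the sign is conveyed.

```python
def dms_to_decimal(packed, negative=False):
    """Convert a packed DMS string (DDMMSS or DDDMMSS) to decimal degrees.

    Fractional seconds after a decimal point are accepted.
    """
    text = str(packed).strip()
    whole, _, frac = text.partition(".")
    if len(whole) < 5 or not whole.isdigit():
        raise ValueError(f"Not a packed DMS value: {packed!r}")

    # The last two digits are seconds, the two before them minutes,
    # and whatever remains on the left is degrees.
    seconds = float(whole[-2:] + ("." + frac if frac else ""))
    minutes = int(whole[-4:-2])
    degrees = int(whole[:-4])

    value = degrees + minutes / 60 + seconds / 3600
    return -value if negative else value


# Inputs read off the digits of the example site number 483554104034801,
# on the assumption that it encodes DDMMSS followed by DDDMMSS.
lat = dms_to_decimal("483554")
lon = dms_to_decimal("1040348", negative=True)
print(lat, lon)
```

Note that this arithmetic changes only the units, not the datum: a DMS pair recorded in one datum still needs a proper transformation before it can be compared with a decimal pair in another.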

dataretrieval-python's nwis module uses the decimal degree pair to construct the geometry column. See here:

geoms = gpd.points_from_xy(df.dec_long_va.values, df.dec_lat_va.values)

After a little digging, it seems that only coord_datum_cd varies (the field dataretrieval.nwis does not use); dec_coord_datum_cd is only ever empty or NAD83. For example, https://waterservices.usgs.gov/nwis/site/?format=rdb&sites=483554104034801&siteOutput=expanded has a coord_datum_cd of WGS84, but its dec_coord_datum_cd is empty.
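
For anyone who wants to inspect the two datum fields without any third-party packages, a rough RDB reader is sketched below. The sample text is illustrative and heavily trimmed, not a real service response; only the two datum values for this site come from the observation above. `read_rdb` is a hypothetical helper.

```python
import csv

# Illustrative, heavily trimmed RDB text; NOT a real service response.
# Only the datum values for this site are taken from the discussion above.
SAMPLE_RDB = (
    "# trimmed example\n"
    "site_no\tcoord_datum_cd\tdec_coord_datum_cd\n"
    "15s\t10s\t10s\n"
    "483554104034801\tWGS84\t\n"
)


def read_rdb(text):
    """Parse RDB text into a list of dicts, skipping comments and the format row."""
    # Comment lines start with "#"; the rest is tab-separated.
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    next(reader)  # skip the column-format row that follows the header
    return [dict(zip(header, row)) for row in reader]


rows = read_rdb(SAMPLE_RDB)
for row in rows:
    print(row["site_no"], repr(row["coord_datum_cd"]), repr(row["dec_coord_datum_cd"]))
```

The same function could be pointed at the text of a real response fetched from the URL above to compare coord_datum_cd against dec_coord_datum_cd site by site.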

To check this I wrote up the following small script:

from pprint import pprint

import pandas as pd
from dataretrieval import nwis

# State or Territory list from:
# https://waterservices.usgs.gov/test-tools/?service=site&siteType=&statTypeCd=all&major-filters=sites&format=rdb&date-type=type-none&statReportType=daily&statYearType=calendar&missingData=off&siteStatus=all&siteNameMatchOperator=start
cds = ['al', 'ak', 'aq', 'az', 'ar', '96', 'ca', 'co', 'ct', 'de', 'dc', '62', 'fl', 'ga', 'gu', 'hi', 'id', 'il', 'in', 'ia', '67', 'ks', 'ky', 'la', 'me', 'md', 'ma', 'mi', '71', 'mn', 'ms', 'mo', 'mt', 'ne', 'nv', 'nh', 'nj', 'nm', 'ny', 'nc', 'nd', 'mp', 'oh', 'ok', 'or', 'pa', 'pr', 'ri', '73', 'sc', 'sd', '74', 'tn', 'tx', '75', '76', '77', 'ut', 'vt', 'vi', 'va', '79', 'wa', 'wv', 'wi', 'wy']

# Map each state/territory code to the unique decimal-coordinate datums it reports.
dec_datums = {}
for cd in cds:
    try:
        df, _ = nwis.get_info(stateCd=cd)
    except Exception as e:  # some codes return no sites; Ctrl-C still interrupts
        print(f"{cd} failed with {e}")
        continue
    dec_datums[cd] = df["dec_coord_datum_cd"].unique().tolist()

# Every reported datum must be either NAD83 or missing (NaN).
for datums in dec_datums.values():
    for datum in datums:
        assert pd.isna(datum) or datum == "NAD83"

pprint(dec_datums)

Output:

62 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=62&siteOutput=Expanded&format=rdb
67 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=67&siteOutput=Expanded&format=rdb
71 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=71&siteOutput=Expanded&format=rdb
73 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=73&siteOutput=Expanded&format=rdb
74 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=74&siteOutput=Expanded&format=rdb
75 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=75&siteOutput=Expanded&format=rdb
76 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=76&siteOutput=Expanded&format=rdb
77 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=77&siteOutput=Expanded&format=rdb
79 failed with Page Not Found Error. May be the result of an empty query. URL: https://waterservices.usgs.gov/nwis/site?stateCd=79&siteOutput=Expanded&format=rdb

{'96': ['NAD83', nan],
 'ak': ['NAD83', nan],
 'al': ['NAD83', nan],
 'aq': ['NAD83'],
 'ar': ['NAD83', nan],
 'az': ['NAD83', nan],
 'ca': ['NAD83', nan],
 'co': ['NAD83', nan],
 'ct': ['NAD83', nan],
 'dc': ['NAD83', nan],
 'de': ['NAD83', nan],
 'fl': ['NAD83', nan],
 'ga': ['NAD83', nan],
 'gu': ['NAD83'],
 'hi': ['NAD83', nan],
 'ia': ['NAD83', nan],
 'id': ['NAD83', nan],
 'il': ['NAD83', nan],
 'in': ['NAD83', nan],
 'ks': ['NAD83', nan],
 'ky': ['NAD83', nan],
 'la': ['NAD83', nan],
 'ma': ['NAD83', nan],
 'md': ['NAD83', nan],
 'me': ['NAD83', nan],
 'mi': ['NAD83', nan],
 'mn': ['NAD83', nan],
 'mo': ['NAD83', nan],
 'mp': ['NAD83'],
 'ms': ['NAD83', nan],
 'mt': ['NAD83', nan],
 'nc': ['NAD83', nan],
 'nd': ['NAD83', nan],
 'ne': ['NAD83', nan],
 'nh': ['NAD83', nan],
 'nj': ['NAD83', nan],
 'nm': ['NAD83', nan],
 'nv': ['NAD83', nan],
 'ny': ['NAD83', nan],
 'oh': ['NAD83', nan],
 'ok': ['NAD83', nan],
 'or': ['NAD83', nan],
 'pa': ['NAD83', nan],
 'pr': ['NAD83', nan],
 'ri': ['NAD83', nan],
 'sc': ['NAD83', nan],
 'sd': ['NAD83', nan],
 'tn': ['NAD83', nan],
 'tx': ['NAD83', nan],
 'ut': ['NAD83', nan],
 'va': ['NAD83', nan],
 'vi': ['NAD83'],
 'vt': ['NAD83', nan],
 'wa': ['NAD83', nan],
 'wi': ['NAD83', nan],
 'wv': ['NAD83', nan],
 'wy': ['NAD83', nan]}
