Return S3 data links by default when in region #318

Merged
merged 4 commits into from
Oct 23, 2023
3 changes: 1 addition & 2 deletions earthaccess/results.py
@@ -301,7 +301,7 @@ def data_links(
         s3_links = self._filter_related_links("GET DATA VIA DIRECT ACCESS")
         if in_region:
             # we are in us-west-2
-            if self.cloud_hosted and access is None:
+            if self.cloud_hosted and access in (None, "direct"):
                 # this is a cloud collection and we didn't specify the access type
                 # default to S3 links
                 if len(s3_links) == 0 and len(https_links) > 0:
@@ -325,7 +325,6 @@ def data_links(
         else:
             # we are not in us-west-2, even cloud collections have HTTPS links
             return https_links
-        return https_links
Collaborator Author

This is just cosmetic: the line was unreachable, so I removed it.

Collaborator

👍 I tend to go the other way and drop the else statement, but six-of-one...
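
For reference, the two styles being compared can be sketched as follows. This is a simplified, hypothetical stand-in for the tail of `data_links`: the function names and the two-list signature are assumptions made for illustration, not code from this PR. Both variants return the same result, and neither needs an extra trailing `return`.

```python
from typing import List


def links_with_else(in_region: bool, s3: List[str], https: List[str]) -> List[str]:
    # Style kept in this PR: both branches return, so any statement placed
    # after the if/else could never run.
    if in_region:
        return s3
    else:
        return https


def links_early_return(in_region: bool, s3: List[str], https: List[str]) -> List[str]:
    # Alternative style: drop the else and let the final return be the fallback.
    if in_region:
        return s3
    return https
```

Either form is fine; the only thing worth avoiding is keeping both the `else` branch and a second fallback `return`, which is what the removed line amounted to.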


     def dataviz_links(self) -> List[str]:
         """
21 changes: 21 additions & 0 deletions tests/unit/test_results.py
@@ -0,0 +1,21 @@
+import earthaccess
+
+
+def test_data_links():
+    granules = earthaccess.search_data(
+        short_name="SEA_SURFACE_HEIGHT_ALT_GRIDS_L4_2SATS_5DAY_6THDEG_V_JPL2205",
+        temporal=("2020", "2022"),
+        count=1,
+    )
+    g = granules[0]
+    # `access` specified
+    assert g.data_links(access="direct")[0].startswith("s3://")
+    assert g.data_links(access="external")[0].startswith("https://")
+    # `in_region` specified
+    assert g.data_links(in_region=True)[0].startswith("s3://")
+    assert g.data_links(in_region=False)[0].startswith("https://")
+    # When `access` and `in_region` are both specified, `access` takes priority
+    assert g.data_links(access="direct", in_region=True)[0].startswith("s3://")
+    assert g.data_links(access="direct", in_region=False)[0].startswith("s3://")
+    assert g.data_links(access="external", in_region=True)[0].startswith("https://")
+    assert g.data_links(access="external", in_region=False)[0].startswith("https://")
Comment on lines +12 to +21
Collaborator Author

I think this is the intended behavior we want, but let me know if I'm missing something.

As a side note, I'm not sure why we have separate access and in_region kwargs for determining if we want to use s3 or https urls. Is one kwarg sufficient?
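
To make the intended behavior concrete, here is a minimal sketch of the priority rule those assertions encode, written as a standalone function so it runs without a network call. `pick_links` and its explicit link-list arguments are hypothetical names introduced for illustration; this is not the body of `data_links`, and it deliberately ignores `cloud_hosted` and the case where S3 links have to be derived from HTTPS links.

```python
from typing import List, Optional


def pick_links(
    s3_links: List[str],
    https_links: List[str],
    access: Optional[str] = None,
    in_region: bool = False,
) -> List[str]:
    """Hypothetical helper: choose links the way the test above expects."""
    # An explicit `access` value wins over `in_region`.
    if access == "direct":
        return s3_links
    if access == "external":
        return https_links
    # No explicit choice: fall back to the region hint.
    return s3_links if in_region else https_links
```

Written this way, `in_region` only matters when `access` is left as `None`, which is part of why a single kwarg looks sufficient.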

Collaborator

> As a side note, I'm not sure why we have separate access and in_region kwargs for determining if we want to use s3 or https urls. Is one kwarg sufficient?

I can't answer the question directly, but these keywords also feel unintuitive to me. What about access="s3"? To me, "direct" and "external" don't mean anything without more context, but "s3" and "https" do.

Collaborator

I agree that two kwargs is clunky and that direct/external aren't the most descriptive names; we could likely handle this with a single kwarg.

That said, since they are the current interface, we should probably open an issue about refactoring them rather than block this PR.
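
As a starting point for that follow-up issue, one possible shape for a single-kwarg interface is sketched below. Everything here is an assumption for discussion: `normalize_access`, the `"s3"`/`"https"` values, and the legacy mapping are hypothetical and do not exist in earthaccess today.

```python
from typing import Optional

# Hypothetical mapping from the current values to the proposed ones.
_LEGACY_ACCESS = {"direct": "s3", "external": "https"}


def normalize_access(access: Optional[str], in_region: bool = False) -> str:
    """Return "s3" or "https" for any current or proposed `access` value."""
    if access is None:
        # Keep today's default: S3 in region, HTTPS everywhere else.
        return "s3" if in_region else "https"
    # Accept the old names so existing callers keep working.
    protocol = _LEGACY_ACCESS.get(access, access)
    if protocol not in ("s3", "https"):
        raise ValueError(f"unknown access value: {access!r}")
    return protocol
```

Accepting the old names alongside the new ones would let the rename happen without breaking existing callers, with `in_region` reduced to a default-picking hint.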
