[12/n][dagster-airbyte] Implement sync_and_poll method in AirbyteCloudClient #26431

Merged
maximearmstrong merged 6 commits into master from maxime/rework-airbyte-cloud-12
Dec 26, 2024

maximearmstrong
Contributor

@maximearmstrong maximearmstrong commented Dec 12, 2024

Summary & Motivation

Implement the full sync-and-poll process in AirbyteCloudClient. A subsequent PR will use it in AirbyteCloudWorkspace.sync_and_poll to materialize Airbyte assets.

How I Tested These Changes

Additional tests with BK

@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 01c67e0 to 6bcbaf3 Compare December 13, 2024 00:03
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from 4b9b2c1 to 6585306 Compare December 13, 2024 00:04
@maximearmstrong maximearmstrong changed the title [dagster-airbyte] Implement sync_and_poll method in AirbyteCloudClient [12/n][dagster-airbyte] Implement sync_and_poll method in AirbyteCloudClient Dec 16, 2024
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 6bcbaf3 to 5b2f698 Compare December 17, 2024 23:13
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from 6585306 to e8d86fc Compare December 17, 2024 23:13
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 5b2f698 to 48f6c5e Compare December 18, 2024 00:48
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from e8d86fc to 15f5b2b Compare December 18, 2024 00:48


@deprecated(breaking_version="1.10", additional_warn_text="Use `AirbyteJobStatusType` instead.")
class AirbyteState:
Contributor Author

@maximearmstrong maximearmstrong Dec 18, 2024

This enum is renamed to AirbyteJobStatusType, which reflects Airbyte's ontology and our naming convention in other integrations.

AirbyteState is kept for backcompat because it was exposed in __init__.py. It will be removed in 1.10.

Contributor

@dpeng817 dpeng817 Dec 18, 2024

Is 1.10 the right thing to say here, since dagster-airbyte is on a pre-1.0 release?
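
For readers unfamiliar with the pattern, here is a minimal standard-library sketch of keeping an old name importable while steering callers to the new one. It is not how the diff does it: the diff uses dagster's `@deprecated` decorator. The `_DeprecatedAlias` metaclass and the enum members and values below are assumptions made for illustration.

```python
import enum
import warnings


class AirbyteJobStatusType(str, enum.Enum):
    # Illustrative subset of statuses; the member values are assumptions.
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    CANCELLED = "cancelled"
    ERROR = "error"


class _DeprecatedAlias(type):
    """Hypothetical metaclass: forwards missing attributes to the new enum and warns."""

    def __getattr__(cls, name: str):
        # Raises AttributeError for names the new enum does not define.
        member = getattr(AirbyteJobStatusType, name)
        warnings.warn(
            "AirbyteState is deprecated. Use `AirbyteJobStatusType` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return member


class AirbyteState(metaclass=_DeprecatedAlias):
    """Old name, kept importable until the breaking release."""
```

Existing code such as `AirbyteState.SUCCEEDED` keeps working and emits a `DeprecationWarning` that names the replacement.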

@@ -27,7 +27,9 @@
 TEST_STREAM_NAME = "test_stream"
 TEST_SELECTED = True
 TEST_JSON_SCHEMA = {}
-TEST_JOB_ID = "3fa85f64-5717-4562-b3fc-2c963f66afa6"
+TEST_JOB_ID = 12345
Contributor Author

Fixes an error that comes from the example in Airbyte's API documentation. The JSON example uses a UUID as the job ID, but the documentation states that the ID is an int.

@@ -50,7 +50,12 @@ def test_trigger_connection_fail() -> None:
 @responses.activate
 @pytest.mark.parametrize(
     "state",
-    [AirbyteState.SUCCEEDED, AirbyteState.CANCELLED, AirbyteState.ERROR, "unrecognized"],
+    [
Contributor Author

Nothing changes in the legacy tests except the name of the enum.

@maximearmstrong maximearmstrong self-assigned this Dec 18, 2024
@maximearmstrong maximearmstrong marked this pull request as ready for review December 18, 2024 01:21
)

time.sleep(poll_interval)
poll_job_details = self.get_job_details(job.id)
Contributor

nit: why make the caller initialize the job from the details? Feels like we should just return the job directly.
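
For context, a dependency-free sketch of a sync-and-poll loop in which the getter already returns a job object, as suggested above. This is not the merged implementation: `Job`, `SyncFailed` (a stand-in for dagster's `Failure`), the status strings, and the three injected callables are all hypothetical. The error messages mirror the ones matched in this PR's tests.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

IN_PROGRESS = {"pending", "running"}
TERMINAL = {"succeeded", "failed", "error", "cancelled"}


class SyncFailed(Exception):
    """Stand-in for dagster's Failure so the sketch has no dependencies."""


@dataclass
class Job:
    """Hypothetical job record with only the fields the loop needs."""

    id: int
    status: str


def sync_and_poll(
    start_sync: Callable[[], Job],
    get_job: Callable[[int], Job],
    cancel_job: Callable[[int], None],
    poll_interval: float = 10.0,
    poll_timeout: Optional[float] = None,
    cancel_on_termination: bool = True,
) -> Job:
    job = start_sync()
    started = time.monotonic()
    try:
        while job.status in IN_PROGRESS:
            if poll_timeout is not None and time.monotonic() - started > poll_timeout:
                raise SyncFailed(f"Timeout: job {job.id} not done after {poll_timeout}s")
            time.sleep(poll_interval)
            # The getter returns a Job directly; the caller builds nothing.
            job = get_job(job.id)

        if job.status == "succeeded":
            return job
        if job.status in ("failed", "error"):
            raise SyncFailed(f"Job failed: {job.id}")
        if job.status == "cancelled":
            raise SyncFailed(f"Job was cancelled: {job.id}")
        raise SyncFailed(f"Job {job.id} is in an unexpected state: {job.status}")
    finally:
        # Best-effort cleanup: cancel anything that did not reach a terminal state.
        if cancel_on_termination and job.status not in TERMINAL:
            cancel_job(job.id)
```

Because the cleanup lives in `finally`, an unrecognized status both raises and triggers a cancel call, which is the behavior the status test in this PR asserts with its DELETE check.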

Comment on lines 213 to 222
if status == TEST_UNRECOGNIZED_AIRBYTE_JOB_STATUS_TYPE:
    base_api_mocks.add(
        method=responses.DELETE,
        url=test_job_api_url,
        status=200,
        json=get_job_details_sample(status=AirbyteJobStatusType.CANCELLED),
    )
Contributor

@dpeng817 dpeng817 Dec 18, 2024

confused by this - why is this necessary?

Contributor Author

@maximearmstrong maximearmstrong Dec 18, 2024

We have another test dedicated to cancel_on_termination, so let's remove this part of the test when testing statuses. It was there because responses.RequestsMock fails the test if a mocked request is never called, so the DELETE mock must be added only when the job is cancelled.
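
The constraint described above can be illustrated with a toy registry. This is a standard-library stand-in, not the `responses` library: `StrictMockRegistry` and the URLs are hypothetical, and the class only mimics the check that `RequestsMock` performs when its `assert_all_requests_are_fired` option is enabled.

```python
from typing import Dict, Set, Tuple

Key = Tuple[str, str]  # (HTTP method, URL)


class StrictMockRegistry:
    """Toy registry that fails on exit if a registered mock was never requested."""

    def __init__(self) -> None:
        self._registered: Dict[Key, dict] = {}
        self._called: Set[Key] = set()

    def add(self, method: str, url: str, json: dict) -> None:
        self._registered[(method, url)] = json

    def request(self, method: str, url: str) -> dict:
        payload = self._registered[(method, url)]  # KeyError if not mocked
        self._called.add((method, url))
        return payload

    def __enter__(self) -> "StrictMockRegistry":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Only check when the body succeeded, so a real error is not masked.
        if exc_type is None:
            unused = set(self._registered) - self._called
            assert not unused, f"Not all requests have been executed: {sorted(unused)}"
        return False


def run_status_check(register_delete: bool) -> None:
    job_url = "https://example.invalid/jobs/12345"  # placeholder URL
    with StrictMockRegistry() as mocks:
        mocks.add("GET", job_url, {"status": "succeeded"})
        if register_delete:
            mocks.add("DELETE", job_url, {"status": "cancelled"})
        mocks.request("GET", job_url)  # the job finished, so no DELETE is sent
```

`run_status_check(register_delete=True)` raises `AssertionError` on exit because the DELETE mock is never used, which is why the mock has to be registered conditionally.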

Comment on lines 221 to 228
if status in [AirbyteJobStatusType.ERROR, AirbyteJobStatusType.FAILED]:
    with pytest.raises(Failure, match="Job failed"):
        client.sync_and_poll(connection_id=TEST_CONNECTION_ID, poll_interval=0)

elif status == AirbyteJobStatusType.CANCELLED:
    with pytest.raises(Failure, match="Job was cancelled"):
        client.sync_and_poll(connection_id=TEST_CONNECTION_ID, poll_interval=0)

elif status == TEST_UNRECOGNIZED_AIRBYTE_JOB_STATUS_TYPE:
    with pytest.raises(Failure, match="unexpected state"):
        client.sync_and_poll(connection_id=TEST_CONNECTION_ID, poll_interval=0)
    assert_rest_api_call(
        call=base_api_mocks.calls[-1], endpoint=test_job_endpoint, method=responses.DELETE
    )

else:
    result = client.sync_and_poll(connection_id=TEST_CONNECTION_ID, poll_interval=0)
    assert result == AirbyteOutput(
        job_details=get_job_details_sample(AirbyteJobStatusType.SUCCEEDED),
        connection_details=SAMPLE_CONNECTION_DETAILS,
    )
Contributor

nit: this could all be part of the parameterization. I had to do something similar recently and found an optional_pytest_raise helper really useful here: https://github.com/dagster-io/dagster/pull/26545/files#diff-bb4b77564796719ee9a5eaf83ecd67f54f4f16b8b7724f9348d1a38dac76800d

Contributor Author

Updated in 5df711b
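
The idea in this thread, moving the expected outcome into the parameters so the test body has no branches, can be sketched without pytest. The signature of the linked `optional_pytest_raise` helper is not shown here, so `optional_raises`, `fake_sync`, the local `Failure` class, and the case table below are hypothetical stand-ins rather than the code from commit 5df711b.

```python
import contextlib
import re
from typing import Iterator, Optional, Type


class Failure(Exception):
    """Stand-in for dagster.Failure."""


@contextlib.contextmanager
def optional_raises(
    exc_type: Optional[Type[BaseException]], match: str = ""
) -> Iterator[None]:
    """Expect exc_type with a message matching `match`, or no exception if None."""
    if exc_type is None:
        yield
        return
    try:
        yield
    except exc_type as exc:
        assert re.search(match, str(exc)), f"{exc!r} does not match {match!r}"
    else:
        raise AssertionError(f"expected {exc_type.__name__} was not raised")


def fake_sync(status: str) -> str:
    """Hypothetical function under test; raises like the messages in the test above."""
    messages = {
        "failed": "Job failed",
        "error": "Job failed",
        "cancelled": "Job was cancelled",
    }
    if status == "succeeded":
        return status
    raise Failure(messages.get(status, "Job is in an unexpected state"))


# Each case carries its own expectation, so the loop body never branches.
CASES = [
    ("succeeded", None, ""),
    ("failed", Failure, "Job failed"),
    ("error", Failure, "Job failed"),
    ("cancelled", Failure, "Job was cancelled"),
    ("unrecognized", Failure, "unexpected state"),
]


def run_cases() -> None:
    for status, expected_exc, match in CASES:
        with optional_raises(expected_exc, match):
            fake_sync(status)
```

With pytest, the same table would go into `@pytest.mark.parametrize`, and the context manager could be `pytest.raises(...)` for failing cases and `contextlib.nullcontext()` for the passing one.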

Comment on lines 375 to 383
if cancel_on_termination:
    assert_rest_api_call(
        call=base_api_mocks.calls[-1], endpoint=test_job_endpoint, method=responses.DELETE
    )
else:
    # If we don't cancel on termination, the last call will be a call to fetch the job details
    assert_rest_api_call(
        call=base_api_mocks.calls[-1], endpoint=test_job_endpoint, method=responses.GET
    )
Contributor

I really don't love these complex conditionals in the test body, IMO. They're hard to read, and I think this should just be part of the parameterization.

Contributor Author

That's a good call - updated in 5df711b

Contributor

@dpeng817 dpeng817 left a comment

Some feedback and comments, but nothing review-blocking.

@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 48f6c5e to 438b2bc Compare December 18, 2024 22:42
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from 1b64ddb to 860d858 Compare December 18, 2024 22:42
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 438b2bc to 2255aff Compare December 19, 2024 01:19
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from 860d858 to c7fbd62 Compare December 19, 2024 01:19
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 2255aff to e4f7ec5 Compare December 19, 2024 03:29
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from c7fbd62 to 5caf85f Compare December 19, 2024 03:29
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from e4f7ec5 to 0f39290 Compare December 19, 2024 04:01
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from 5caf85f to f888a06 Compare December 19, 2024 04:01
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 0f39290 to 159a035 Compare December 19, 2024 15:50
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from f888a06 to 90848ca Compare December 19, 2024 15:50
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 159a035 to 1ae39d0 Compare December 26, 2024 15:30
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from 90848ca to 94bb6cc Compare December 26, 2024 15:30
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-11 branch from 1ae39d0 to 797ff67 Compare December 26, 2024 15:44
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from 94bb6cc to 7009a22 Compare December 26, 2024 15:45
Base automatically changed from maxime/rework-airbyte-cloud-11 to master December 26, 2024 16:07
@maximearmstrong maximearmstrong force-pushed the maxime/rework-airbyte-cloud-12 branch from 7009a22 to fa56484 Compare December 26, 2024 16:08
@maximearmstrong maximearmstrong merged commit 2384e5b into master Dec 26, 2024
1 check passed
@maximearmstrong maximearmstrong deleted the maxime/rework-airbyte-cloud-12 branch December 26, 2024 16:31