Source: background file downloads for FileSource #670

Open: wants to merge 6 commits into base: main

Conversation

tim-quix
Contributor

@tim-quix tim-quix commented Dec 3, 2024

This adds a simple queue to the FileSource connector to download the next file while the current one is being processed.
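As a rough illustration of the idea (not the code in this PR), the sketch below fetches one file ahead on a single worker thread while the caller processes the current one. The `prefetch` helper and its `download` callable are hypothetical names used only for this example.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def prefetch(paths: Iterable[str], download: Callable[[str], T]) -> Iterator[T]:
    """Yield download(path) for each path, downloading one file ahead.

    Hypothetical helper: `download` stands in for whatever fetches a file.
    """
    with ThreadPoolExecutor(max_workers=1) as executor:
        pending: Optional[Future] = None
        for path in paths:
            # Queue the next download before handing the previous file
            # to the caller, so it runs while that file is processed.
            upcoming = executor.submit(download, path)
            if pending is not None:
                yield pending.result()
            pending = upcoming
        if pending is not None:
            yield pending.result()
```

With this shape, `for data in prefetch(paths, download): process(data)` overlaps the download of file N+1 with the processing of file N, and files are still yielded in their original order.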

@tim-quix tim-quix added the connector Issues updating Sinks or Sources label Dec 3, 2024
@tim-quix tim-quix changed the title Source: download queue for FileSource Source: background file downloads for FileSource Dec 3, 2024
def stop(self):
    logger.info("Stopping file download thread...")
    self._stopped = True
    self._executor.shutdown(wait=False)
Contributor

We should cancel the in-progress future before shutting down.

We should also pass wait=True to shutdown so that it exits cleanly.
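One possible shape for that suggestion is sketched below. `BackgroundDownloader`, `submit`, and the `_future` attribute are assumptions made for illustration; only `_stopped`, `_executor`, and the log message come from the diff above.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


class BackgroundDownloader:
    """Hypothetical stand-in for the class that owns `stop` in the diff."""

    def __init__(self) -> None:
        self._stopped = False
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._future: Optional[Future] = None  # assumed attribute name

    def submit(self, fn: Callable[..., Any], *args: Any) -> Future:
        # Remember the most recent download so stop() can cancel it.
        self._future = self._executor.submit(fn, *args)
        return self._future

    def stop(self) -> None:
        logger.info("Stopping file download thread...")
        self._stopped = True
        if self._future is not None:
            # cancel() only prevents a future that has not started yet;
            # it returns False and does nothing if the call is running.
            self._future.cancel()
        # Block until any download that is already running has finished.
        self._executor.shutdown(wait=True)
```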

Contributor Author

The original reason I used wait=False was to hopefully kill the file download immediately rather than wait for it to complete, since it seems you can't actually cancel a future that is already running (you can still call cancel on it, but from what I can tell, it won't do anything).
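That matches the documented behavior of `concurrent.futures`: `Future.cancel()` returns False for a call that is already executing and leaves it running. A small self-contained check, with `slow_download` as a made-up stand-in for a file download:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

started = threading.Event()
release = threading.Event()


def slow_download() -> str:
    # Made-up stand-in for a download that is already in progress.
    started.set()
    release.wait(timeout=5)
    return "done"


with ThreadPoolExecutor(max_workers=1) as executor:
    running = executor.submit(slow_download)
    queued = executor.submit(slow_download)  # waits behind `running`
    started.wait(timeout=5)

    cancelled_running = running.cancel()  # False: already executing
    cancelled_queued = queued.cancel()    # True: had not started yet

    release.set()
    result = running.result(timeout=5)    # "done": the call still completed
```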

However, for now I'll wait for the download to finish. Later, if we really want to, we can add a mid-download stop, where the file is read one chunk at a time and the stop flag is checked between chunks. I can't imagine people will be stopping in the middle of a download very often anyway.
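A minimal sketch of what such a mid-download stop could look like, assuming the file is exposed as a readable binary stream and the stop signal is a `threading.Event`. Both are assumptions, and `read_until_stopped` is a hypothetical name, not something in this PR.

```python
import io
import threading
from typing import BinaryIO, Optional


def read_until_stopped(
    stream: BinaryIO,
    stop_event: threading.Event,
    chunk_size: int = 64 * 1024,
) -> Optional[bytes]:
    """Read `stream` chunk by chunk; return None if stopped before the end."""
    buffer = io.BytesIO()
    while not stop_event.is_set():
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of stream: the whole file was read.
            return buffer.getvalue()
        buffer.write(chunk)
    # Stop was requested between chunks; discard the partial data.
    return None
```

The stop check only runs between reads, so a stop request takes effect after at most one chunk rather than after the whole file.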

Contributor

In any case, we have the source shutdown timeout.

@tim-quix tim-quix force-pushed the source/file-download-queue branch from 09eb826 to 8d7cbcf Compare December 16, 2024 17:00
2 participants