Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Submit command #6

Merged
merged 41 commits into from
Apr 30, 2024
Merged

Submit command #6

merged 41 commits into from
Apr 30, 2024

Conversation

trey-stafford
Copy link
Member

@trey-stafford trey-stafford commented Apr 30, 2024

Submitting the seal tags recipe now works.

There are still some issues:

  • Job has to be manually cleaned up even after a successful completion.
  • Logging around bash commands is poor. Currently code writes out log files of stdout and stderr to /data/ and prints the contents of those files. This works OK but there is no real logging around the outputs captured in these logs so it is unclear when the logged messages were generated. The persistent files written to /data/ currently just get appended to when the job is run more than once. Manual cleanup should be replaced by something automated.
  • wget fetch step will happily re-download the source data file every time the job is submitted (appending a number to the end of the filename). For that matter, none of the output dirs are unique to a given run, so commands that fail without e.g., --overwrite will fail on re-runs unless we do manual cleanup.

trey-stafford and others added 30 commits April 25, 2024 13:55
Incompatible version. And `click` v8 lib already has types.
* Explicitly creating subdirs for each command step seems wrong. Is there a
  better way?
* We think data
  futures (https://parsl.readthedocs.io/en/stable/userguide/futures.html#datafutures)
  might be the "right" thing to use for chaining commands instead of calling
  `result()` for each, but that requires us to know in advance what the output
  filename of each command id. Are there cases where a command could output
  multiple files?
Unfortunately, this isn't quite what we were looking for. We actually
want to `apply`, not `create`, because with `create` (the imperative
method) we need to deal with cleanup of the previous object, or patch
it. With `apply` (the declarative method), we specify what we want and
kubernetes takes care of the rest.
NOTE: Some imports still need fixing
`curl` complains when the URL doesn't contain a filename that it can use when
given a directory with `-O`. `wget` can fetch the remote resource and figure out
the filename when given a dir to write to with `-P`
Raises an error when variables are unset.
This feels increasingly "wrong". By injecting values into templates representing
python code, we don't gain any of the advantages of our usual tooling around
types, code formatting, syntax checking, etc. We only find out about e.g.,
syntax errors and incorrectly formatted function calls (kwarg vs positional, as
in this case) until we try to execute the program.
@mfisher87
Copy link
Member

What do you think of merging everything to main at this stage?

@mfisher87 mfisher87 changed the title Submit command trey Submit command Apr 30, 2024
@mfisher87 mfisher87 changed the base branch from submit-command to main April 30, 2024 23:10
@mfisher87 mfisher87 merged commit ab73386 into main Apr 30, 2024
10 checks passed
@mfisher87 mfisher87 deleted the submit-command-trey branch April 30, 2024 23:11
@mfisher87 mfisher87 linked an issue May 1, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ogdc-runner initial spike
2 participants