diff --git a/CHANGELOG.md b/CHANGELOG.md index 2f5147f..5bb330b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -37,6 +37,13 @@ - --seconds-needed option added to justin-get-file for multi-file jobscripts - Add none_processed job state and events, and pausing of workflows if too many jobscript_error, or notused, or none_processed job outcomes. +- Add banner message from configuration +- In --output-pattern, the destination dataset is now optional, and a name + like wXXXXsYpZ will be created if not given. +- Per-RSE datasets are created, each with a rule to keep the file on that + RSE, and they are always used for Rucio uploads by jobs. +- Created datasets for output patterns have metadata describing the pattern, + and stage parameters including the memory requested, Apptainer image etc. ## 01.00.00 - The "1.0" release of justIN after DC24 diff --git a/commands/justin b/commands/justin index 1848e94..091f822 100755 --- a/commands/justin +++ b/commands/justin @@ -32,7 +32,7 @@ import argparse import platform # The make-justin-tag script looks for this and updates it - so format matters -versionNumber = '01.01.rc2' +versionNumber = '01.01.rc3' sessionFile = '/var/tmp/justin.session.' + str(os.getuid()) def body(buf): diff --git a/commands/justin-rucio-upload b/commands/justin-rucio-upload index 37b7e4f..0288fe9 100755 --- a/commands/justin-rucio-upload +++ b/commands/justin-rucio-upload @@ -149,6 +149,9 @@ for dataset in datasets[1:]: [{'scope' : args['scope'], 'name' : filename }]) + except rucio.common.exception.FileAlreadyExists: + # Ok if already exists - previous attempt? + ret = 1 except Exception as e: ret = None print('--- Rucio attach_dids %s call fails: %s' % (dataset, str(e))) diff --git a/commands/justin.1 b/commands/justin.1 index a114896..d854133 100644 --- a/commands/justin.1 +++ b/commands/justin.1 @@ -112,8 +112,8 @@ process. 
.B "create-stage --workflow-id ID --stage-id ID .B --jobscript FILENAME|--jobscript-git ORG/PATH:TAG .B [--wall-seconds N] [--rss-mib N] [--processors N] [--max-distance DIST] -.B [--output-pattern PATTERN:DESTINATION] -.B [--output-pattern-next-stage PATTERN:DATASET] [--output-rse NAME] +.B [--output-pattern PATTERN[:DESTINATION]] +.B [--output-pattern-next-stage PATTERN[:DATASET]] [--output-rse NAME] .B [--lifetime-days DAYS] [--env NAME=VALUE] [--classad NAME=VALUE] .br Creates a new stage for the given workflow ID with the given stage ID. Stages @@ -157,12 +157,12 @@ to allow input files to be allocated on storages at greater distances, up to a value of 100 which represents maximally remote storages. If one or more options -.B --output-pattern PATTERN:DESTINATION +.B --output-pattern PATTERN[:DESTINATION] is given then the wrapper job will look for files created by the script which match the pattern given as PATTERN. The pattern is a Bash shell pattern using *, ? and [...] expressions. See the bash(1) Pattern Matching section for details. -The DESTINATION component has any of the variables +If given, the DESTINATION component has any of the variables $JUSTIN_SCOPE, $JUSTIN_WORKFLOW_ID, or $JUSTIN_STAGE_ID replaced. The form ${JUSTIN_SCOPE} etc may also be used. If the given DESTINATION starts with https:// then the matching output files @@ -171,19 +171,26 @@ DESTINATION must be the URL of a directory accessible via WebDAV, and given with or without a trailing slash. Nested subdirectories for workflow ID and stage ID will be added, and resulting output files placed there. The user's token from the justIN dashboard is used for the upload. -If an https:// URL is not given, DESTINATION is interpreted as a +If DESTINATION is given and is not an https:// URL, it is interpreted as a Rucio dataset minus the scope component. The overall scope of the workflow is used and the output files are uploaded with Rucio and registered in that dataset. 
If the dataset does not already exist then it will be created when the workflow changes state from submitted to running with a rule with a lifetime of .B --lifetime-days -days. Furthermore, files for Rucio-managed storage may have a corresponding +days. If the dataset name is not given, a dataset with name wXXXXsYpZ +will be created where XXXX is the workflow ID, Y is the stage, and Z is +the output pattern ID number, starting from 1. +Files for Rucio-managed storage may have a corresponding JSON metadata file with the same name but with ".json" appended, that will -be recorded in the metadata for that file in MetaCat. +be recorded in the metadata for that file in MetaCat. If this is not given, +then basic workflow metadata will still be recorded. If output files have +parent-child relations, the parent output pattern must be given before the +child so that the parents are known to MetaCat before the children declare +them to be parents. Alternatively -.B --output-pattern-next-stage PATTERN:DATASET +.B --output-pattern-next-stage PATTERN[:DATASET] can be given in which case the output file will be uploaded to Rucio-managed storage and will also be registered in the justIN Database as an unprocessed input file for the next stage and @@ -221,7 +228,7 @@ this stage. .B --jobscript FILENAME|--jobscript-git ORG/PATH:TAG .B [--wall-seconds N] .B [--rss-mib N] [--processors N] [--max-distance DIST] -.B [--output-pattern PATTERN:DESTINATION] [--output-rse NAME] +.B [--output-pattern PATTERN[:DESTINATION]] [--output-rse NAME] .B [--lifetime-days DAYS] [--env NAME=VALUE] [--classad NAME=VALUE] .br Combines the diff --git a/testing/make-justin-ups b/testing/make-justin-ups index c9aafb8..b2b763f 100755 --- a/testing/make-justin-ups +++ b/testing/make-justin-ups @@ -39,7 +39,7 @@ # testing/make-justin-ups --default # -export JUSTIN_VERSION=01.01.rc2 +export JUSTIN_VERSION=01.01.rc3 export MJU_GIT_DIR=`mktemp -d /tmp/mju_git_XXXXXX` ( cd $MJU_GIT_DIR