* Update hycom.py and ww3.py to properly wait for GFS grib2 files,
  with proper log/error information; the wallclock limits for
  jhafs_ocn_prep.ecf and jhafs_wav_prep.ecf are also updated to 30 min
* Update exhafs_output.sh to use correct grib2 idx format for the swath grib2 file
* Sync NCO SPA's (Simon Shiao) changes from HAFSv1 implementation testing
  - Update MAILTO in jhafs_output.ecf and JHAFS_OUTPUT to sdm@noaa.gov (same as HWRF)
  - Move the ecf/scripts/*.ecf into ecf/*.ecf
  - Some ecf manual information was added in jhafs_cleanup.ecf,
    jhafs_msg_check.ecf and launch/jhafs_launch.ecf
  - Latest modifications in hafs ecf suite definition files
  - Update parm/transfer/transfer_hafs_hfsb_1.list
  - Update to export NCP=cpfs for the gempak job (since the downstream
    MAG does not have the capability to wait for/check the HAFS gempak products)
BinLiu-NOAA committed Jun 2, 2023
1 parent 762dde5 commit c4df479
Showing 50 changed files with 15,170 additions and 352 deletions.
File renamed without changes.
510 changes: 255 additions & 255 deletions ecf/defs/hafs.def


7,285 changes: 7,285 additions & 0 deletions ecf/defs/hafs.def.para


7,411 changes: 7,411 additions & 0 deletions ecf/defs/hafs.def.prod


14 changes: 14 additions & 0 deletions ecf/scripts/jhafs_cleanup.ecf → ecf/jhafs_cleanup.ecf
@@ -27,4 +27,18 @@ ${HOMEhafs}/jobs/JHAFS_CLEANUP
%include <tail.h>

%manual
TASK cleanup

PURPOSE: Deletes the HAFS hfsa or hfsb working directory left behind by the same
cycle yesterday. This job saves a minute or so of runtime that used
to take up the beginning of the launch job.

TROUBLESHOOTING:

PROBLEM: Job has hung?

This job should not take longer than 5 minutes; all it does is an "rm
-rf". If the job takes more than 5 minutes, then it is likely that
there is a filesystem problem.

%end
71 changes: 71 additions & 0 deletions ecf/jhafs_msg_check.ecf
@@ -0,0 +1,71 @@
#PBS -N %RUN%_msg_check_%CYC%%VER:""%
#PBS -j oe
#PBS -S /bin/bash
#PBS -q %QUEUE%
#PBS -A %PROJ%-%PROJENVIR%
#PBS -l walltime=00:15:00
#PBS -l place=vscatter,select=1:ncpus=1:mpiprocs=1:ompthreads=1
#PBS -l debug=true

model=hafs
export NET="%NET%"
export RUN="%RUN%"
export cyc="%CYC%"
%include <head.h>
%include <envir-p1.h>

export TOTAL_TASKS='1'
export NCTSK='1'
export OMP_THREADS='1'

module list

export EMAIL_SDM=YES
export MAILFROM=${MAILFROM:-"nco.spa@noaa.gov"}
export MAILTO=${MAILTO:-"sdm@noaa.gov,nco.spa@noaa.gov,ncep.sos@noaa.gov"}

${HOMEhafs}/jobs/JHAFS_MSG_CHECK

%include <tail.h>

%manual
TASK msg_check

PURPOSE: Check for hurricane messages.

This job checks whether hurricane message files are generated in time. If
there is an active storm but the messages are not yet present by the specified
time close to the cycle, an alert email is sent to the SDM as a reminder to
run setup_hurricane.


TROUBLESHOOTING

This job will rarely fail since it has little to do; it just sets up
some directories and makes configuration files. If this job fails, it
is likely due to a hardware or other system issue, with one exception...

PROBLEM: Why is there no storm?

The launcher does whatever the NOAA SDM tells it to do. The NOAA SDM
uses a script called setup_hurricane to create message files read by
the launcher job's JHAFS_LAUNCH script. If the launcher job decided
not to run a storm, then that means the SDM told it not to. If there
should have been a storm, then it likely means there was a
communication problem, preventing data from getting from NHC or JTWC
to the NOAA SDM.

For NHC/CPHC storms, the NOAA SDM has a direct line to the on-call
NHC/CPHC person, who can confirm the absence of a storm. It may then
be possible to manually edit the message and nstorms files to add the
storm in. Alternatively, you could rerun setup_hurricane, but that
may change storm priorities. In that case, you must rerun the entire
HAFS cycle (all storm slots). All of this is technically possible;
there may be procedural reasons why it cannot be done.

For JTWC storms, the NOAA SDM has a direct line to the JTWC duty
officer, but there isn't much that can be done. JTWC lacks any way to
send vitals after the T+3 deadline. Hence, a JTWC storm that is missed
is missed.

%end
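The alert logic described in the manual above can be sketched as follows. This is an illustrative Python sketch, not the operational JHAFS_MSG_CHECK code; the message-file path, the 45-minute cutoff, and the function name are all assumptions.

```python
import os
from datetime import datetime, timedelta

def need_sdm_alert(message_file, cycle, now, cutoff_minutes=45, active_storm=True):
    """Return True when an alert email should go to the SDM:
    an active storm, no message file yet, and the cutoff has passed."""
    if not active_storm:
        return False              # nothing to run, so nothing to alert on
    if os.path.isfile(message_file):
        return False              # message arrived; the launcher can proceed
    deadline = cycle + timedelta(minutes=cutoff_minutes)
    return now >= deadline        # past the cutoff with no message file

# Illustrative check for the 12Z cycle, 50 minutes in (path is hypothetical):
cycle = datetime(2023, 6, 2, 12)
print(need_sdm_alert("/path/to/message1", cycle, now=datetime(2023, 6, 2, 12, 50)))
```

The real job would also compose and send the email; here only the decision is modeled.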
96 changes: 96 additions & 0 deletions ecf/launch/jhafs_launch.ecf
@@ -0,0 +1,96 @@
#PBS -N %RUN%%STORMNUM%_launch_%CYC%%VER:""%
#PBS -j oe
#PBS -S /bin/bash
#PBS -q %QUEUE%
#PBS -A %PROJ%-%PROJENVIR%
#PBS -l walltime=00:15:00
#PBS -l place=vscatter,select=1:ncpus=1:mpiprocs=1:mem=10G
#PBS -l debug=true

model=hafs
export NET="%NET%"
export RUN="%RUN%"
export cyc="%CYC%"
%include <head.h>
%include <envir-p1.h>

export storm_num="%STORMNUM%"

module load PrgEnv-intel/${PrgEnv_intel_ver}
module load craype/${craype_ver}
module load intel/${intel_ver}
module load cray-mpich/${cray_mpich_ver}
module load cray-pals/${cray_pals_ver}
module load hdf5/${hdf5_ver}
module load netcdf/${netcdf_ver}
module load python/${python_ver}
module load crtm/${crtm_ver}
module load udunits/${udunits_ver}
module load gsl/${gsl_ver}
module load nco/${nco_ver}
module list

${HOMEhafs}/jobs/JHAFS_LAUNCH

%include <tail.h>

%manual
TASK launch

PURPOSE: Creates initial directory structure and configures the
rest of the workflow for one storm.

This job will delete and recreate the $DATA work area for one HAFS
storm. It sets several flags to turn on and off parts of the
workflow, or disable the entire workflow if there is no storm. All
logic is triggered by the message file sent by the SDM via
setup_hurricane.

Labels:

stormN - storm1-storm7 for HFSA, storm1-storm5 for HFSB; the label tells
whether a storm is to be run. If the storm is to be run, it tells
what the storm is and who sent it (NHC or JTWC).

Events:

NoStorm - set if no storm is to be run. The rest of the workflow sees
the event and automatically completes via ecFlow completion clauses.

Ocean - set for both HFSA and HFSB, which have ocean coupling with HYCOM

Wave - set when wave coupling is on (only HFSA runs it, for NHC basin storms)

Analysis - set for NHC basin storms for both HFSA and HFSB; for HFSA
with JTWC storms, data assimilation is turned off
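The NoStorm event and the completion behavior described above can be sketched in ecFlow suite-definition form. The node names below are illustrative, not the operational hafs.def:

```
suite hafs
  family storm1
    task launch
      event NoStorm            # set by JHAFS_LAUNCH when no storm is to run
    task atm_prep
      complete launch:NoStorm  # auto-complete when launch signals no storm
  endfamily
endsuite
```

Every downstream node carries a similar complete clause, so a NoStorm cycle drains the whole storm family without running anything.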

TROUBLESHOOTING

This job will rarely fail since it has little to do; it just sets up
some directories and makes configuration files. If this job fails, it
is likely due to a hardware or other system issue, with one exception...

PROBLEM: Why is there no storm?

The launcher does whatever the NOAA SDM tells it to do. The NOAA SDM
uses a script called setup_hurricane to create message files read by
the launcher job's JHAFS_LAUNCH script. If the launcher job decided
not to run a storm, then that means the SDM told it not to. If there
should have been a storm, then it likely means there was a
communication problem, preventing data from getting from NHC or JTWC
to the NOAA SDM.

For NHC/CPHC storms, the NOAA SDM has a direct line to the on-call
NHC/CPHC person, who can confirm the absence of a storm. It may then
be possible to manually edit the message and nstorms files to add the
storm in. Alternatively, you could rerun setup_hurricane, but that
may change storm priorities. In that case, you must rerun the entire
HAFS cycle (all storm slots). All of this is technically possible;
there may be procedural reasons why it cannot be done.

For JTWC storms, the NOAA SDM has a direct line to the JTWC duty
officer, but there isn't much that can be done. JTWC lacks any way to
send vitals after the T+3 deadline. Hence, a JTWC storm that is missed
is missed.

%end
@@ -35,6 +35,8 @@ module load wgrib2/${wgrib2_ver}
module load gempak/${gempak_ver}
module list

export NCP=cpfs

${HOMEhafs}/jobs/JHAFS_GEMPAK

%include <tail.h>
@@ -35,7 +35,7 @@ module list

export EMAIL_SDM=YES
export MAILFROM=${MAILFROM:-"nco.spa@noaa.gov"}
export MAILTO=${MAILTO:-"sdm@noaa.gov,nco.spa@noaa.gov,ncep.sos@noaa.gov"}
export MAILTO=${MAILTO:-"sdm@noaa.gov"}

${HOMEhafs}/jobs/JHAFS_OUTPUT

@@ -3,7 +3,7 @@
#PBS -S /bin/bash
#PBS -q %QUEUE%
#PBS -A %PROJ%-%PROJENVIR%
#PBS -l walltime=01:00:00
#PBS -l walltime=00:30:00
#PBS -l place=vscatter:excl,select=1:ncpus=40:mpiprocs=40:ompthreads=1
#PBS -l debug=true

@@ -3,7 +3,7 @@
#PBS -S /bin/bash
#PBS -q %QUEUE%
#PBS -A %PROJ%-%PROJENVIR%
#PBS -l walltime=01:00:00
#PBS -l walltime=00:30:00
#PBS -l place=vscatter:excl,select=1:ncpus=40:mpiprocs=40:ompthreads=1
#PBS -l debug=true

32 changes: 0 additions & 32 deletions ecf/scripts/jhafs_msg_check.ecf

This file was deleted.

38 changes: 0 additions & 38 deletions ecf/scripts/launch/jhafs_launch.ecf

This file was deleted.

6 changes: 3 additions & 3 deletions jobs/JHAFS_LAUNCH
@@ -104,9 +104,9 @@ else
abort_reason="Message file cycle $mYMDH is not current cycle $PDY$cyc."
fi
# Under NCO environment, HFSB only runs NHC/CPHC storms
if [[ "${RUN_ENVIR^^}" = "NCO" ]] && [[ "${RUN^^}" = "HFSB" ]] && [[ "$center" != "NHC" ]]; then
abort_reason="The storm in $message file is not a NHC storm for ${RUN^^} to run."
fi
if [[ "${RUN_ENVIR^^}" = "NCO" ]] && [[ "${RUN^^}" = "HFSB" ]] && [[ "$center" != "NHC" ]]; then
abort_reason="The storm in $message file is not a NHC storm for ${RUN^^} to run."
fi
fi

if [[ -n "$abort_reason" ]]; then
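The abort check in the JHAFS_LAUNCH hunk above (under the NCO environment, HFSB only runs NHC storms) can be restated as a small Python sketch. The names mirror the shell variables, but this is illustrative, not the operational script:

```python
def check_storm(run_envir, run, center, message="message1"):
    """Return an abort reason string, or None if the storm may run.

    Mirrors the shell test:
      [[ "${RUN_ENVIR^^}" = "NCO" ]] && [[ "${RUN^^}" = "HFSB" ]] &&
      [[ "$center" != "NHC" ]]
    """
    if run_envir.upper() == "NCO" and run.upper() == "HFSB" and center != "NHC":
        return ("The storm in %s file is not a NHC storm for %s to run."
                % (message, run.upper()))
    return None

print(check_storm("nco", "hfsb", "JTWC"))  # non-NHC storm under NCO HFSB: abort reason
print(check_storm("nco", "hfsb", "NHC"))   # NHC storm: allowed (None)
```

In the real job, a non-empty abort_reason makes the launcher raise the NoStorm path rather than run the storm.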
2 changes: 1 addition & 1 deletion jobs/JHAFS_OUTPUT
@@ -17,7 +17,7 @@ export SENDDBN=${SENDDBN:-NO}
export SENDECF=${SENDECF:-NO}
export EMAIL_SDM=${EMAIL_SDM:-NO}
export MAILFROM=${MAILFROM:-"nco.spa@noaa.gov"}
export MAILTO=${MAILTO:-"sdm@noaa.gov,nco.spa@noaa.gov,ncep.sos@noaa.gov"}
export MAILTO=${MAILTO:-"sdm@noaa.gov"}
export SCRUBDATA=${SCRUBDATA:-YES}
# HAFS workflow jobs use shared working dir, and the CLEANUP or SCRUB job will clean up WORKhafs
#export KEEPDATA=${KEEPDATA:-YES}
2 changes: 1 addition & 1 deletion parm/transfer/transfer_hafs_hfsb_1.list
@@ -23,7 +23,7 @@
# directory are included, so if no exclude patterns match that file, it will be
# transferred.

_COMROOT_/hafs/_SHORTVER_/inphfsa/
_COMROOT_/hafs/_SHORTVER_/inphfsb/
B 20

_COMROOT_/hafs/_SHORTVER_/hfsb._PDYm1_/00/
3 changes: 2 additions & 1 deletion scripts/exhafs_gempak.sh
@@ -117,7 +117,8 @@ ${GEMEXE}/gpend
status=$?; [[ $status -ne 0 ]] && exit $status

if [ "${SENDCOM^^}" = "YES" ]; then
${NCP} -p $GDOUTF $GPOUTF
${NCP} $GDOUTF $GPOUTF
# ${NCP} -p $GDOUTF $GPOUTF
if [ "${SENDDBN^^}" = "YES" ]; then
$DBNROOT/bin/dbn_alert MODEL ${RUN^^}_GEMPAK $job $GPOUTF
fi
2 changes: 1 addition & 1 deletion scripts/exhafs_output.sh
@@ -203,7 +203,7 @@ done
${WGRIB2} ${swath_grb2file} -set_grib_type c2 -grib_out ${swath_grb2file}.c2
mv ${swath_grb2file}.c2 ${swath_grb2file}
# Generate the index file for the swath grib2 file
${GRB2INDEX} ${swath_grb2file} ${swath_grb2indx}
${WGRIB2} -s ${swath_grb2file} > ${swath_grb2indx}

# Deliver to COMhafs
if [ $SENDCOM = YES ]; then
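The one-line change in exhafs_output.sh above swaps grb2index for a wgrib2 short inventory (`wgrib2 -s file > idx`), which is the grib2 idx format the commit message says the swath file needs. A minimal Python sketch of building that command; the helper name and file names are assumptions, while the wgrib2 flag comes from the diff:

```python
import shlex

def grib2_idx_command(grb2file, idxfile):
    """Build the shell command equivalent of: ${WGRIB2} -s ${file} > ${idx}"""
    return "wgrib2 -s %s > %s" % (shlex.quote(grb2file), shlex.quote(idxfile))

cmd = grib2_idx_command("hfsa.swath.grb2", "hfsa.swath.grb2.idx")
print(cmd)
# To actually generate the index (requires wgrib2 on PATH):
# subprocess.run(cmd, shell=True, check=True)
```

The `-s` inventory lists one record per line (number, byte offset, date, variable, level), which is what downstream grib2 idx readers parse.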
13 changes: 9 additions & 4 deletions ush/hafs/hycom.py
@@ -950,6 +950,8 @@ def getges1(self,atmosds,grid,time):
atime=atime0
rinput=self.rtofs_inputs
glocset=0
maxwait=self.confint('max_grib_wait',900)
sleeptime=self.confint('grib_sleep_time',20)
for itry in range(0,-10,-1):
if time>atime+epsilon:

@@ -964,18 +964,21 @@ def getges1(self,atmosds,grid,time):
repr(atmosds),repr(grid),repr(time),
repr(gloc)))
return (gloc)
if itry<=-9:
if wait_for_files([gloc],logger=logger,maxwait=60,sleeptime=5):
if itry<=0:
if wait_for_files([gloc],logger=logger,maxwait=maxwait,sleeptime=sleeptime):
logger.info('%s %s %s => %s'%(
repr(atmosds),repr(grid),repr(time),
repr(gloc)))
return (gloc)
else:
logger.warning('%s : do not exist or empty'%(gloc))
msg='FATAL ERROR: %s: did not exist or was too small after %d seconds'%(gloc,maxwait)
logger.error(msg)
raise hafs.exceptions.NoOceanData(msg)
sys.exit(2)
else:
logger.warning('%s<=%s+%s'%(repr(time),repr(atime),repr(epsilon)))
atime=atime-sixhrs
msg='Cannot find file for time %s; first file tried %s'%(time.strftime('%Y%m%d%H'),gloc0)
msg='FATAL ERROR: Cannot find file for time %s; first file tried %s'%(time.strftime('%Y%m%d%H'),gloc0)
self.log().error(msg)
raise hafs.exceptions.NoOceanData(msg)

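A minimal sketch of what a wait_for_files-style helper with the configurable max_grib_wait/grib_sleep_time settings in the hunk above does: poll until every file exists and is non-empty, or give up after maxwait seconds. This is illustrative and is not the produtil implementation HAFS actually uses.

```python
import os, time

def wait_for_files(files, maxwait=900, sleeptime=20, logger=None):
    """Return True once all files exist and are non-empty, else False after maxwait s."""
    deadline = time.time() + maxwait
    while True:
        # A file "counts" only if it exists and has non-zero size.
        missing = [f for f in files
                   if not (os.path.isfile(f) and os.path.getsize(f) > 0)]
        if not missing:
            return True
        if time.time() >= deadline:
            if logger:
                logger.warning("gave up waiting for: %s" % ", ".join(missing))
            return False
        # Never sleep past the deadline.
        time.sleep(min(sleeptime, max(0.0, deadline - time.time())))
```

With the change above, a missing GFS grib2 file produces a clear FATAL ERROR and a NoOceanData exception instead of a silent failure, and the 900 s / 20 s defaults can be tuned via the max_grib_wait and grib_sleep_time config values.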
