Kestrel #405
@@ -0,0 +1,16 @@
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --tmp=1000000
Review comment: This line tells slurm to give us a node with at least 1,000,000 MB (~1 TB) of local scratch disk (`--tmp` is specified in MB).

echo "Job ID: $SLURM_JOB_ID"
echo "Hostname: $HOSTNAME"
echo "QOS: $SLURM_JOB_QOS"

df -i
df -h

module load python apptainer
source "$MY_PYTHON_ENV/bin/activate"
Comment on lines +13 to +14: You'll notice I abandoned conda as our python package and environment manager. There was too much trouble between it and pip when installing buildstockbatch. I opted to go with the system-installed python (3.11) and use a venv (a sketch of that setup follows this diff).

Review comment: I'm trying to figure out if we actually still use ruby natively outside of the container, but it looks like not.

time python -u -m buildstockbatch.hpc kestrel "$PROJECTFILE"
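As context for the venv switch mentioned in the comment above, here is a minimal sketch of how such an environment could be set up on Kestrel. This is illustrative only and not part of the PR: the env path and install steps are assumptions, and only `module load python` and the activate line are taken from the scripts in this diff.

```bash
# Hypothetical one-time setup for the venv that the job scripts activate.
module load python                                 # system python (3.11) on Kestrel
export MY_PYTHON_ENV="$HOME/envs/buildstockbatch"  # assumed location for the venv
python -m venv "$MY_PYTHON_ENV"                    # plain venv instead of conda
source "$MY_PYTHON_ENV/bin/activate"
pip install --upgrade pip
pip install buildstockbatch                        # pip-only install, no conda/pip mixing
```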
@@ -0,0 +1,34 @@
#!/bin/bash
#SBATCH --tmp=1000000
Review comment: This line tells slurm to give us nodes with at least 1,000,000 MB (~1 TB) of local scratch disk each.

echo "begin kestrel_postprocessing.sh"

echo "Job ID: $SLURM_JOB_ID"
echo "Hostname: $HOSTNAME"

df -i
df -h

module load python apptainer
source "$MY_PYTHON_ENV/bin/activate"

export POSTPROCESS=1

echo "UPLOADONLY: ${UPLOADONLY}"
echo "MEMORY: ${MEMORY}"
echo "NPROCS: ${NPROCS}"

SCHEDULER_FILE=$OUT_DIR/dask_scheduler.json

echo "head node"
echo $SLURM_JOB_NODELIST_PACK_GROUP_0
echo "workers"
echo $SLURM_JOB_NODELIST_PACK_GROUP_1

pdsh -w $SLURM_JOB_NODELIST_PACK_GROUP_1 "free -h"
pdsh -w $SLURM_JOB_NODELIST_PACK_GROUP_1 "df -i; df -h"

$MY_PYTHON_ENV/bin/dask scheduler --scheduler-file $SCHEDULER_FILE &> $OUT_DIR/dask_scheduler.out &
pdsh -w $SLURM_JOB_NODELIST_PACK_GROUP_1 "$MY_PYTHON_ENV/bin/dask worker --scheduler-file $SCHEDULER_FILE --local-directory /tmp/scratch/dask --nworkers ${NPROCS} --nthreads 1 --memory-limit ${MEMORY}MB" &> $OUT_DIR/dask_workers.out &
Review comment: This wasn't working for me when the python environment was on …

time python -u -m buildstockbatch.hpc kestrel "$PROJECTFILE"
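The script above stands up a dask scheduler on the head node and workers on the second het-group of nodes via pdsh, coordinating through a shared scheduler file. As a hedged illustration (not part of this PR), a client process could confirm the cluster came up by reading that same file; `Client(scheduler_file=...)` is the standard dask.distributed API for this.

```bash
# Hypothetical sanity check that the scheduler and workers registered,
# using the same scheduler file the launch commands above point at.
SCHEDULER_FILE=$OUT_DIR/dask_scheduler.json
python - <<EOF
from dask.distributed import Client

# Reads the scheduler address out of the JSON file written by `dask scheduler`.
client = Client(scheduler_file="$SCHEDULER_FILE")
print(client)  # summarizes connected workers, threads, and memory
EOF
```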
@@ -6,6 +6,7 @@ weather_files_url: str(required=False)
 sampler: include('sampler-spec', required=True)
 workflow_generator: include('workflow-generator-spec', required=True)
 eagle: include('hpc-spec', required=False)
+kestrel: include('hpc-spec', required=False)
Review comment: Just add a `kestrel` key that accepts the same `hpc-spec` as `eagle`.
 aws: include('aws-spec', required=False)
 output_directory: regex('^(.*\/)?[a-z][a-z0-9_]*\/?$', required=True)
 sys_image_dir: str(required=False)
@@ -48,7 +49,7 @@ hpc-spec:
 hpc-postprocessing-spec:
   time: int(required=True)
   n_workers: int(min=1, max=32, required=False)
-  node_memory_mb: enum(85248, 180224, 751616, required=False)
+  node_memory_mb: int(min=85248, max=751616, required=False)
   n_procs: int(min=1, max=36, required=False)
   parquet_memory_mb: int(min=100, max=4096, required=False)
Review comment: Renamed this file from `eagle.py` ➡️ `hpc.py`.