Logs generated by HTCondor after the glidein starts are shipped to an S3 endpoint every 5 minutes, and once more after the HTCondor process has been killed. The pyGlidein client generates a presigned PUT URL and a presigned GET URL for each glidein job submitted to the remote scheduler. The presigned GET URL is stored in the PRESIGNED_GET_URL HTCondor ClassAd attribute and recorded in the HTCondor history file via STARTD_ATTRS.
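As an illustration of how a glidein could ship a log file to its presigned PUT URL, here is a minimal sketch using only the Python standard library. The function names, the default content type, and the timeout are assumptions made for this example and are not taken from the pyGlidein source. Some signing schemes include the Content-Type in the signature, so the value may need to match whatever was used when the URL was generated.

```python
import urllib.request


def build_log_upload_request(presigned_put_url, log_bytes,
                             content_type="application/octet-stream"):
    """Build (but do not send) an HTTP PUT request for a presigned URL.

    Hypothetical helper: the name and the default content type are
    assumptions for this sketch, not part of pyGlidein.
    """
    return urllib.request.Request(
        presigned_put_url,
        data=log_bytes,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def upload_logs(presigned_put_url, log_bytes, timeout=30):
    """Send the PUT request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, for
    example when the presigned URL has expired.
    """
    request = build_log_upload_request(presigned_put_url, log_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

No AWS credentials are needed on the worker node in this scheme: the presigned URL itself carries the authorization, which is why the client generates it before submitting the glidein.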
pyGlidein provides an AWS CloudFormation template for creating an S3 bucket and the IAM credentials for accessing it. The Distributed High Throughput Cluster (DHTC) administrator should run this template once for each site in the DHTC. From the cloud_formation directory run:
create_logging_bucket.sh <BUCKET_NAME> <EMAIL>
<BUCKET_NAME> - The name of the S3 bucket to be generated. This should match the site name where possible.
<EMAIL> - Email address for sending S3 usage warnings. This should be the owner of the AWS account to reduce billing overages.
The output of the AWS CloudFormation template will be the newly created access_key and secret_key. This information is only accessible from the AWS web console. Share it securely with the pyGlidein site administrator.
- Add the [StartdLogging] section to the pyGlidein client configuration. Replace the bucket configuration variable with the bucket name provided by the DHTC administrator:
[StartdLogging]
send_startd_logs = True
url = s3.amazonaws.com
bucket = pyglidein-logging-wipac-dev
- Add the [StartdLogging] section to the pyGlidein client secrets file. The access_key and secret_key will be provided by the DHTC administrator:
[StartdLogging]
access_key = XXXXXXXX
secret_key = ZZZZZZZZ
To disable log shipping, add the [StartdLogging] section to the pyGlidein client configuration with send_startd_logs set to False:
[StartdLogging]
send_startd_logs = False
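To show how the [StartdLogging] settings from the configuration and secrets files fit together, the following sketch reads both with Python's standard configparser module. It is not pyGlidein's actual configuration loader; the function name load_startd_logging and the returned dictionary layout are assumptions made for illustration.

```python
import configparser

SECTION = "StartdLogging"


def load_startd_logging(config_text, secrets_text):
    """Merge [StartdLogging] settings from config and secrets text.

    Hypothetical helper for illustration; pyGlidein's own loader may differ.
    """
    config = configparser.ConfigParser()
    config.read_string(config_text)
    secrets = configparser.ConfigParser()
    secrets.read_string(secrets_text)

    # Treat a missing section or send_startd_logs = False as "disabled".
    if not config.getboolean(SECTION, "send_startd_logs", fallback=False):
        return {"send_startd_logs": False}

    settings = {
        "send_startd_logs": True,
        "url": config.get(SECTION, "url"),
        "bucket": config.get(SECTION, "bucket"),
    }
    # Credentials live in the separate secrets file.
    for key in ("access_key", "secret_key"):
        if not secrets.has_option(SECTION, key):
            raise ValueError("missing %s in [%s] of the secrets file" % (key, SECTION))
        settings[key] = secrets.get(SECTION, key)
    return settings
```

Keeping the credentials in a separate secrets file means the main configuration can be shared or version controlled without exposing the access_key and secret_key.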