[Gate-users] Notes on GATE + Docker + HT Condor
Miller, Michael
Michael.a.Miller at philips.com
Tue Jan 22 16:26:26 CET 2019
Hi All,
I’m using the GATE docker container with HT condor, and want to pass along how I’m managing it. It works very well for a condor cluster with completely independent nodes, with no need to share users, disks, or executables. The sequence I’m using goes like this:
1. Create GATE macros, using aliases for anything you want to vary between jobs, including the simulation start and stop times.
2. Write a shell script that will run the GATE job inside the container. This script will:
   a. Perform any configuration needed to run GATE
   b. Run GATE with the -a option to set aliases
3. Write one or more condor job files (*.submit) that:
   a. Specify the condor docker universe and docker image
   b. Define the output, log and error files for the job
   c. Define the files to be transferred into the container before running the GATE job
   d. Define the files to be transferred out of the container after running the GATE job
4. Submit the jobs to condor.
5. Condor handles transferring files in and out of the container, and running the jobs.
I automate this with a Python script that writes the shell scripts and the job files, and submits the condor jobs. Such a script serves the purpose of the GATE job splitter, gjs, and allows additional control if you’d like to vary more than the start and stop times among jobs, be it phantom geometry, activity, or whatever you think up.
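To illustrate varying more than the time window, here is a minimal sketch of building the -a argument from a list of alias name/value pairs. The helper name build_alias_argument and the PhantomActivity alias are hypothetical, chosen only for illustration; the macro would need to reference whatever aliases you actually define.

```python
def build_alias_argument(aliases):
    """Format (name, value) pairs as a single Gate -a argument.

    Produces the [name,value][name,value]... form used on the Gate
    command line in this post.
    """
    return "".join("[{},{}]".format(name, value) for name, value in aliases)


# Hypothetical set of aliases for one job. PhantomActivity is an
# illustrative name, not something GATE defines.
aliases = [
    ("SimulationStartTime", 0.0),
    ("SimulationStopTime", 0.5),
    ("OutputBaseName", "simulation-split000"),
    ("PhantomActivity", 1000000),
]

command = "Gate -a " + build_alias_argument(aliases) + " simulation.mac"
print(command)
```

Keeping the aliases in one list per job makes it easy to add a new varying parameter without touching the code that writes the shell script.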
Example shell script that will run the gate job inside the container (split000.sh):
#!/usr/bin/bash --login
echo running $0 on `hostname` at `date`
cd `dirname $0`
echo pwd = `pwd`
source setup_gate_container_env.sh
Gate -a [SimulationStartTime,0.0][SimulationStopTime,1][OutputBaseName,simulation-split000] simulation.mac
Example condor job file (split000.submit):
universe = docker
executable = split000.sh
docker_image = opengatecollaboration/gate
output = simulation-split000.out
error = simulation-split000.err
log = simulation-split000.log
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = <comma-separated list of all files needed>
transfer_output_files = split000.root
queue
Transfer_input_files is a comma-separated list of everything needed to run the macro simulation.mac. It must include all macros that you might /control/execute, the materials database file, and other input files, including setup_gate_container_env.sh which is sourced by the shell script.
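Since a missing input only shows up once the job is already running in the container, it can help to check the list before submitting. The following is a rough sketch, assuming all inputs live in one directory; the function name build_transfer_input_files and its parameters are my own, not part of condor or GATE.

```python
import glob
import os


def build_transfer_input_files(required, pattern="*.mac", directory="."):
    """Return the comma-separated value for transfer_input_files.

    `required` lists files that must exist (for example the materials
    database and the environment setup script). Every file in
    `directory` matching `pattern` is appended after them.
    Raises IOError if a required file is missing.
    """
    missing = [name for name in required
               if not os.path.isfile(os.path.join(directory, name))]
    if missing:
        raise IOError("missing input files: " + ", ".join(missing))

    # Sort the macros so the resulting list is reproducible between runs.
    macros = sorted(os.path.basename(path)
                    for path in glob.glob(os.path.join(directory, pattern)))
    names = list(required) + [m for m in macros if m not in required]
    return ",".join(names)
```

For the example above, this would be called as build_transfer_input_files(['GateMaterials.db', 'setup_gate_container_env.sh']) from the directory that holds the macros.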
Transfer_output_files is a comma-separated list of everything you’d like to have copied out of the container after the script has completed. Here it is just split000.root, but it could include other output files or visualization results.
After submitting to condor with condor_submit split000.submit, the job runs and all the file copying in and out of container is handled by condor.
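Once the jobs have finished, a quick check of which output files actually came back can save some digging through logs. This is only a sketch: find_missing_outputs is a hypothetical helper, and it assumes you know the output file names you listed in transfer_output_files.

```python
import os


def find_missing_outputs(expected_files, directory="."):
    """Return the expected output files that are absent or empty."""
    missing = []
    for name in expected_files:
        path = os.path.join(directory, name)
        # Treat a zero-length file as missing, since it carries no data.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

Anything it reports can then be traced through the corresponding .err and .log files and resubmitted with condor_submit.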
Now, I don’t create these scripts and submit files by hand; I use a Python script that looks like the following (submit-splits.py):
#!/usr/bin/env python
# print_function lets the same script run under Python 2.7 and Python 3
from __future__ import print_function

import glob
import os
import stat
import subprocess

# Simulation start time [s]
start_time = 0.0
# Simulation stop time [s]
stop_time = 300.0
# Number of jobs to submit to condor
Nsplits = 600
split_duration = (stop_time - start_time)/Nsplits
basename = 'simulation'
GATEMacro = basename + '.mac'

for i in range(Nsplits):
    # Offset by start_time so that a non-zero start time is handled correctly
    t1 = start_time + i*split_duration
    t2 = t1 + split_duration
    print("Writing scripts for split #", i, 'of', Nsplits)
    # Zero-pad the split index so that file names sort in job order
    OutputBaseName = basename + '-split' + str(i).zfill(len(str(Nsplits)))
    script_file = OutputBaseName + '.sh'
    submit_file = OutputBaseName + '.submit'
    log_file = OutputBaseName + '.log'
    output_file = OutputBaseName + '.out'
    error_file = OutputBaseName + '.err'

    # Write the script that will be run in the docker container:
    print("Writing", script_file)
    with open(script_file, 'w') as f:
        f.write("#!/usr/bin/bash --login\n")
        f.write("echo running $0 on `hostname` at `date`\n")
        f.write("cd `dirname $0`\n")
        f.write("echo pwd = `pwd`\n")
        f.write("source setup_gate_container_env.sh\n")
        f.write("Gate -a ")
        f.write("[SimulationStartTime," + str(t1) + "]")
        f.write("[SimulationStopTime," + str(t2) + "]")
        f.write("[OutputBaseName," + OutputBaseName + "]")
        f.write(" " + GATEMacro + "\n")

    # Set permissions so that condor can execute the script:
    stats = os.stat(script_file)
    os.chmod(script_file, stats.st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    # Make a list of macros to be transferred for the condor jobs.
    # Since condor's transfer_input_files cannot handle wild cards,
    # we'll create a comma-separated list here:
    input_files = []
    input_files.append('GateMaterials.db')
    input_files.append('setup_gate_container_env.sh')
    for macro_file in glob.glob('*.mac'):
        input_files.append(macro_file)

    # Write the condor submit file:
    print("Writing", submit_file)
    with open(submit_file, 'w') as f:
        f.write("universe = docker\n")
        f.write("executable = " + script_file + "\n")
        f.write("docker_image = opengatecollaboration/gate\n")
        f.write("output = " + output_file + "\n")
        f.write("error = " + error_file + "\n")
        f.write("log = " + log_file + "\n")
        f.write("should_transfer_files = YES\n")
        f.write("when_to_transfer_output = ON_EXIT\n")
        f.write("transfer_input_files = " + ','.join(input_files) + "\n")
        f.write("transfer_output_files = " + OutputBaseName + ".root\n")
        f.write("queue\n")

    # Submit the job to condor:
    print("Submitting", submit_file)
    subprocess.call(["condor_submit", submit_file])
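The time-splitting arithmetic in the script can also be pulled out on its own, which makes it easy to inspect the windows before anything is submitted. Below is a minimal sketch; split_windows is a hypothetical helper, not part of GATE or gjs. It computes each boundary directly from the split index, so adjacent windows share exactly the same boundary value, and it pins the last window to the requested stop time.

```python
def split_windows(start_time, stop_time, nsplits):
    """Return a list of (label, t1, t2) tuples for nsplits equal windows.

    label is the zero-padded split index, for example '000' when
    nsplits is 600.
    """
    duration = (stop_time - start_time) / float(nsplits)
    width = len(str(nsplits))
    windows = []
    for i in range(nsplits):
        t1 = start_time + i * duration
        # The last window ends exactly at stop_time, avoiding any
        # floating-point drift from the multiplication.
        if i == nsplits - 1:
            t2 = stop_time
        else:
            t2 = start_time + (i + 1) * duration
        windows.append((str(i).zfill(width), t1, t2))
    return windows


# Small example: four windows covering 0 s to 300 s.
for label, t1, t2 in split_windows(0.0, 300.0, 4):
    print(label, t1, t2)
```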
Hope some of you find this helpful.
Mike