[Gate-users] gatejobsplitter & slurm

Kevin Kramer kevin.kramer at cern.ch
Wed Nov 11 18:30:18 CET 2015


Hi Neville,

I ran into a similar problem and never managed to get the GJS working on a cluster managed by slurm.
Here's what I ended up doing instead:

Step 1)
Split up the simulation into N parts that each cover 1/N of the total simulated time. I do this by simply creating N macro files that differ only in the start_time and stop_time values and in the output file names.
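
A minimal Python sketch of Step 1, in case it helps. It assumes the macro can be written as a template with three placeholders; the template below, the GATE commands in it, and the file names run<i>.mac and output<i> are only illustrative and need to be replaced by whatever your own macro uses.

```python
from pathlib import Path
from string import Template

# Hypothetical macro fragment: replace it with your real macro and keep
# only the three placeholders. The GATE commands shown are an assumption
# about how the time window and output name are set in your setup.
MACRO_TEMPLATE = Template("""\
# (rest of the simulation macro goes here)
/gate/output/root/setFileName $output_name
/gate/application/setTimeStart $start_s s
/gate/application/setTimeStop $stop_s s
""")


def time_windows(total_time_s, n_parts):
    """Return n_parts consecutive (start, stop) windows covering [0, total_time_s]."""
    width = total_time_s / n_parts
    return [(i * width, (i + 1) * width) for i in range(n_parts)]


def write_macros(total_time_s, n_parts, out_dir="."):
    """Write run0.mac ... run<N-1>.mac into out_dir and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, (start, stop) in enumerate(time_windows(total_time_s, n_parts)):
        text = MACRO_TEMPLATE.substitute(
            output_name=f"output{i}",
            start_s=f"{start:g}",
            stop_s=f"{stop:g}",
        )
        path = out_dir / f"run{i}.mac"
        path.write_text(text)
        paths.append(path)
    return paths
```

Calling write_macros(10, 4) would write four macros covering 0-2.5 s, 2.5-5 s, 5-7.5 s and 7.5-10 s.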

Step 2)
Create a slurm multi-prog config file. The syntax of this file is simple, with one line per task:
<task rank> <command to be executed>
In my case it looks something like this:
0 Gate run0.mac
1 Gate run1.mac
2 Gate run2.mac
3 Gate run3.mac
[...]
N-1 Gate run<N-1>.mac
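
For larger N the config file can be generated rather than typed. A small Python sketch, assuming the macros are named run<i>.mac as above and that the executable is called Gate (both are parameters you can change):

```python
def multi_prog_config(n_parts, command="Gate", macro_pattern="run{i}.mac"):
    """Return the text of a multi-prog config with one '<rank> <command>' line per task."""
    lines = [
        f"{i} {command} {macro_pattern.format(i=i)}"
        for i in range(n_parts)
    ]
    return "\n".join(lines) + "\n"


def write_multi_prog_config(path, n_parts, **kwargs):
    """Write the config text to path (the file name is up to you)."""
    with open(path, "w") as handle:
        handle.write(multi_prog_config(n_parts, **kwargs))
```

As far as I know, the srun man page also describes rank ranges and a %t placeholder for the task number, so a single line such as "0-3 Gate run%t.mac" may be enough. I have not tested that variant here.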

Step 3)
Create the usual sbatch script with all the slurm parameters etc. (requesting N tasks, so that every line of the config file gets its own task) and add the line
srun --multi-prog <your_config_file_name>
Then submit the script with the sbatch command. srun reads the config file and starts the processes as defined in there.
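
A sketch of how the sbatch script from Step 3 could be generated, again in Python. The job name, the config file name multi.conf and the choice of #SBATCH options are assumptions; your cluster will most likely need additional options (partition, time limit, memory, ...).

```python
def sbatch_script(n_parts, config_file="multi.conf", job_name="gate_split"):
    """Return the text of a minimal sbatch script that launches the multi-prog config."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        # One task per line of the multi-prog config file.
        f"#SBATCH --ntasks={n_parts}",
        "",
        f"srun --multi-prog {config_file}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file, e.g. job.sh, and running "sbatch job.sh" then submits the whole set.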


Of course you can automate this whole process; the exact implementation depends on how you've set up your simulation.
Note that you will get N output files, which you might want to merge afterwards. For ROOT files you can just use the hadd utility; for ASCII output I simply concatenate the files. I have not tried any other output formats.
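
The merging step can be scripted as well. A minimal Python sketch, assuming hadd is on your PATH and using illustrative file names; hadd_command only builds the argument list ("hadd <target> <sources...>") so that you can inspect it before running it.

```python
import shutil
import subprocess


def hadd_command(target, sources):
    """Build the hadd call that merges the ROOT files in sources into target."""
    return ["hadd", str(target), *[str(s) for s in sources]]


def merge_root(target, sources):
    """Run hadd; raises CalledProcessError if hadd reports a failure."""
    subprocess.run(hadd_command(target, sources), check=True)


def concatenate(target, sources):
    """Append the ASCII output files in sources to target, in the given order."""
    with open(target, "wb") as out:
        for src in sources:
            with open(src, "rb") as part:
                shutil.copyfileobj(part, out)
```

For example, merge_root("merged.root", [f"output{i}.root" for i in range(4)]) would merge four partial ROOT files.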
Another issue might be that the eventID starts from 0 in every process. Since that has been irrelevant for me, I have not looked for a way to change this. I'd be interested to hear if anyone has come up with a fix, though.
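
One possible workaround for ASCII output, as an untested idea rather than a known fix: shift the event IDs while merging, so that each part continues where the previous one stopped. The Python sketch below assumes whitespace-separated columns and that the event ID sits in the column given by event_col (an assumption, check it against your own output). The resulting IDs are unique across parts, but they will not necessarily match what a single unsplit run would have produced.

```python
def offset_event_ids(parts, event_col=1):
    """Merge lists of ASCII lines, shifting the event ID column of each part.

    parts is a list with one list of lines per output file, in order.
    event_col is the zero-based index of the event ID column (assumed).
    """
    merged = []
    offset = 0
    for lines in parts:
        max_id = -1
        for line in lines:
            fields = line.split()
            if not fields:
                continue  # skip empty lines
            event_id = int(fields[event_col])
            max_id = max(max_id, event_id)
            fields[event_col] = str(event_id + offset)
            # Note: columns are re-joined with single spaces.
            merged.append(" ".join(fields))
        # The next part starts right after the largest ID seen in this part.
        offset += max_id + 1
    return merged
```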

Hope this approach will work for you!
Best,
Kevin


On Tue Nov 10 21:20:40 CET 2015 Neville Eclov wrote:


Dear Gate users:

I am hoping to use the GateJobSplitter (GJS) on our computing cluster, but
the cluster uses slurm, which isn't one of the 4 clusterplatform options
listed by the GJS manual. Has anyone made a version that works with slurm?
If so, would you be willing to send it my way? Or is there some other way
to do this for slurm?

Thanks in advance for your time/effort!

Best,

Neville Eclov
