[Gate-users] Condor_hold and condor_release GATE simulation on a cluster

Zhengzhi Liu zliu36 at stanford.edu
Thu Apr 9 18:44:34 CEST 2020


Hi Mathieu,

Thank you very much for the excellent answer! I tried the combination of
condor_suspend and condor_continue. They worked very well and did
exactly what their names suggest. condor_hold and condor_release do a
slightly different job.
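
For anyone who wants to script this around working hours, here is a
minimal sketch in Python. It only builds and optionally runs the two
commands named above. The job ID "1234.0" and the helper names
build_command and run_action are hypothetical, and dry_run defaults to
True so nothing is sent to the scheduler by accident. As far as I
understand, the difference from condor_hold is that a held job is
removed from the machine and starts over when released, which would
explain the lost data.

```python
import subprocess

# Map a readable action name to the HTCondor tool that performs it.
TOOLS = {"suspend": "condor_suspend", "continue": "condor_continue"}


def build_command(action, job_id):
    """Return the argument list for suspending or continuing one job.

    job_id is a cluster or cluster.process identifier, e.g. "1234.0"
    (a made-up value used only for illustration).
    """
    if action not in TOOLS:
        raise ValueError(f"unknown action: {action!r}")
    return [TOOLS[action], str(job_id)]


def run_action(action, job_id, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    cmd = build_command(action, job_id)
    print(" ".join(cmd))
    if not dry_run:
        # check=True raises CalledProcessError if the tool reports failure.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling run_action("suspend", "1234.0", dry_run=False) in the morning and
run_action("continue", "1234.0", dry_run=False) in the evening (for
example from cron) would be one way to use it.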

Thank you very much.
Sincere greetings,

Zhengzhi

On Thu, Apr 9, 2020 at 12:02 AM Mathieu Dupont <mdupont at cppm.in2p3.fr>
wrote:

> Hi,
>
> Without Condor, my first idea would be to send the SIGTSTP signal to
> your GATE processes run by Condor, and the SIGCONT signal to resume
> them.
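
The signal idea quoted above can be sketched in Python as follows. This
is only an illustration under assumptions: the function names pause and
resume are hypothetical, the caller must already know the PID of each
GATE process, and the sender needs permission to signal it. SIGTSTP is
the default because that is the signal suggested above; a program may
catch or ignore it, so SIGSTOP (which cannot be caught) can be passed
instead.

```python
import os
import signal


def pause(pid, sig=signal.SIGTSTP):
    """Ask the process with the given PID to stop.

    Pass sig=signal.SIGSTOP for the variant that cannot be caught
    or ignored by the target process.
    """
    os.kill(pid, sig)


def resume(pid):
    """Let a stopped process continue where it left off."""
    os.kill(pid, signal.SIGCONT)
```

A stopped process keeps its memory and open files, so the simulation
carries on from the same point once resume is called.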
>
> Looking at the Condor documentation, I found the commands condor_suspend
> (https://htcondor.readthedocs.io/en/stable/man-pages/condor_suspend.html)
> and condor_continue
> (https://htcondor.readthedocs.io/en/stable/man-pages/condor_continue.html)
> which seem to do this. Maybe you can try them?
>
>
>
> --------
> On Wed, 8 Apr 2020 15:46:52 -0700
> Zhengzhi Liu <zliu36 at stanford.edu> wrote:
>
> > Dear Gate users,
> >
> > For some GATE simulations, the runtime can be as long as a couple of
> > days even on a 56-core cluster. However, I can't let my GATE
> > simulation occupy all the cores on the cluster during working hours
> > since other colleagues are also using the machine. I therefore tried
> > to hold my GATE simulation during working hours and resume it
> > later. The commands I found to achieve this goal
> > are condor_hold
> > <https://www.cl.cam.ac.uk/manuals/condor-V6_8_3-Manual/condor_hold.html>
> > and condor_release
> > <https://www.cl.cam.ac.uk/manuals/condor-V6_8_3-Manual/condor_release.html#man-condor-release>.
> > Everything works in that condor_hold puts my GATE jobs on hold
> > and condor_release resumes the GATE simulation, except that running
> > condor_release wipes the existing data.
> >
> > I might have misunderstood the function of condor_hold. Honestly, I
> > don't fully understand its description. It might have killed the GATE
> > program. Does any GATE expert know how to pause a GATE simulation
> > and resume it at a later time, if this is possible?
> >
> > Thank you very much for any help.
> > Sincere wishes,
> >
> > Zhengzhi
>
>
>
> --
> Mobilized against the pension reform and the LPPR
> --
> Mathieu Dupont - Research Engineer
> CENTRE DE PHYSIQUE DES PARTICULES DE MARSEILLE
> UMR 7346 - Aix-Marseille Université - CNRS/IN2P3
> 163 avenue de Luminy, Case 902, F-13288 Marseille CEDEX 09
> Tel.: +33 (0) 4 91 82 72 19
> Website: cppm.in2p3.fr - Email: mdupont at cppm.in2p3.fr
>