[Gate-users] Adding G4MTRunManager Support to GATE
David Sarrut
David.Sarrut at creatis.insa-lyon.fr
Wed Apr 1 09:57:41 CEST 2015
Hello Marc and Alex,
first, thanks to Alex for taking on this challenge, and thanks also to Marc for
his advice!
As I said before, we currently have no resources to do this job within the
collaboration, but we are following what you are doing.
Good luck !
David
PS: I highly recommend creating a new branch in your git repository to
keep track of all your changes. It will be the only proper way to integrate
your code later.
On Wed, Apr 1, 2015 at 9:49 AM, Marc Verderi <verderi at in2p3.fr> wrote:
> Hi Alex,
>
> The part you wrote in the GateActionInitialization class is fine to me. I
> suspect that the problem (please remember I don't know the GATE code) may come
> from:
>
> new GateUserActions( runManager, myRecords );
>
> Given that the runManager is passed to the class, it may set the event action
> internally, using the non-MT methods?
>
> Cheers,
> Marc
>
>
> On 03/31/2015 08:55 PM, Alex Vergara Gil wrote:
>
>> Dear Marc
>>
>> I've managed to update a few things; however, the same message still appears
>> when running Gate, and I can't manage to get rid of it.
>> [G4-cerr]
>> -------- EEEE ------- G4Exception-START -------- EEEE -------
>> *** G4Exception : Run3011
>> issued by : G4MTRunManager::SetUserAction()
>> For multi-threaded version, define G4UserEventAction in
>> G4VUserActionInitialization.
>> *** Fatal Exception *** core dump ***
>> -------- EEEE -------- G4Exception-END --------- EEEE -------
>>
>> [G4-cerr]
>> [G4-cerr] *** G4Exception: Aborting execution ***
>> Aborted (core dumped)
>>
>> I suspect something is not initialized properly
>> Regards
>>
>> Alex
>>
>> 2015-03-31 12:09 GMT-04:00, Alex Vergara Gil <alexvergaragil at gmail.com>:
>>
>>> Dear Marc
>>>
>>> Thanks for your support, I will study these recommendations and let
>>> you know as soon as I get something new.
>>>
>>> Regards
>>> Alex
>>>
>>> 2015-03-31 11:22 GMT-04:00, Marc Verderi <verderi at in2p3.fr>:
>>>
>>>> Dear Alex,
>>>>
>>>> Thanks for your work and message. I see you have the bulk of the MT
>>>> migration in place. I put below several things to look at / consider.
>>>> Please note I know almost nothing about the Gate code itself...
>>>>
>>>> To summarize the issues, the "pCallbackMan" and "recorder"
>>>> arguments in the action initialization need some design consideration as
>>>> threads will very likely conflict on these objects. I would guess that
>>>> the most significant issues will be here. Please see below for more
>>>> details.
>>>>
>>>> I'll be happy to help more if I can, or involve some of the G4
>>>> experts on MT if needed !
>>>>
>>>> Cheers,
>>>> Marc
>>>>
>>>>
>>>> o The lines:
>>>> G4int nThreads = G4Threading::G4GetNumberOfCores();
>>>> runManager->SetNumberOfThreads(nThreads); // Is equal to 2 by default
>>>>
>>>> are correct. Please note that you may want to use this number as the
>>>> maximum number of threads, not as the default.
>>>> For debugging purposes, I would suggest starting with 2 cores only,
>>>> and increasing the number of cores once things look clean.
>>>>
>>>> o The line:
>>>> runManager->SetUserInitialization( GatePhysicsList::GetInstance() );
>>>>
>>>> looks correct to me.
>>>> One question is whether Gate has implemented "home made physics
>>>> processes" (G4VProcess) in this physics list. If so, they should comply
>>>> with the new G4VProcess interface, which has methods for the MT case.
>>>>
>>>> o There are several lines with potential problems (I have grouped the
>>>> lines concerned together):
>>>> // Set the Basic ROOT Output
>>>> GateRecorderBase* myRecords = 0;
>>>>
>>>> --> ** ROOT is not thread safe! ** For this reason, Geant4
>>>> provides, in the "analysis" package, many, but not all, of the ROOT
>>>> functionalities to create histograms and trees. The histograms are
>>>> filled in each thread, and their contents are merged at the end of the
>>>> job. The trees are dumped individually by the threads (not merged) and
>>>> should be analyzed using a chain.
>>>> As a first stage, I would recommend switching off the recording
>>>> to get the rest of the machinery right, and then adding the output
>>>> functionalities back.
>>>>
>>>>
>>>> // Set the users actions to handle callback for actors - before the initialisation
>>>> GateUserActions* myActions = new GateUserActions( runManager, myRecords );
>>>> runManager->SetUserInitialization( new GateActionInitialization( myActions, myRecords ) );
>>>> and the constructor:
>>>> GateActionInitialization(GateUserActions * cbm, GateRecorderBase * r);
>>>> together with the lines in the GateActionInitialization class that use the
>>>> arguments "pCallbackMan, recorder", especially in the Build() method.
>>>>
>>>> --> very likely this will not work. 'pCallbackMan' and 'recorder'
>>>> are the same objects, shared among the threads, given the way they are
>>>> created and passed to the action initialization. What will happen is that
>>>> they will be messaged at arbitrary times by the threads during the event
>>>> loop: thread 1 calls method a() and, while a() is being processed, method
>>>> b() is called by thread 2, and thread 3 calls a() again while it is still
>>>> being processed by thread 1. If data members are changed inside these
>>>> methods, this will result in unpredictable behavior. The recorder, which I
>>>> understand is the ROOT-based class, should be redesigned using "analysis"
>>>> to avoid these conflicts, and one instance of it (a priori) should be made
>>>> per thread to make the recording independent among these threads.
>>>> For pCallbackMan, I admit my ignorance. It looks like a
>>>> configuration class that is read-only at that point. Is this
>>>> correct? If so, this should not be too problematic. But certainly, some
>>>> iteration is needed here.
>>>>
>>>> o At first sight, the rest looks fine to me. One comment is that the
>>>> "action initialisation" mechanism also works for the usual G4RunManager,
>>>> so some of the #ifdef ... #endif blocks could be removed. In the
>>>> G4RunManager case, BuildForMaster() is ignored.
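
The "one recorder instance per thread, merged at the end of the job" idea can be sketched in plain C++, without Geant4 or ROOT. This is only an illustration under stated assumptions: `Recorder` and `RunAndMerge` are hypothetical names, the recorder is reduced to a fixed-bin counter array, and none of this is the GATE recorder or the Geant4 "analysis" API.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a recorder: a fixed-bin 1D histogram.
// It is NOT thread safe on its own; safety comes from giving each
// thread a private instance.
class Recorder {
public:
  explicit Recorder(std::size_t nBins) : bins_(nBins, 0) {}

  // Count one entry in the given bin; out-of-range bins are ignored.
  void Fill(std::size_t bin) {
    if (bin < bins_.size()) ++bins_[bin];
  }

  // Add the contents of another recorder, bin by bin.
  void Merge(const Recorder& other) {
    for (std::size_t i = 0; i < bins_.size() && i < other.bins_.size(); ++i)
      bins_[i] += other.bins_[i];
  }

  long At(std::size_t bin) const { return bins_.at(bin); }

private:
  std::vector<long> bins_;
};

// Each worker fills only its own Recorder; the merge happens after all
// workers have joined. Assumes nBins > 0.
inline Recorder RunAndMerge(int nThreads, long eventsPerThread,
                            std::size_t nBins) {
  std::vector<Recorder> perThread(nThreads, Recorder(nBins));
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([&perThread, t, eventsPerThread, nBins] {
      for (long i = 0; i < eventsPerThread; ++i)
        perThread[t].Fill(static_cast<std::size_t>(i) % nBins);
    });
  }
  for (auto& w : workers) w.join();

  Recorder total(nBins);
  for (const auto& r : perThread) total.Merge(r);  // single-threaded merge
  return total;
}
```

Calling `RunAndMerge(4, 1000, 10)` gives every thread its own histogram during the event loop, so no locking is needed while filling; only the final merge touches all of them, and it runs after the threads have finished.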
>>>>
>>>>
>>>>
>>>>
>>>> On 03/31/2015 02:57 PM, Alex Vergara Gil wrote:
>>>>
>>>>> Dear All
>>>>>
>>>>> I have managed to create a patch that makes Gate use G4MTRunManager.
>>>>> It compiles fine and runs, but it doesn't run in
>>>>> several threads. I need somebody to guide me in the right direction.
>>>>>
>>>>> Dear Marc
>>>>>
>>>>> Thanks a lot for your suggestions; they helped me a lot in creating this
>>>>> patch. Could you or some G4 member take a look at this and see what is
>>>>> happening here?
>>>>>
>>>>> Regards
>>>>> Alex
>>>>>
>>>>> PS: Dear Marc, sorry for mailing you twice; I forgot to tick the
>>>>> reply-to-all box.
>>>>>
>>>>> 2015-03-26 9:31 GMT-04:00, Marc Verderi <verderi at in2p3.fr>:
>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> The interest of G4MTRunManager is that the geometry and the
>>>>>> cross-section tables are shared among the threads. For big applications
>>>>>> (simulation of phantom irradiation is one example) this represents a
>>>>>> large amount of memory. On machines with many cores, spawning N jobs of
>>>>>> such an application may exhaust the memory, preventing the use of all the
>>>>>> available cores. By sharing geometry and cross-section tables,
>>>>>> G4MTRunManager saves a large fraction of memory, allowing many more cores
>>>>>> to be used. Some tests have been done by Geant4 on Xeon Phi, see for
>>>>>> example
>>>>>> https://twiki.cern.ch/twiki/bin/view/Geant4/MultiThreadingTaskForce#CPU_and_Memory_Performances
>>>>>> and one single application of high energy physics type (simplified CMS
>>>>>> simulation) could run smoothly with 240 threads, the maximum available
>>>>>> (the machine has 60 user cores, up to 4 threads/core). Without MT, just
>>>>>> spawning jobs, only ~30 jobs could have been run in parallel, leaving 30
>>>>>> cores unoccupied, because of lack of memory!
>>>>>>
>>>>>> Moving to multi-threading has some constraints. Each thread
>>>>>> processes a bunch of events. Events are hence generated and processed in
>>>>>> parallel, independently. This means that the primary generator action,
>>>>>> event action and stepping action must have independent instances in each
>>>>>> thread. This is the very purpose of the new class
>>>>>> G4VUserActionInitialization: the method "Build()" is called for each
>>>>>> thread, to instantiate the above actions in each of them. For the run
>>>>>> action it is a bit more complicated: a run action may be for the entire
>>>>>> application, or may be for each thread. For an "all application" action,
>>>>>> BuildForMaster() has to be used.
>>>>>> This independence of threads has a similar impact on sensitive
>>>>>> detectors: for these, the G4VUserDetectorConstruction class has a new
>>>>>> method, ConstructSDandField(). Again, this method is called for each
>>>>>> thread, so that sensitive detectors and fields live independent lives in
>>>>>> the various threads.
>>>>>> This looks like quite a lot of work, but it is not that heavy in practice.
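
The "Build() is called for each thread" idea can be modelled in plain C++. The sketch below is a stand-in only: `EventAction`, `ActionInitialization` and `RunWithThreads` are hypothetical names, not the Geant4 classes. In Geant4 itself, Build() registers the actions through SetUserAction() rather than returning them; the return value is used here just to keep the example self-contained.

```cpp
#include <memory>
#include <thread>
#include <vector>

// Stand-in for a user event action; NOT the Geant4 class.
struct EventAction {
  long eventsSeen = 0;
  // Safe without locks because each instance belongs to exactly one thread.
  void EndOfEvent() { ++eventsSeen; }
};

// Stand-in for the action-initialization idea: Build() is invoked once per
// worker thread and hands out a fresh object on every call.
struct ActionInitialization {
  std::unique_ptr<EventAction> Build() const {
    return std::unique_ptr<EventAction>(new EventAction());
  }
};

// Start nThreads workers. Each one calls Build() itself, then processes
// eventsPerThread events with its private EventAction. The instances are
// kept alive and returned so the caller can inspect them.
inline std::vector<std::unique_ptr<EventAction>>
RunWithThreads(const ActionInitialization& init, int nThreads,
               long eventsPerThread) {
  std::vector<std::unique_ptr<EventAction>> actions(nThreads);
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([&init, &actions, t, eventsPerThread] {
      actions[t] = init.Build();  // built inside the worker thread
      for (long i = 0; i < eventsPerThread; ++i) actions[t]->EndOfEvent();
    });
  }
  for (auto& w : workers) w.join();
  return actions;
}
```

With `RunWithThreads(init, 3, 500)`, three distinct `EventAction` objects exist, each having seen exactly its own 500 events. Passing one pre-built action object to all threads, by contrast, would make them all increment the same counter concurrently.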
>>>>>>
>>>>>> In practice, what also has to be taken care of in your code are
>>>>>> "static" variables: at each occurrence of a static variable, you have to
>>>>>> think about whether this variable has to be common to the entire
>>>>>> application (a "true" static), or whether it is common to the thread only
>>>>>> (a "thread local" static). In most cases, static variables are static to
>>>>>> the thread.
>>>>>> For the case of a true "static", be aware that this means that each
>>>>>> thread may access the variable at any time. If this variable is read and
>>>>>> written during the processing, it will have quite unpredictable
>>>>>> behavior, and this is a source of debugging headaches ;) . Any random
>>>>>> crash, often non-reproducible between two runs, is a sign of
>>>>>> this sort of conflict.
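
The difference between a "true" static and a "thread local" static can be shown with standard C++11 only. This is a minimal sketch with hypothetical names (`gSharedCount`, `tLocalCount`, `ProcessEvent`, `RunWorkers`); the shared counter is made atomic here precisely because every thread writes to it.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// "True" static: one instance for the whole application, visible to every
// thread. It is written concurrently, so it must be atomic (or guarded by
// a mutex); a plain long here would be a data race.
static std::atomic<long> gSharedCount{0};

// "Thread local" static: every thread gets its own independent copy,
// initialised to 0 when the thread starts.
static thread_local long tLocalCount = 0;

// Hypothetical per-event work: bump both counters once.
inline void ProcessEvent() {
  ++tLocalCount;  // private to the calling thread, no synchronisation
  gSharedCount.fetch_add(1, std::memory_order_relaxed);
}

// Run nThreads workers, each processing eventsPerThread events.
// Returns the thread-local count each worker saw at the end of its loop.
inline std::vector<long> RunWorkers(int nThreads, long eventsPerThread) {
  std::vector<long> perThread(nThreads, 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([t, eventsPerThread, &perThread] {
      for (long i = 0; i < eventsPerThread; ++i) ProcessEvent();
      perThread[t] = tLocalCount;  // each worker writes only its own slot
    });
  }
  for (auto& w : workers) w.join();
  return perThread;
}
```

After `RunWorkers(4, 1000)`, every worker reports a thread-local count of 1000, while the shared counter has grown by 4000: the thread-local variable behaves like a per-thread static, and the shared one is common to the entire application.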
>>>>>>
>>>>>> Most of the G4 examples (basic, extended) are provided in MT mode,
>>>>>> and are good starting points.
>>>>>>
>>>>>> Hope this helps.
>>>>>>
>>>>>> Cheers,
>>>>>> Marc (a G4 member)
>>>>>>
>>>>>>
>>>>>> On 03/26/2015 01:02 PM, Alex Vergara Gil wrote:
>>>>>>
>>>>>>> Dear All
>>>>>>>
>>>>>>> I started this thread to bring together all the enthusiasts who want to
>>>>>>> add G4MTRunManager support to GATE. The advantages of a multi-threaded
>>>>>>> run manager are obvious, but I will explain them here anyway. I
>>>>>>> will also send you my first patch and the problems I am facing.
>>>>>>>
>>>>>>> Advantages
>>>>>>> 1. You will not depend on external cluster software to run on a
>>>>>>> single multi-CPU PC.
>>>>>>> 2. The simulation speed scales roughly linearly with the number of
>>>>>>> CPUs.
>>>>>>> 3. You don't need to merge the outputs, since this is performed
>>>>>>> automatically.
>>>>>>> 4. Any others you may add.
>>>>>>>
>>>>>>> My first patch
>>>>>>>
>>>>>>> <start of the code>
>>>>>>>
>>>>>>> Author: Alex Vergara Gil <alexvergaragil at gmail.com> 2015-03-25 17:13:45
>>>>>>> Committer: Alex Vergara Gil <alexvergaragil at gmail.com> 2015-03-25 17:13:45
>>>>>>> Parent: db6875e64d60ad1e0f2d100c496843632acb23c8 (Merge https://github.com/OpenGATE/Gate)
>>>>>>> Child: 28c338cd3263108df3927db14c6975f4cdcc31b4 (Added the UserActionInitialization)
>>>>>>> Branch: partopc
>>>>>>> Follows:
>>>>>>> Precedes:
>>>>>>>
>>>>>>> trying g4mtRunManager
>>>>>>>
>>>>>>> ------------------- source/general/include/GateRunManager.hh -------------------
>>>>>>> index c4164d9..b72327b 100644
>>>>>>> @@ -28,12 +28,19 @@
>>>>>>> #define GateRunManager_h 1
>>>>>>>
>>>>>>> #include "G4RunManager.hh"
>>>>>>> +#ifdef G4MULTITHREADED
>>>>>>> + #include "G4MTRunManager.hh"
>>>>>>> +#endif
>>>>>>> #include "GateHounsfieldToMaterialsBuilder.hh"
>>>>>>>
>>>>>>> class GateRunManagerMessenger;
>>>>>>> class GateDetectorConstruction;
>>>>>>>
>>>>>>> +#ifdef G4MULTITHREADED
>>>>>>> +class GateRunManager : public G4MTRunManager
>>>>>>> +#else
>>>>>>> class GateRunManager : public G4RunManager
>>>>>>> +#endif
>>>>>>> {
>>>>>>> public:
>>>>>>> //! Constructor
>>>>>>> @@ -60,8 +67,11 @@ public:
>>>>>>>
>>>>>>> //! Return the instance of the run manager
>>>>>>> static GateRunManager* GetRunManager()
>>>>>>> + #ifdef G4MULTITHREADED
>>>>>>> + { return dynamic_cast<GateRunManager*>(G4MTRunManager::GetRunManager()); }
>>>>>>> + #else
>>>>>>> { return dynamic_cast<GateRunManager*>(G4RunManager::GetRunManager()); }
>>>>>>> -
>>>>>>> + #endif
>>>>>>> bool GetGlobalOutputFlag() { return mGlobalOutputFlag; }
>>>>>>> void EnableGlobalOutput(bool b) { mGlobalOutputFlag = b; }
>>>>>>> void SetUserPhysicList(G4VUserPhysicsList * m) { mUserPhysicList = m; }
>>>>>>>
>>>>>>> --------------------- source/general/src/GateRunManager.cc ---------------------
>>>>>>> index 2604e47..75b3fb5 100644
>>>>>>> @@ -8,6 +8,9 @@
>>>>>>>
>>>>>>>
>>>>>>> #include "GateRunManager.hh"
>>>>>>> +#ifdef G4MULTITHREADED
>>>>>>> + #include "G4MTRunManager.hh"
>>>>>>> +#endif
>>>>>>> #include "GateDetectorConstruction.hh"
>>>>>>> #include "GateRunManagerMessenger.hh"
>>>>>>> #include "GateHounsfieldToMaterialsBuilder.hh"
>>>>>>> @@ -27,7 +30,11 @@
>>>>>>> #endif
>>>>>>>
>>>>>>>
>>>>>>> //----------------------------------------------------------------------------------------
>>>>>>> +#ifdef G4MULTITHREADED
>>>>>>> +GateRunManager::GateRunManager():G4MTRunManager()
>>>>>>> +#else
>>>>>>> GateRunManager::GateRunManager():G4RunManager()
>>>>>>> +#endif
>>>>>>> {
>>>>>>> pMessenger = new GateRunManagerMessenger(this);
>>>>>>> mHounsfieldToMaterialsBuilder = new GateHounsfieldToMaterialsBuilder();
>>>>>>> @@ -112,7 +119,11 @@ void GateRunManager::InitializeAll()
>>>>>>>
>>>>>>> G4ProductionCutsTable::GetProductionCutsTable()->GetHighEdgeEnergy());
>>>>>>>
>>>>>>> // Initialization
>>>>>>> +#ifdef G4MULTITHREADED
>>>>>>> + G4MTRunManager::SetUserInitialization(mUserPhysicList);
>>>>>>> +#else
>>>>>>> G4RunManager::SetUserInitialization(mUserPhysicList);
>>>>>>> +#endif
>>>>>>>
>>>>>>> //To take into account the user cuts (steplimiter and special cuts)
>>>>>>> #if (G4VERSION_MAJOR > 9)
>>>>>>> @@ -126,7 +137,11 @@ void GateRunManager::InitializeAll()
>>>>>>> } // End if (mUserPhysicListName != "")
>>>>>>>
>>>>>>> // InitializePhysics
>>>>>>> +#ifdef G4MULTITHREADED
>>>>>>> + G4MTRunManager::InitializePhysics();
>>>>>>> +#else
>>>>>>> G4RunManager::InitializePhysics();
>>>>>>> +#endif
>>>>>>>
>>>>>>> // Take into account the em option set by the user (dedx bin etc)
>>>>>>> GatePhysicsList::GetInstance()->SetEmProcessOptions();
>>>>>>> @@ -169,7 +184,11 @@ void GateRunManager::InitGeometryOnly()
>>>>>>> if (!geometryInitialized)
>>>>>>> {
>>>>>>> GateMessage("Core", 1, "Initialization of geometry" << G4endl);
>>>>>>> +#ifdef G4MULTITHREADED
>>>>>>> + G4MTRunManager::InitializeGeometry();
>>>>>>> +#else
>>>>>>> G4RunManager::InitializeGeometry();
>>>>>>> +#endif
>>>>>>> }
>>>>>>> else
>>>>>>> {
>>>>>>> @@ -189,7 +208,11 @@ void GateRunManager::InitGeometryOnly()
>>>>>>>
>>>>>>> //----------------------------------------------------------------------------------------
>>>>>>> void GateRunManager::InitPhysics()
>>>>>>> {
>>>>>>> + #ifdef G4MULTITHREADED
>>>>>>> + G4MTRunManager::InitializePhysics();
>>>>>>> +#else
>>>>>>> G4RunManager::InitializePhysics();
>>>>>>> +#endif
>>>>>>> }
>>>>>>>
>>>>>>> //----------------------------------------------------------------------------------------
>>>>>>>
>>>>>>> @@ -205,7 +228,11 @@ void GateRunManager::RunInitialization()
>>>>>>>
>>>>>>> // GateMessage("Core", 0, "Initialization of the run " << G4endl);
>>>>>>> // Perform a regular initialisation
>>>>>>> + #ifdef G4MULTITHREADED
>>>>>>> + G4MTRunManager::RunInitialization();
>>>>>>> +#else
>>>>>>> G4RunManager::RunInitialization();
>>>>>>> +#endif
>>>>>>>
>>>>>>> // Initialization of the atom deexcitation processes
>>>>>>> // must be done after all other initialization
>>>>>>>
>>>>>>> </end of the code>
>>>>>>>
>>>>>>> This patch compiles without any special warnings; however, when I try
>>>>>>> to run it, it crashes with the following message
>>>>>>>
>>>>>>> <start of message>
>>>>>>> [G4]
>>>>>>> [G4] *************************************************************
>>>>>>> [G4] Geant4 version Name: geant4-10-01 [MT] (5-December-2014)
>>>>>>> [G4] << in Multi-threaded mode >>
>>>>>>> [G4] Copyright : Geant4 Collaboration
>>>>>>> [G4] Reference : NIM A 506 (2003), 250-303
>>>>>>> [G4] WWW : http://cern.ch/geant4
>>>>>>> [G4] *************************************************************
>>>>>>> [G4]
>>>>>>> [G4-cerr]
>>>>>>> -------- EEEE ------- G4Exception-START -------- EEEE -------
>>>>>>> *** G4Exception : Run3011
>>>>>>> issued by : G4MTRunManager::SetUserAction()
>>>>>>> For multi-threaded version, define G4UserEventAction in
>>>>>>> G4VUserActionInitialization.
>>>>>>> *** Fatal Exception *** core dump ***
>>>>>>> -------- EEEE -------- G4Exception-END --------- EEEE -------
>>>>>>>
>>>>>>> [G4-cerr]
>>>>>>> [G4-cerr] *** G4Exception: Aborting execution ***
>>>>>>> Aborted (core dumped)
>>>>>>> </end of message>
>>>>>>>
>>>>>>> So I wonder whether any of you has ever faced this situation and can
>>>>>>> help me.
>>>>>>> Best Regards
>>>>>>> Alex
>>>>>>> _______________________________________________
>>>>>>> Gate-users mailing list
>>>>>>> Gate-users at lists.opengatecollaboration.org
>>>>>>> http://lists.opengatecollaboration.org/mailman/listinfo/gate-users
>>>>>>>
--
David Sarrut, Phd
Directeur de recherche CNRS
CREATIS, UMR CNRS 5220, Inserm U 1044
Centre de lutte contre le cancer Léon Bérard
28 rue Laënnec, 69373 Lyon cedex 08
Tel : 04 78 78 51 51 / 06 74 72 05 42
http://www.creatis.insa-lyon.fr/~dsarrut
_________________________________
"2 + 2 = 5, for extremely large values of 2"
_________________________________