[Gate-users] Adding G4MTRunManager Support to GATE
Alex Vergara Gil
alexvergaragil at gmail.com
Tue Mar 31 18:09:18 CEST 2015
Dear Marc
Thanks for your support, I will study these recommendations and let
you know as soon as I get something new.
Regards
Alex
2015-03-31 11:22 GMT-04:00, Marc Verderi <verderi at in2p3.fr>:
> Dear Alex,
>
> Thanks for your work and message. I see you got the bulk of the MT.
> I put below several things to look at / consider. Please note that I
> know almost nothing about the Gate code itself...
>
> To summarize the issues, the "pCallbackMan" and "recorder"
> arguments in the action initialization need some design consideration as
> threads will very likely conflict on these objects. I would guess that
> the most significant issues will be here. Please see below for more
> details.
>
> I'll be happy to help more if I can, or involve some of the G4
> experts on MT if needed !
>
> Cheers,
> Marc
>
>
> o The lines:
> G4int nThreads = G4Threading::G4GetNumberOfCores();
> runManager->SetNumberOfThreads(nThreads); // Is equal to 2 by default
>
> are correct. Please note that you may treat this number as the maximum
> number of threads rather than as the default.
> For debugging purposes, I would suggest starting with 2 cores only
> and, once the case looks clean, increasing the number of cores.
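
A minimal sketch of the "core count as an upper bound" advice, written with the standard library only so that it builds without Geant4. `ChooseThreadCount` and its parameters are hypothetical names introduced for illustration; in the patch, the detected value would come from `G4Threading::G4GetNumberOfCores()` and the result would be passed to `SetNumberOfThreads()`.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper (not part of Geant4 or GATE): clamp a requested
// thread count to the range [1, available cores].
// `available` stands in for what G4Threading::G4GetNumberOfCores() reports.
inline int ChooseThreadCount(int requested, int available)
{
  // A core-count query may report 0 when the value is unknown.
  if (available < 1) available = 1;
  return std::max(1, std::min(requested, available));
}

// Convenience overload that asks the standard library for the core count.
inline int ChooseThreadCount(int requested)
{
  return ChooseThreadCount(
      requested, static_cast<int>(std::thread::hardware_concurrency()));
}
```

With such a helper, a debugging session could request 2 threads and a production run could request more, without ever exceeding what the machine offers.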
>
> o The line:
> runManager->SetUserInitialization( GatePhysicsList::GetInstance() );
>
> looks correct to me.
> One question is whether Gate has implemented "home-made physics
> processes" (G4VProcess) in this physics list. If so, they should comply
> with the new G4VProcess interface, which has methods for the MT case.
>
> o There are several lines with potential problems (I gather together
> lines concerned):
> // Set the Basic ROOT Output
> GateRecorderBase* myRecords = 0;
>
> --> ** ROOT is not thread safe ! ** For this reason, Geant4
> provides, in the "analysis" package, many, but not all, of the ROOT
> functionalities to create histograms and trees. The histograms are
> filled in each thread, and their contents are merged at the end of the
> job. The trees are dumped individually by the threads (not merged) and
> should be analyzed using a chain.
> As a first stage, I would recommend switching off the recording to
> get the rest of the machinery right, and only then adding the output
> functionalities back.
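
The "fill per thread, merge at the end" scheme described above can be sketched with the standard library alone. This is only an illustration of the pattern under simplifying assumptions: `Histogram` and `RunAndMerge` are hypothetical stand-ins, not classes from the Geant4 "analysis" package or from ROOT.

```cpp
#include <array>
#include <cstddef>
#include <thread>
#include <vector>

// Toy fixed-bin histogram; a stand-in for illustration only.
struct Histogram {
  std::array<long, 4> bins{};  // counts, zero-initialised

  void Fill(std::size_t bin) { ++bins[bin % bins.size()]; }

  void Merge(const Histogram& other) {
    for (std::size_t i = 0; i < bins.size(); ++i) bins[i] += other.bins[i];
  }
};

// Each worker fills its own histogram, so nothing is shared while the
// "events" are processed. The per-thread results are merged once, on the
// calling thread, after every worker has joined.
inline Histogram RunAndMerge(int nThreads, int eventsPerThread)
{
  std::vector<Histogram> perThread(static_cast<std::size_t>(nThreads));
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([&perThread, t, eventsPerThread] {
      Histogram& mine = perThread[static_cast<std::size_t>(t)];
      for (int e = 0; e < eventsPerThread; ++e)
        mine.Fill(static_cast<std::size_t>(e));
    });
  }
  for (auto& w : workers) w.join();

  Histogram total;
  for (const auto& h : perThread) total.Merge(h);
  return total;
}
```

The design choice is that no lock is needed during the event loop: the only point where data from different threads meet is the merge, which runs after all threads have finished.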
>
>
> // Set the users actions to handle callback for actors - before the initialisation
> GateUserActions* myActions = new GateUserActions( runManager, myRecords );
> runManager->SetUserInitialization( new GateActionInitialization( myActions, myRecords ) );
> and the constructor:
> GateActionInitialization(GateUserActions * cbm, GateRecorderBase * r);
> with the lines in the GateActionInitialization class that use the
> arguments "pCallbackMan, recorder", especially in the Build() method.
>
> --> very likely this will not work. 'pCallbackMan' and 'recorder'
> are single objects, shared among the threads because of the way they are
> created and passed to the action initialization. What will happen is
> that they will be messaged at arbitrary times by the threads during the
> event loop: thread 1 calls method a() and, while a() is being processed,
> method b() is called by thread 2, and thread 3 calls a() again while
> thread 1 is still inside it. If data members are changed inside these
> methods, the result is unpredictable behavior. The recorder, which I
> understand is the ROOT-based class, should be redesigned using
> "analysis" to avoid these conflicts, and one instance of it (a priori)
> should be made per thread to make the recording independent among the
> threads.
> For pCallbackMan, I admit my ignorance. It looks like a
> configuration class that is read-only at that point. Is this
> correct ? If so, this should not be too problematic. But certainly, some
> iteration is needed here.
>
> o At first sight, the rest looks fine to me. One comment is that the
> "action initialisation" mechanism works also for the usual G4RunManager,
> so that some #ifdef ... #endif could be removed. In the G4RunManager
> case, the BuildForMaster() is ignored.
>
>
>
>
> On 03/31/2015 02:57 PM, Alex Vergara Gil wrote:
>> Dear All
>>
>> I have managed to create a patch that makes GATE use G4MTRunManager.
>> It compiles and runs fine, but it does not run in several threads.
>> I need somebody to guide me in the right direction.
>>
>> Dear Marc
>>
>> Thanks a lot for your suggestions; they helped me a lot in creating
>> this patch. Could you or some other G4 member take a look at it and
>> see what is happening here?
>>
>> Regards
>> Alex
>>
>> PS: Dear Marc, sorry for mailing you twice, I forgot to tick the
>> reply-to-all option.
>>
>> 2015-03-26 9:31 GMT-04:00, Marc Verderi <verderi at in2p3.fr>:
>>> Dear All,
>>>
>>> The interest of G4MTRunManager is that the geometry and the
>>> cross-section tables are shared among the threads. For big applications
>>> -and simulation of phantom irradiation is one example- this represents a
>>> large memory. For machines with many cores, spawning N jobs of such an
>>> application may exhaust the memory, preventing the use of all the
>>> available cores. By sharing geometry and cross-section tables, the
>>> G4MTRunManager saves a large fraction of memory, allowing many more
>>> cores to be used. Some
>>> tests have been done by Geant4, on Xeon Phi, see for example
>>> https://twiki.cern.ch/twiki/bin/view/Geant4/MultiThreadingTaskForce#CPU_and_Memory_Performances
>>> and one single application of high energy physics type (simplified CMS
>>> simulation) could run smoothly with 240 threads, the maximum available
>>> (the machine has 60 user cores, up to 4 threads/core). Without MT, just
>>> spawning jobs, only ~30 jobs could have been run in parallel, leaving 30
>>> cores unoccupied, because of lack of memory !
>>>
>>> Moving to multi-threading has some constraints. Each thread
>>> processes a bunch of events. Events are hence generated and processed in
>>> parallel, independently. This means that primary generator action, event
>>> action, stepping action have to have independent instances in each
>>> thread. This is the very purpose of the new class
>>> G4VUserActionInitialization : the method "Build()" is called for each
>>> thread, to instantiate in each of these the above actions. For the run
>>> action it is a bit more complicated : a run action may be for the entire
>>> application, or may be for each thread. For an "all application" action,
>>> BuildForMaster() has to be used.
>>> This independence of threads has a similar impact on sensitive
>>> detectors : for these, the G4VUserDetectorConstruction class has a new
>>> method : ConstructSDandField(). Again, this method is called for each
>>> thread, so that sensitive detectors and fields live independent lives in
>>> the various threads.
>>> This looks like quite a lot of work, but it is not that heavy in practice.
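
The division of labour between Build() and BuildForMaster() can be illustrated with a toy model that builds without Geant4. Everything below is a hypothetical stand-in: `EventAction`, `RunAction`, `ActionInitialization`, `RunSummary` and `RunWorkers` only mimic the roles of the Geant4 classes and are not their real interfaces.

```cpp
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Stand-ins for illustration only; these are NOT the Geant4 classes.
struct EventAction {
  int eventsSeen = 0;
  void EndOfEvent() { ++eventsSeen; }
};
struct RunAction {
  int runsSeen = 0;
  void EndOfRun() { ++runsSeen; }
};

// Mirrors the idea of G4VUserActionInitialization: Build() creates the
// per-thread actions, BuildForMaster() creates the application-wide one.
struct ActionInitialization {
  std::unique_ptr<EventAction> Build() const {
    return std::unique_ptr<EventAction>(new EventAction);
  }
  std::unique_ptr<RunAction> BuildForMaster() const {
    return std::unique_ptr<RunAction>(new RunAction);
  }
};

struct RunSummary {
  int masterRuns = 0;                // counted by the master's RunAction
  std::vector<int> eventsPerWorker;  // counted by each worker's EventAction
};

// Toy "run manager": Build() is called once inside every worker thread, so
// no EventAction instance is ever touched by two threads.
inline RunSummary RunWorkers(const ActionInitialization& init,
                             int nThreads, int eventsPerThread)
{
  RunSummary summary;
  summary.eventsPerWorker.assign(static_cast<std::size_t>(nThreads), 0);
  std::unique_ptr<RunAction> master = init.BuildForMaster();

  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([&init, &summary, t, eventsPerThread] {
      std::unique_ptr<EventAction> action = init.Build();  // thread-private
      for (int e = 0; e < eventsPerThread; ++e) action->EndOfEvent();
      summary.eventsPerWorker[static_cast<std::size_t>(t)] = action->eventsSeen;
    });
  }
  for (auto& w : workers) w.join();

  master->EndOfRun();
  summary.masterRuns = master->runsSeen;
  return summary;
}
```

Each worker writes only to its own slot of `eventsPerWorker`, and the master reads the slots after joining, which is the same "independent instances per thread" rule applied to the bookkeeping.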
>>>
>>> In practice, what you also have to take care of in your code are
>>> "static" variables : at each occurrence of a static variable, you have
>>> to decide whether this variable has to be common to the entire
>>> application -a "true" static-, or whether it is common to the thread
>>> only : a "thread local" static. In most cases, static variables are
>>> static to the thread.
>>> For the case of a true "static", be aware that this means that each
>>> thread may access the variable at any time. If this variable is both
>>> read and written during the processing, it will behave quite
>>> unpredictably, and this is a source of debugging headaches ;) . Any
>>> random crash -these are often non-reproducible from one run to the
>>> next- is a sign of this sort of conflict.
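
The difference between a "true" static and a "thread local" static can be shown in a few lines of standard C++11. This is a sketch with hypothetical names (`gTotalEvents`, `tLocalEvents`, `CountEvents`); it uses the plain `thread_local` keyword, and Geant4 code would normally use its own `G4ThreadLocal` macro for the same purpose.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// A "true" static: one copy for the whole application. Every thread writes
// to it, so it is atomic here; a plain int would be a data race.
static std::atomic<int> gTotalEvents{0};

// A "thread local" static: each thread gets its own independent copy,
// initialised to 0 when the thread first uses it.
static thread_local int tLocalEvents = 0;

// Runs nThreads workers and returns the final value each worker saw in its
// own thread-local counter.
inline std::vector<int> CountEvents(int nThreads, int eventsPerThread)
{
  std::vector<int> seen(static_cast<std::size_t>(nThreads), 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([&seen, t, eventsPerThread] {
      for (int e = 0; e < eventsPerThread; ++e) {
        ++tLocalEvents;  // private to this thread, no synchronisation needed
        ++gTotalEvents;  // shared by all threads, atomic increment
      }
      seen[static_cast<std::size_t>(t)] = tLocalEvents;
    });
  }
  for (auto& w : workers) w.join();
  return seen;
}
```

Every worker ends with its thread-local counter equal to its own number of events, while the shared counter accumulates the sum over all workers.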
>>>
>>> Most of the G4 examples (basic, extended) are provided in MT mode,
>>> and are good starting points.
>>>
>>> Hope this helps.
>>>
>>> Cheers,
>>> Marc (a G4 member)
>>>
>>>
>>> On 03/26/2015 01:02 PM, Alex Vergara Gil wrote:
>>>> Dear All
>>>>
>>>> I started this thread to bring together all the enthusiastic people who
>>>> want to add G4MTRunManager support to GATE. The advantages of a
>>>> multi-threading run manager are obvious, but I will list them here
>>>> anyway. I will also send you my first patch and describe the problems
>>>> I am facing.
>>>>
>>>> Advantages
>>>> 1. You will not depend on external cluster software to run on a
>>>> single multi-CPU PC.
>>>> 2. Simulation time should decrease roughly in proportion to the number
>>>> of CPUs used.
>>>> 3. You don't need to merge the outputs, since this is performed
>>>> automatically.
>>>> 4. Any others you may want to add.
>>>>
>>>> My first patch
>>>>
>>>> <start of the code>
>>>>
>>>> Author: Alex Vergara Gil <alexvergaragil at gmail.com> 2015-03-25 17:13:45
>>>> Committer: Alex Vergara Gil <alexvergaragil at gmail.com> 2015-03-25 17:13:45
>>>> Parent: db6875e64d60ad1e0f2d100c496843632acb23c8 (Merge https://github.com/OpenGATE/Gate)
>>>> Child: 28c338cd3263108df3927db14c6975f4cdcc31b4 (Added the UserActionInitialization)
>>>> Branch: partopc
>>>> Follows:
>>>> Precedes:
>>>>
>>>> trying g4mtRunManager
>>>>
>>>> ------------------- source/general/include/GateRunManager.hh
>>>> -------------------
>>>> index c4164d9..b72327b 100644
>>>> @@ -28,12 +28,19 @@
>>>> #define GateRunManager_h 1
>>>>
>>>> #include "G4RunManager.hh"
>>>> +#ifdef G4MULTITHREADED
>>>> + #include "G4MTRunManager.hh"
>>>> +#endif
>>>> #include "GateHounsfieldToMaterialsBuilder.hh"
>>>>
>>>> class GateRunManagerMessenger;
>>>> class GateDetectorConstruction;
>>>>
>>>> +#ifdef G4MULTITHREADED
>>>> +class GateRunManager : public G4MTRunManager
>>>> +#else
>>>> class GateRunManager : public G4RunManager
>>>> +#endif
>>>> {
>>>> public:
>>>> //! Constructor
>>>> @@ -60,8 +67,11 @@ public:
>>>>
>>>> //! Return the instance of the run manager
>>>> static GateRunManager* GetRunManager()
>>>> + #ifdef G4MULTITHREADED
>>>> + { return dynamic_cast<GateRunManager*>(G4MTRunManager::GetRunManager()); }
>>>> + #else
>>>> { return dynamic_cast<GateRunManager*>(G4RunManager::GetRunManager()); }
>>>> -
>>>> + #endif
>>>> bool GetGlobalOutputFlag() { return mGlobalOutputFlag; }
>>>> void EnableGlobalOutput(bool b) { mGlobalOutputFlag = b; }
>>>> void SetUserPhysicList(G4VUserPhysicsList * m) { mUserPhysicList = m; }
>>>>
>>>> --------------------- source/general/src/GateRunManager.cc
>>>> ---------------------
>>>> index 2604e47..75b3fb5 100644
>>>> @@ -8,6 +8,9 @@
>>>>
>>>>
>>>> #include "GateRunManager.hh"
>>>> +#ifdef G4MULTITHREADED
>>>> + #include "G4MTRunManager.hh"
>>>> +#endif
>>>> #include "GateDetectorConstruction.hh"
>>>> #include "GateRunManagerMessenger.hh"
>>>> #include "GateHounsfieldToMaterialsBuilder.hh"
>>>> @@ -27,7 +30,11 @@
>>>> #endif
>>>>
>>>>
>>>> //----------------------------------------------------------------------------------------
>>>> +#ifdef G4MULTITHREADED
>>>> +GateRunManager::GateRunManager():G4MTRunManager()
>>>> +#else
>>>> GateRunManager::GateRunManager():G4RunManager()
>>>> +#endif
>>>> {
>>>> pMessenger = new GateRunManagerMessenger(this);
>>>> mHounsfieldToMaterialsBuilder = new GateHounsfieldToMaterialsBuilder();
>>>> @@ -112,7 +119,11 @@ void GateRunManager::InitializeAll()
>>>>
>>>> G4ProductionCutsTable::GetProductionCutsTable()->GetHighEdgeEnergy());
>>>>
>>>> // Initialization
>>>> +#ifdef G4MULTITHREADED
>>>> + G4MTRunManager::SetUserInitialization(mUserPhysicList);
>>>> +#else
>>>> G4RunManager::SetUserInitialization(mUserPhysicList);
>>>> +#endif
>>>>
>>>> //To take into account the user cuts (steplimiter and special cuts)
>>>> #if (G4VERSION_MAJOR > 9)
>>>> @@ -126,7 +137,11 @@ void GateRunManager::InitializeAll()
>>>> } // End if (mUserPhysicListName != "")
>>>>
>>>> // InitializePhysics
>>>> +#ifdef G4MULTITHREADED
>>>> + G4MTRunManager::InitializePhysics();
>>>> +#else
>>>> G4RunManager::InitializePhysics();
>>>> +#endif
>>>>
>>>> // Take into account the em option set by the user (dedx bin etc)
>>>> GatePhysicsList::GetInstance()->SetEmProcessOptions();
>>>> @@ -169,7 +184,11 @@ void GateRunManager::InitGeometryOnly()
>>>> if (!geometryInitialized)
>>>> {
>>>> GateMessage("Core", 1, "Initialization of geometry" << G4endl);
>>>> +#ifdef G4MULTITHREADED
>>>> + G4MTRunManager::InitializeGeometry();
>>>> +#else
>>>> G4RunManager::InitializeGeometry();
>>>> +#endif
>>>> }
>>>> else
>>>> {
>>>> @@ -189,7 +208,11 @@ void GateRunManager::InitGeometryOnly()
>>>>
>>>> //----------------------------------------------------------------------------------------
>>>> void GateRunManager::InitPhysics()
>>>> {
>>>> + #ifdef G4MULTITHREADED
>>>> + G4MTRunManager::InitializePhysics();
>>>> +#else
>>>> G4RunManager::InitializePhysics();
>>>> +#endif
>>>> }
>>>>
>>>> //----------------------------------------------------------------------------------------
>>>>
>>>> @@ -205,7 +228,11 @@ void GateRunManager::RunInitialization()
>>>>
>>>> // GateMessage("Core", 0, "Initialization of the run " << G4endl);
>>>> // Perform a regular initialisation
>>>> + #ifdef G4MULTITHREADED
>>>> + G4MTRunManager::RunInitialization();
>>>> +#else
>>>> G4RunManager::RunInitialization();
>>>> +#endif
>>>>
>>>> // Initialization of the atom deexcitation processes
>>>> // must be done after all other initialization
>>>>
>>>> </end of the code>
>>>>
>>>> This patch compiles without any special warnings; however, when I try
>>>> to run it, it aborts with the following message:
>>>>
>>>> <start of message>
>>>> [G4]
>>>> [G4] *************************************************************
>>>> [G4] Geant4 version Name: geant4-10-01 [MT] (5-December-2014)
>>>> [G4] << in Multi-threaded mode >>
>>>> [G4] Copyright : Geant4 Collaboration
>>>> [G4] Reference : NIM A 506 (2003), 250-303
>>>> [G4] WWW : http://cern.ch/geant4
>>>> [G4] *************************************************************
>>>> [G4]
>>>> [G4-cerr]
>>>> -------- EEEE ------- G4Exception-START -------- EEEE -------
>>>> *** G4Exception : Run3011
>>>> issued by : G4MTRunManager::SetUserAction()
>>>> For multi-threaded version, define G4UserEventAction in
>>>> G4VUserActionInitialization.
>>>> *** Fatal Exception *** core dump ***
>>>> -------- EEEE -------- G4Exception-END --------- EEEE -------
>>>>
>>>> [G4-cerr]
>>>> [G4-cerr] *** G4Exception: Aborting execution ***
>>>> Aborted (core dumped)
>>>> </end of message>
>>>>
>>>> So I wonder whether any of you has ever faced this situation and could
>>>> help me.
>>>>
>>>> Best Regards
>>>> Alex
>>>> _______________________________________________
>>>> Gate-users mailing list
>>>> Gate-users at lists.opengatecollaboration.org
>>>> http://lists.opengatecollaboration.org/mailman/listinfo/gate-users