<div dir="ltr">Hello Marc and Alex, <div><br></div><div>First, thanks to Alex for taking on this challenge, and thanks also to Marc for the advice!</div><div><br></div><div>As I said before, we currently have no resources to do this job within the collaboration, but we are following what you are doing. </div><div><br></div><div>Good luck!</div><div>David</div><div>PS: I highly recommend creating a new branch in your git repository to keep track of all your changes. It will be the only proper way to integrate your code later. </div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Apr 1, 2015 at 9:49 AM, Marc Verderi <span dir="ltr"><<a href="mailto:verderi@in2p3.fr" target="_blank">verderi@in2p3.fr</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Alex,<br>
<br>
The part you wrote in the GateActionInitialization class is fine to me. I suspect that the problem (please remember I don't know the GATE code) may come from:<span class=""><br>
<br>
new GateUserActions( runManager, myRecords );<br>
<br></span>
Given that the runManager is passed to the class, might it set the event action internally, using the non-MT methods?<br>
<br>
Cheers,<br>
Marc<div class="HOEnZb"><div class="h5"><br>
<br>
On 03/31/2015 08:55 PM, Alex Vergara Gil wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Dear Marc<br>
<br>
I've managed to update a few things; however, the same message still appears<br>
when running Gate, and I can't manage to get rid of it.<br>
[G4-cerr]<br>
-------- EEEE ------- G4Exception-START -------- EEEE -------<br>
*** G4Exception : Run3011<br>
issued by : G4MTRunManager::SetUserAction()<br>
For multi-threaded version, define G4UserEventAction in<br>
G4VUserActionInitialization.<br>
*** Fatal Exception *** core dump ***<br>
-------- EEEE -------- G4Exception-END --------- EEEE -------<br>
<br>
[G4-cerr]<br>
[G4-cerr] *** G4Exception: Aborting execution ***<br>
Aborted (core dumped)<br>
<br>
I suspect something is not initialized properly<br>
Regards<br>
<br>
Alex<br>
<br>
2015-03-31 12:09 GMT-04:00, Alex Vergara Gil <<a href="mailto:alexvergaragil@gmail.com" target="_blank">alexvergaragil@gmail.com</a>>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Dear Marc<br>
<br>
Thanks for your support, I will study these recommendations and let<br>
you know as soon as I get something new.<br>
<br>
Regards<br>
Alex<br>
<br>
2015-03-31 11:22 GMT-04:00, Marc Verderi <<a href="mailto:verderi@in2p3.fr" target="_blank">verderi@in2p3.fr</a>>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Dear Alex,<br>
<br>
Thanks for your work and message. I see you have got the bulk of the MT work done.<br>
I list below several things to look at or consider. Please note that I know<br>
almost nothing about the Gate code itself...<br>
<br>
To summarize the issues: the "pCallbackMan" and "recorder"<br>
arguments in the action initialization need some design consideration, as<br>
threads will very likely conflict on these objects. I would guess that<br>
the most significant issues will be here. Please see below for more<br>
details.<br>
<br>
I'll be happy to help more if I can, or involve some of the G4<br>
experts on MT if needed !<br>
<br>
Cheers,<br>
Marc<br>
<br>
<br>
o The lines:<br>
G4int nThreads = G4Threading::G4GetNumberOfCores();<br>
runManager->SetNumberOfThreads(nThreads); // Is equal to 2 by default<br>
<br>
are correct. Please note that you may use this number as the maximum<br>
number of threads, not as the default.<br>
For debugging purposes, I would suggest starting with 2 threads only<br>
and, once that case looks clean, increasing the number of threads.<br>
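<br>
A minimal sketch of that advice in plain C++ (no Geant4 needed): the core count is treated as an upper bound and the requested value is clamped to it. The function name ChooseThreadCount and the "requested" parameter are hypothetical, introduced only for illustration.<br>

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Hypothetical helper: pick a worker-thread count that is at least 1 and
// never larger than the number of cores available on the machine.
int ChooseThreadCount(int requested, int availableCores) {
  assert(availableCores >= 0);
  // hardware_concurrency() is allowed to report 0 when the value is unknown.
  const int cores = std::max(1, availableCores);
  return std::min(std::max(requested, 1), cores);
}

// Convenience overload that queries the machine itself.
int ChooseThreadCount(int requested) {
  return ChooseThreadCount(
      requested, static_cast<int>(std::thread::hardware_concurrency()));
}
```

The result would then be the value handed to runManager->SetNumberOfThreads(...); for a first debugging pass, the requested value would simply be 2.<br>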
<br>
o The line:<br>
runManager->SetUserInitialization( GatePhysicsList::GetInstance() );<br>
<br>
looks correct to me.<br>
One question is whether Gate has implemented "home-made physics<br>
processes" (G4VProcess) in this physics list. If so, they should comply<br>
with the new G4VProcess interface, which has methods for the MT case.<br>
<br>
o There are several lines with potential problems (I have grouped the<br>
lines concerned together):<br>
// Set the Basic ROOT Output<br>
GateRecorderBase* myRecords = 0;<br>
<br>
--> ** ROOT is not thread safe ! ** For this reason, Geant4<br>
provides, in the "analysis" package, many, but not all, of the ROOT<br>
functionalities to create histograms and trees. The histograms are<br>
filled in each thread, and their contents are merged at the end of the<br>
job. The trees are dumped individually by the threads (not merged) and<br>
should be analyzed using a chain.<br>
As a first stage, I would recommend switching off the recording to<br>
get the rest of the machinery right, and then adding the output<br>
functionalities.<br>
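<br>
The "fill in each thread, merge at the end" idea can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: the Histogram struct and RunWorkers function are hypothetical stand-ins, not the Geant4 "analysis" classes, and the "events" are fake bin indices.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical fixed-bin histogram standing in for a per-thread analysis object.
struct Histogram {
  explicit Histogram(std::size_t nBins) : bins(nBins, 0) {}
  void Fill(std::size_t bin) {
    if (bin < bins.size()) ++bins[bin];
  }
  void Merge(const Histogram& other) {
    assert(other.bins.size() == bins.size());
    for (std::size_t i = 0; i < bins.size(); ++i) bins[i] += other.bins[i];
  }
  std::vector<long> bins;
};

// Each worker fills only its own histogram, so no locking is needed during
// the "event loop"; the merge happens once, after all workers have joined.
Histogram RunWorkers(int nThreads, int eventsPerThread, std::size_t nBins) {
  std::vector<Histogram> perThread(nThreads, Histogram(nBins));
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([&perThread, t, eventsPerThread, nBins] {
      for (int e = 0; e < eventsPerThread; ++e)
        perThread[t].Fill(static_cast<std::size_t>(e) % nBins);  // fake event
    });
  }
  for (auto& w : workers) w.join();

  Histogram total(nBins);
  for (const auto& h : perThread) total.Merge(h);
  return total;
}
```

The design choice is that shared state is touched only after join(), which is why the per-thread objects need no synchronization while events are being processed.<br>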
<br>
<br>
// Set the users actions to handle callback for actors - before the initialisation<br>
GateUserActions* myActions = new GateUserActions( runManager, myRecords );<br>
runManager->SetUserInitialization( new GateActionInitialization( myActions, myRecords ) );<br>
and the constructor:<br>
GateActionInitialization(GateUserActions * cbm, GateRecorderBase * r);<br>
with the lines in the GateActionInitialization class with arguments<br>
"pCallbackMan, recorder", especially in the Build() method.<br>
<br>
--> very likely this will not work. 'pCallbackMan' and 'recorder'<br>
are single objects, shared among the threads given the way they are created<br>
and passed to the action initialization. What will happen is that they<br>
will be messaged at arbitrary times by the threads during the event loop:<br>
thread 1 calls method a() and, while a() is being processed, method b()<br>
is called by thread 2, and thread 3 calls a() again while it is still being<br>
processed by thread 1. If data members are changed inside these methods,<br>
this will result in unpredictable behavior. The recorder, which I understand is the<br>
ROOT-based class, should be redesigned using "analysis" to avoid these<br>
conflicts, and one instance of it (a priori) should be made per thread to<br>
make the recording independent among the threads.<br>
For pCallbackMan, I admit my ignorance. It looks like a<br>
configuration class that is only read at that time. Is this<br>
correct? If so, this should not be too problematic. But certainly, some<br>
iteration is needed here.<br>
<br>
o At first sight, the rest looks fine to me. One comment is that the<br>
"action initialisation" mechanism works also for the usual G4RunManager,<br>
so that some #ifdef ... #endif could be removed. In the G4RunManager<br>
case, the BuildForMaster() is ignored.<br>
<br>
<br>
<br>
<br>
On 03/31/2015 02:57 PM, Alex Vergara Gil wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Dear All<br>
<br>
I have managed to create a patch that makes Gate use G4MTRunManager.<br>
It compiles fine and runs, but it does not run in<br>
several threads. I need somebody to guide me in the right direction.<br>
<br>
Dear Marc<br>
<br>
Thanks a lot for your suggestions; they helped me a lot in creating this<br>
patch. Could you or some other G4 member take a look at it and see what is<br>
happening here?<br>
<br>
Regards<br>
Alex<br>
<br>
PS: Dear Marc, sorry for mailing you twice; I forgot to use<br>
"reply to all".<br>
<br>
2015-03-26 9:31 GMT-04:00, Marc Verderi <<a href="mailto:verderi@in2p3.fr" target="_blank">verderi@in2p3.fr</a>>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Dear All,<br>
<br>
The interest of G4MTRunManager is that the geometry and the<br>
cross-section tables are shared among the threads. For big applications<br>
(simulation of phantom irradiation is one example) this represents a<br>
large amount of memory. For machines with many cores, spawning N jobs of such an<br>
application may exhaust the memory, preventing the use of all the available<br>
cores. By sharing geometry and cross-section tables, the G4MTRunManager<br>
saves a large fraction of memory, allowing many more cores to be used. Some<br>
tests have been done by Geant4 on Xeon Phi; see for example<br>
<a href="https://twiki.cern.ch/twiki/bin/view/Geant4/MultiThreadingTaskForce#CPU_and_Memory_Performances" target="_blank">https://twiki.cern.ch/twiki/bin/view/Geant4/MultiThreadingTaskForce#CPU_and_Memory_Performances</a><br>
A single application of high energy physics type (simplified CMS<br>
simulation) could run smoothly with 240 threads, the maximum available<br>
(the machine has 60 user cores, up to 4 threads/core). Without MT, just<br>
spawning jobs, only ~30 jobs could have been run in parallel, leaving 30<br>
cores unoccupied because of lack of memory!<br>
<br>
Moving to multi-threading has some constraints. Each thread<br>
processes a bunch of events. Events are hence generated and processed in<br>
parallel, independently. This means that the primary generator action, event<br>
action and stepping action must have independent instances in each<br>
thread. This is the very purpose of the new class<br>
G4VUserActionInitialization: the method "Build()" is called for each<br>
thread, to instantiate the above actions in each of them. For the run<br>
action it is a bit more complicated: a run action may be for the entire<br>
application, or may be for each thread. For an "all application" action,<br>
BuildForMaster() has to be used.<br>
This independence of threads has a similar impact on sensitive<br>
detectors: for these, the G4VUserDetectorConstruction class has a new<br>
method, ConstructSDandField(). Again, this method is called for each<br>
thread, so that sensitive detectors and fields live independent lives in<br>
the various threads.<br>
This looks like quite a lot of work, but it is not that heavy in practice.<br>
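<br>
The "one Build() call per thread" pattern can be illustrated without Geant4. In the sketch below, EventAction, ActionInitialization and ProcessEvents are hypothetical stand-ins that only mimic the shape of the mechanism; they are not the Geant4 classes and make no claim about how Geant4 implements it.<br>

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for a user event action with per-thread state.
struct EventAction {
  long eventsSeen = 0;  // owned by one thread, never shared
  void EndOfEvent() { ++eventsSeen; }
};

// Hypothetical stand-in for an action initialization: Build() is called once
// per worker thread and returns a fresh action owned by that thread.
struct ActionInitialization {
  std::unique_ptr<EventAction> Build() const {
    return std::make_unique<EventAction>();
  }
};

// Spawns the workers; each one builds its own action and processes its share
// of events. Returns the number of events seen by each thread.
std::vector<long> ProcessEvents(const ActionInitialization& init,
                                int nThreads, int eventsPerThread) {
  std::vector<long> counts(nThreads, 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([&init, &counts, t, eventsPerThread] {
      auto action = init.Build();  // independent instance for this thread
      for (int e = 0; e < eventsPerThread; ++e) action->EndOfEvent();
      counts[t] = action->eventsSeen;  // each thread writes only its own slot
    });
  }
  for (auto& w : workers) w.join();
  return counts;
}
```

Because every thread owns its action, the counters never interfere with each other; passing one pre-built action object to all threads instead would reintroduce the sharing problem.<br>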
<br>
In practice also, what has to be taken care of in your code are<br>
"static" variables: at each occurrence of a static variable, you have to<br>
think about whether this variable has to be common to the entire application (a<br>
"true" static), or whether it is common to the thread only (a "thread local"<br>
static). In most cases, static variables are static to the thread.<br>
For the case of a true "static", be aware that this means that each<br>
thread may access the variable at any time. If this variable is both read and<br>
written during the processing, it will behave quite unpredictably,<br>
and this is a source of debugging headaches ;) . Any random<br>
crash (these are often non-reproducible between two runs) is a sign of<br>
this sort of conflict.<br>
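<br>
The difference between the two kinds of static can be sketched with standard C++ alone. All names below (gTotalCalls, tLocalCalls, Touch, CountCalls) are hypothetical; the shared counter is made atomic here as one possible way to keep concurrent writes well defined.<br>

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// "True" static: a single copy for the whole application. Every thread writes
// to it, so it is atomic here to keep the increments well defined.
std::atomic<long> gTotalCalls{0};

// "Thread local" static: each thread gets its own independent copy.
thread_local long tLocalCalls = 0;

void Touch() {
  ++gTotalCalls;
  ++tLocalCalls;
}

struct Counts {
  long sharedTotal;
  std::vector<long> perThread;
};

// Runs Touch() from several threads and reports what each kind of variable saw.
Counts CountCalls(int nThreads, int callsPerThread) {
  gTotalCalls = 0;
  std::vector<long> perThread(nThreads, 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    workers.emplace_back([&perThread, t, callsPerThread] {
      for (int i = 0; i < callsPerThread; ++i) Touch();
      perThread[t] = tLocalCalls;  // this thread's private copy
    });
  }
  for (auto& w : workers) w.join();
  return Counts{gTotalCalls.load(), perThread};
}
```

With 4 threads and 1000 calls each, the shared counter ends at 4000 while every thread-local copy ends at 1000. A plain non-atomic shared long in place of gTotalCalls would be a data race, which is the kind of conflict described above.<br>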
<br>
Most of the G4 examples (basic, extended) are provided in MT mode,<br>
and are good starting points.<br>
<br>
Hope this helps.<br>
<br>
Cheers,<br>
Marc (a G4 member)<br>
<br>
<br>
On 03/26/2015 01:02 PM, Alex Vergara Gil wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Dear All<br>
<br>
I started this thread to bring together all the enthusiastic people who want to<br>
add G4MTRunManager support to GATE. The advantages of a multi-threaded<br>
run manager are obvious, but I will explain them here anyway. I<br>
will also send you my first patch and the problems I am facing.<br>
<br>
Advantages<br>
1. You will not depend on external cluster software to run on a<br>
single multi-CPU PC.<br>
2. The simulation time should decrease roughly in proportion to the number of<br>
CPUs.<br>
3. You don't need to merge the outputs, since this is performed<br>
automatically.<br>
4. Any others you may add.<br>
<br>
My first patch<br>
<br>
<start of the code><br>
<br>
Author: Alex Vergara Gil <<a href="mailto:alexvergaragil@gmail.com" target="_blank">alexvergaragil@gmail.com</a>> 2015-03-25 17:13:45<br>
Committer: Alex Vergara Gil <<a href="mailto:alexvergaragil@gmail.com" target="_blank">alexvergaragil@gmail.com</a>> 2015-03-25 17:13:45<br>
Parent: db6875e64d60ad1e0f2d100c496843632acb23c8 (Merge <a href="https://github.com/OpenGATE/Gate" target="_blank">https://github.com/OpenGATE/Gate</a>)<br>
Child: 28c338cd3263108df3927db14c6975f4cdcc31b4 (Added the UserActionInitialization)<br>
Branch: partopc<br>
Follows:<br>
Precedes:<br>
<br>
trying g4mtRunManager<br>
<br>
------------------- source/general/include/<u></u>GateRunManager.hh<br>
-------------------<br>
index c4164d9..b72327b 100644<br>
@@ -28,12 +28,19 @@<br>
#define GateRunManager_h 1<br>
<br>
#include "G4RunManager.hh"<br>
+#ifdef G4MULTITHREADED<br>
+ #include "G4MTRunManager.hh"<br>
+#endif<br>
#include "<u></u>GateHounsfieldToMaterialsBuild<u></u>er.hh"<br>
<br>
class GateRunManagerMessenger;<br>
class GateDetectorConstruction;<br>
<br>
+#ifdef G4MULTITHREADED<br>
+class GateRunManager : public G4MTRunManager<br>
+#else<br>
class GateRunManager : public G4RunManager<br>
+#endif<br>
{<br>
public:<br>
//! Constructor<br>
@@ -60,8 +67,11 @@ public:<br>
<br>
//! Return the instance of the run manager<br>
static GateRunManager* GetRunManager()<br>
+ #ifdef G4MULTITHREADED<br>
+ { return dynamic_cast<GateRunManager*>(G4MTRunManager::GetRunManager()); }<br>
+ #else<br>
{ return dynamic_cast<GateRunManager*>(G4RunManager::GetRunManager()); }<br>
-<br>
+ #endif<br>
bool GetGlobalOutputFlag() { return mGlobalOutputFlag; }<br>
void EnableGlobalOutput(bool b) { mGlobalOutputFlag = b; }<br>
void SetUserPhysicList(G4VUserPhysicsList * m) { mUserPhysicList = m; }<br>
<br>
--------------------- source/general/src/<u></u>GateRunManager.cc<br>
---------------------<br>
index 2604e47..75b3fb5 100644<br>
@@ -8,6 +8,9 @@<br>
<br>
<br>
#include "GateRunManager.hh"<br>
+#ifdef G4MULTITHREADED<br>
+ #include "G4MTRunManager.hh"<br>
+#endif<br>
#include "GateDetectorConstruction.hh"<br>
#include "GateRunManagerMessenger.hh"<br>
#include "<u></u>GateHounsfieldToMaterialsBuild<u></u>er.hh"<br>
@@ -27,7 +30,11 @@<br>
#endif<br>
<br>
<br>
//----------------------------<u></u>------------------------------<u></u>------------------------------<br>
+#ifdef G4MULTITHREADED<br>
+GateRunManager::<u></u>GateRunManager():<u></u>G4MTRunManager()<br>
+#else<br>
GateRunManager::<u></u>GateRunManager():G4RunManager(<u></u>)<br>
+#endif<br>
{<br>
pMessenger = new GateRunManagerMessenger(this);<br>
mHounsfieldToMaterialsBuilder = new<br>
GateHounsfieldToMaterialsBuild<u></u>er();<br>
@@ -112,7 +119,11 @@ void GateRunManager::InitializeAll(<u></u>)<br>
<br>
G4ProductionCutsTable::<u></u>GetProductionCutsTable()-><u></u>GetHighEdgeEnergy());<br>
<br>
// Initialization<br>
+#ifdef G4MULTITHREADED<br>
+ G4MTRunManager::<u></u>SetUserInitialization(<u></u>mUserPhysicList);<br>
+#else<br>
G4RunManager::<u></u>SetUserInitialization(<u></u>mUserPhysicList);<br>
+#endif<br>
<br>
//To take into account the user cuts (steplimiter and special<br>
cuts)<br>
#if (G4VERSION_MAJOR > 9)<br>
@@ -126,7 +137,11 @@ void GateRunManager::InitializeAll(<u></u>)<br>
} // End if (mUserPhysicListName != "")<br>
<br>
// InitializePhysics<br>
+#ifdef G4MULTITHREADED<br>
+ G4MTRunManager::InitializePhysics();<br>
+#else<br>
G4RunManager::InitializePhysics();<br>
+#endif<br>
<br>
// Take into account the em option set by the user (dedx bin etc)<br>
GatePhysicsList::GetInstance()<u></u>->SetEmProcessOptions();<br>
@@ -169,7 +184,11 @@ void GateRunManager::<u></u>InitGeometryOnly()<br>
if (!geometryInitialized)<br>
{<br>
GateMessage("Core", 1, "Initialization of geometry" <<<br>
G4endl);<br>
+#ifdef G4MULTITHREADED<br>
+ G4MTRunManager::<u></u>InitializeGeometry();<br>
+#else<br>
G4RunManager::<u></u>InitializeGeometry();<br>
+#endif<br>
}<br>
else<br>
{<br>
@@ -189,7 +208,11 @@ void GateRunManager::<u></u>InitGeometryOnly()<br>
<br>
//----------------------------<u></u>------------------------------<u></u>------------------------------<br>
void GateRunManager::InitPhysics()<br>
{<br>
+ #ifdef G4MULTITHREADED<br>
+ G4MTRunManager::<u></u>InitializePhysics();<br>
+#else<br>
G4RunManager::<u></u>InitializePhysics();<br>
+#endif<br>
}<br>
<br>
//----------------------------<u></u>------------------------------<u></u>------------------------------<br>
<br>
@@ -205,7 +228,11 @@ void GateRunManager::<u></u>RunInitialization()<br>
<br>
// GateMessage("Core", 0, "Initialization of the run " <<<br>
G4endl);<br>
// Perform a regular initialisation<br>
+ #ifdef G4MULTITHREADED<br>
+ G4MTRunManager::<u></u>RunInitialization();<br>
+#else<br>
G4RunManager::<u></u>RunInitialization();<br>
+#endif<br>
<br>
// Initialization of the atom deexcitation processes<br>
// must be done after all other initialization<br>
<br>
</end of the code><br>
<br>
This patch compiles without any special warnings; however, when I try<br>
to run it, it crashes with the following message<br>
<br>
<start of message><br>
[G4]<br>
[G4] ******************************<u></u>******************************<u></u>*<br>
[G4] Geant4 version Name: geant4-10-01 [MT] (5-December-2014)<br>
[G4] << in Multi-threaded mode >><br>
[G4] Copyright : Geant4 Collaboration<br>
[G4] Reference : NIM A 506 (2003), 250-303<br>
[G4] WWW : <a href="http://cern.ch/geant4" target="_blank">http://cern.ch/geant4</a><br>
[G4] ******************************<u></u>******************************<u></u>*<br>
[G4]<br>
[G4-cerr]<br>
-------- EEEE ------- G4Exception-START -------- EEEE -------<br>
*** G4Exception : Run3011<br>
issued by : G4MTRunManager::SetUserAction()<br>
For multi-threaded version, define G4UserEventAction in<br>
G4VUserActionInitialization.<br>
*** Fatal Exception *** core dump ***<br>
-------- EEEE -------- G4Exception-END --------- EEEE -------<br>
<br>
[G4-cerr]<br>
[G4-cerr] *** G4Exception: Aborting execution ***<br>
Aborted (core dumped)<br>
</end of message><br>
<br>
So I wonder whether any of you have ever faced this situation and how you could<br>
help me.<br>
<br>
Best Regards<br>
Alex<br>
______________________________<u></u>_________________<br>
Gate-users mailing list<br>
<a href="mailto:Gate-users@lists.opengatecollaboration.org" target="_blank">Gate-users@lists.<u></u>opengatecollaboration.org</a><br>
<a href="http://lists.opengatecollaboration.org/mailman/listinfo/gate-users" target="_blank">http://lists.<u></u>opengatecollaboration.org/<u></u>mailman/listinfo/gate-users</a><br>
</blockquote>
<br>
</blockquote></blockquote>
<br>
</blockquote></blockquote></blockquote>
<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div dir="ltr">David Sarrut, Phd<br>Directeur de recherche CNRS<br>CREATIS, UMR CNRS 5220, Inserm U 1044<div>Centre de lutte contre le cancer Léon Bérard<br>28 rue Laënnec, 69373 Lyon cedex 08<br>Tel : 04 78 78 51 51 / 06 74 72 05 42<br><a href="http://www.creatis.insa-lyon.fr/~dsarrut" target="_blank">http://www.creatis.insa-lyon.fr/~dsarrut</a><br>_________________________________</div><div> "2 + 2 = 5, for extremely large values of 2"<br>_________________________________</div></div></div>
</div>