How to Reproduce Scaling Results #1184
Hi @DamynChipman, to be honest the bunny example is 8+ years old and relies on tetgen, which we no longer support. So after seeing your issue we decided to remove it from the code base. I understand that you want a benchmark program that you can run in order to see some scaling results for our core functionality? I recommend using a call like the one sketched below. This will build a 1 million element mesh; you can increase the refinement levels (`-l`/`-r`) to make it larger. Here is an overview of the options:
Here is a low dimensional example ("-l 2 -r 3") of the mesh that is created. Does this help? Please let me know if you have further questions.
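For reference, a hedged sketch of such a call. The executable path and flag set are copied from a later comment in this thread, and the low dimensional parameters "-l 2 -r 3" come from the example above; treat the exact values as placeholders rather than the officially recommended settings.

```bash
# Small test run of the t8_time_forest_partition benchmark
# (assumed install path; adjust to wherever your t8code binaries live).
# -l and -r control the refinement levels; larger values give (much) larger meshes.
mpirun -n 4 ~/install/t8code/main/bin/t8_time_forest_partition \
  -g -b -C 0.8 -x -0.4 -X -0.3 -l 2 -r 3 -L -o -T 0.025
```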
Yeah, this is super helpful, thanks! I will work through this and share results/issues along the way.
Just to confirm, this example requires OpenCascade, correct? After compiling, I ran:
I suggest using:
I want to double-check that it is running properly locally before moving to a cluster. The benchmark runs to completion, and I can mess around with the parameters to increase/decrease the number of elements and view the mesh. However, I don't know if the results are being reported properly. I have used:

I am working on the August 1st commit (62128c7) to avoid a compilation issue reported here (#1240). There are no functional differences between the two.

Here's the output of the following:
Am I right in seeing that the stats are not being accumulated and reported? Or are they being reported somewhere that I'm missing?
Just confirming that PR #1242 fixed the compilation issue, and I have rerun the same commands as above with the fix.
Small Scale Results

Device: 2021 MacBook Pro (CPU: M1 Pro, RAM: 32GB)
Comments

Running this on my laptop to confirm input parameters prior to submitting a larger batch job on the cluster. Note that I got the timing results reported above from the benchmark's own timing output.
Should I be running into memory issues with this benchmark? This runs just fine up to 32 MPI ranks but beyond that the benchmark terminates with the following:
Hey @DamynChipman, one part of your output might indicate that you just ran out of memory:
Maybe your parameter combination produced a mesh that was too large for the machine to handle?
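One quick way to test that hypothesis is to rerun with a smaller mesh and step the size back up. A hedged sketch, reusing the flags quoted elsewhere in this thread; only `-l`/`-r` are varied, and each extra uniform refinement level in 3D grows the element count by roughly a factor of 8:

```bash
# Assumed install path; adjust as needed.
BIN=~/install/t8code/main/bin/t8_time_forest_partition

# Start small, then increase -l/-r one step at a time at the rank count
# that failed, watching memory usage (e.g. with top/htop) as the mesh grows.
mpirun -n 64 $BIN -g -b -C 0.8 -x -0.4 -X -0.3 -l 3 -r 4 -L -o -T 0.025
mpirun -n 64 $BIN -g -b -C 0.8 -x -0.4 -X -0.3 -l 4 -r 5 -L -o -T 0.025
```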
@DamynChipman I tried to run this benchmark example on my machine with 64 ranks. It runs just fine.

mpirun -n 64 ~/install/t8code/main/bin/t8_time_forest_partition -g -b -C 0.8 -x -0.4 -X -0.3 -l 4 -r 5 -L -o -T 0.025

I use
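To turn a single run like this into a scaling study, one option is to repeat the same call over a range of rank counts and keep the timing output per run. A hedged sketch; the binary path and flags are copied from the command above, and `--oversubscribe` is only needed when requesting more ranks than physical cores (as on a laptop):

```bash
#!/usr/bin/env bash
# Simple strong-scaling sweep over MPI rank counts for the same mesh.
BIN=~/install/t8code/main/bin/t8_time_forest_partition
FLAGS="-g -b -C 0.8 -x -0.4 -X -0.3 -l 4 -r 5 -L -o -T 0.025"

for n in 1 2 4 8 16 32 64; do
  echo "=== $n ranks ==="
  mpirun --oversubscribe -n $n $BIN $FLAGS | tee run_${n}ranks.log
done
```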
Large(ish) Scale Results

Machine: Falcon (dual Intel Xeon 18-core nodes)
Results:
Comments

Turns out the issue was just the machine I was running on (it's a rather picky cluster that is currently being worked on as a decommissioned national lab computer). I was able to run up to 512 MPI ranks, with runs beyond that failing due to the machine configuration. For the purposes of reproducing this benchmark, I am satisfied and impressed with the speed and memory footprint for over 30M elements! Thank you for your guidance on reproducing this result!
Thank you for the update and the praise ;)
I am working on reproducing some scaling results on `t8code` for a JOSS review: openjournals/joss-reviews#6887. I noticed the `benchmarks` directory, including the `ExtremeScaling` directory with the bunny example. My first question is this: is that a good "out of the box" problem for me to test the scaling of `t8code` with? If so, I have some questions below on how to run it. If not, what examples would be good candidates to verify the scaling?

For the bunny example, I am running into issues reading in a tetgen file. I have never worked with them, so my issue could simply be lack of experience. I have the `bunny` executable built. In order to run it, I need a tetgen file of the bunny mesh. I got the Stanford bunny mesh data from here. I copied the file into the `benchmarks/ExtremeScaling` directory and ran the `t8_bunny` example, but to no avail.

Steps to reproduce

I am running this on a MacBook Pro (2021). Once I have it working, I'll repeat the process on the cluster I have access to. Each node of the cluster I will be running on has a 32-core AMD CPU and 4 NVIDIA GPUs. I will only be using the CPUs for this test.

1. Download the Stanford bunny zipped file from here and unzip it into `${downloads}`.
2. Copy `bun_zipper.ply` into the `ExtremeScaling` directory of the `t8code` source (and rename it to `bunny`).
3. Run the `bunny` benchmark.

Output from above:

The issue is the `bunny.ply` file not being the actual file I need. I appear to need a `bunny.node` file. How can I either generate that file from the Stanford bunny data, or get that file from elsewhere?
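For what it's worth, a hedged sketch of how the missing tetgen files might be generated from the Stanford PLY surface. The tetgen flags, output file names, and all paths are assumptions and were never verified against the (now removed) bunny example:

```bash
# Paths are placeholders; adjust to your download and t8code source locations.
cp ${downloads}/bun_zipper.ply path/to/t8code/benchmarks/ExtremeScaling/bunny.ply
cd path/to/t8code/benchmarks/ExtremeScaling

# tetgen's -p switch tetrahedralizes a surface mesh (PLC); with a .ply input
# it should write bunny.1.node / bunny.1.ele, which can then be renamed if the
# example expects bunny.node / bunny.ele (assumption).
tetgen -pq bunny.ply
mv bunny.1.node bunny.node
mv bunny.1.ele bunny.ele
```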