How to Reproduce Scaling Results #1184

Closed
DamynChipman opened this issue Jul 25, 2024 · 12 comments

@DamynChipman

I am working on reproducing some scaling results on t8code for a JOSS review: openjournals/joss-reviews#6887.

I noticed the benchmarks directory, including the ExtremeScaling directory with the bunny example. My first question is this: is that a good "out of the box" problem for me to test the scaling of t8code with? If so, I have some questions below on how to run it. If not, what examples would be good candidates to verify the scaling?

For the bunny example, I am running into issues reading in a tetgen file. I have never worked with them, so my issue could simply be lack of experience. I have the bunny executable built. In order to run it, I need a tetgen file of the bunny mesh. I got the Stanford bunny mesh data from here. I copied the file into the benchmarks/ExtremeScaling directory and ran the t8_bunny example but to no avail.

Steps to reproduce

I am running this on a MacBook Pro (2021). Once I have it working, I'll repeat the process on the cluster I have access to. Each node of the cluster I will be running on has a 32 core AMD CPU and 4 NVIDIA GPUs. I will only be using the CPUs for this test.

  1. Download Stanford bunny zipped file from here and unzip into ${downloads}.

  2. Copy the bun_zipper.ply into the ExtremeScaling directory of t8code source (and rename to bunny):

cd ${t8code-source}/benchmarks/ExtremeScaling
cp ${downloads}/bunny/reconstruction/bun_zipper.ply ./
mv bun_zipper.ply bunny.ply
  3. Run bunny benchmark:
mpirun -n 8 ./t8_bunny bunny

Output from above

>>> ./t8_bunny bunny
[libsc] This is libsc 2.8.5.406-2b20
[libsc] CPP                      
[libsc] CPPFLAGS                 
[libsc] CC                       mpicc
[libsc] CFLAGS                   -g -O2
[libsc] LDFLAGS                  
[libsc] LIBS                     -lz 
[p4est] This is p4est 2.8.6.23-7896
[p4est] CPP                      
[p4est] CPPFLAGS                 
[p4est] CC                       mpicc
[p4est] CFLAGS                   -g -O2
[p4est] LDFLAGS                  
[p4est] LIBS                     -lz 
[t8] This is t8 2.0.0.396-758c
[t8] CPP                      
[t8] CPPFLAGS                 
[t8] CC                       mpicc
[t8] CFLAGS                   -g -O2
[t8] LDFLAGS                  
[t8] LIBS                     -lz  -lstdc++
[p4est 0] Failed to open bunny.node
[p4est 0] Failed to read nodes for bunny
[libsc 0] Abort: Failed to read tetgen bunny
[libsc 0] Abort: <unknown>:0
[libsc 0] Abort: Obtained 7 stack frames
[libsc 0] Stack 0: 0   libsc.2.dylib                       0x0000000102766278 sc_abort_handler + 96
[libsc 0] Stack 1: 1   libsc.2.dylib                       0x0000000102766384 sc_abort + 20
[libsc 0] Stack 2: 2   libsc.2.dylib                       0x0000000102765d88 sc_int_compare + 0
[libsc 0] Stack 3: 3   libsc.2.dylib                       0x00000001027663dc sc_abort_collective + 0
[libsc 0] Stack 4: 4   libsc.2.dylib                       0x00000001027671a4 SC_GEN_LOGF + 0
[libsc 0] Stack 5: 5   t8_bunny                            0x00000001022cf81c main + 256
[libsc 0] Stack 6: 6   dyld                                0x0000000189265058 start + 2224
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
  Proc: [[46390,0],0]
  Errorcode: 1

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

The issue is that bunny.ply is not the file the benchmark expects; it appears to need a bunny.node file. How can I either generate that file from the Stanford bunny data or get it from elsewhere?

@holke holke self-assigned this Aug 5, 2024
@holke holke added examples Edits in our examples discussion labels Aug 5, 2024
@holke
Collaborator

holke commented Aug 7, 2024

Hi @DamynChipman
sorry for the late reply, it is holiday season and most of us are currently away.

To be honest, the bunny example is 8+ years old and relies on tetgen which we no longer support. So after seeing your issue we decided to remove it from the code base.

I understand that you want to have some benchmark program that you can run in order to see some scaling results for our core functionality?
If yes, then I suggest the program ./t8_time_forest_partition from the benchmarks folder.
We use it in several papers, and it is described particularly in https://arxiv.org/pdf/1910.10641 Section 5.2.
In this example we build a mesh geometry, uniformly refine it to a given level l, and then adaptively refine a band of elements. We move this band through the mesh over several time steps.
This example measures Adapt, Partition, Ghost, and Balance, i.e. all of our critical core algorithms.

I recommend using a call like:
mpirun -np N ./t8_time_forest_partition -g -b -C 0.8 -x -0.4 -X -0.3 -l4 -r3 -O -o -T0.05

This will build a 1 million element mesh. You can increase the -l value if that's too small and runs too quickly (on my machine it's ca. 5 seconds on 1 MPI rank).

Here is an overview of the options:

| Option | Description | Recommendation |
|--------|-------------|----------------|
| -g | Build Ghost layer | don't change |
| -b | 2:1 balance the mesh | don't change |
| -C | CFL number/how fast the mesh moves | don't change |
| -x | Where the band of fine elements starts | decrease for more fine elements |
| -X | Where the band of fine elements stops | increase for more fine elements |
| -l | Uniform refinement level | every step multiplies number of elements by 8 |
| -r | How many refinement levels from the uniform level | change at will |
| -O | Use cylindrical geometry | don't change |
| -o | Do not produce VTK output | Keep it while measuring runtime. For debugging/checking the mesh, leave it out. |
| -T | Simulation end time | Divide by two for each additional uniform level. |
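
The recommendations for -l and -T can be turned into a small calculation. The following is only a sketch: it assumes a coarse mesh of 4 trees (the value the benchmark log in this thread reports for the -L geometry; it may differ for other geometries) and takes -l 4 with -T 0.05 from the suggested call as the baseline. The function names are made up for illustration and are not part of t8code.

```python
def uniform_element_count(num_trees: int, level: int) -> int:
    """Elements after uniform refinement, assuming each level multiplies the count by 8."""
    return num_trees * 8 ** level


def scaled_end_time(level: int, base_level: int = 4, base_end_time: float = 0.05) -> float:
    """Halve the end time (-T) for each uniform level above the baseline."""
    return base_end_time / 2 ** (level - base_level)


if __name__ == "__main__":
    # Assumption: 4 trees, as reported by "Committed cmesh with 4 global trees."
    for level in range(4, 8):
        print(f"-l {level}: {uniform_element_count(4, level)} uniform elements, "
              f"-T {scaled_end_time(level)}")
```

With 4 trees and -l 4 this gives 16384 uniform elements, which matches the "Into t8_forest_adapt from 16384 total elements" line in the log. The adaptive band (-r) adds further elements on top of that.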

Here is a low dimensional example ("-l 2 -r 3") for the mesh that is created:

[Image: mesh created with -l 2 -r 3]

Does this help?

Please let me know if you have further questions.

@DamynChipman
Author

Yeah, this is super helpful, thanks! I will work through this and share results/issues along the way.

@DamynChipman
Author

Just to confirm, this example requires OpenCascade, correct? After compiling, I ran
mpirun -np N ./t8_time_forest_partition -g -b -C 0.8 -x -0.4 -X -0.3 -l4 -r3 -O -o -T0.05
but it stopped and said that example requires OpenCascade. I have been trying to install OpenCascade on my cluster but keep running into system issues not related to t8code.

@jmark
Collaborator

jmark commented Aug 29, 2024

I suggest using -L instead of -O, even though it is "not recommended" in the table above. This uses a cylinder with linear geometry (linear elements) and does not require t8code to be linked against OpenCascade.
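
For a sweep over several rank counts, the command line with -L swapped in for -O could be generated along these lines. This is a sketch only: the helper name `benchmark_command` and its defaults are illustrative, and the flag values are simply the ones quoted earlier in this thread.

```python
import shlex


def benchmark_command(num_ranks: int, level: int = 4, refine: int = 3,
                      end_time: float = 0.05, linear_geometry: bool = True) -> str:
    """Build the mpirun call; -L (linear cylinder) replaces -O (needs OpenCascade)."""
    geometry_flag = "-L" if linear_geometry else "-O"
    args = [
        "mpirun", "-np", str(num_ranks), "./t8_time_forest_partition",
        "-g", "-b", "-C", "0.8", "-x", "-0.4", "-X", "-0.3",
        "-l", str(level), "-r", str(refine), geometry_flag, "-o", "-T", str(end_time),
    ]
    return shlex.join(args)


if __name__ == "__main__":
    # Print one command per rank count; nothing is executed here.
    for n in (1, 2, 4, 8):
        print(benchmark_command(n))
```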

@DamynChipman
Author

I want to double check that it is running properly locally before moving to a cluster. The benchmark runs to completion, and I can adjust the parameters to increase/decrease the number of elements and view the mesh. However, I don't know if the results are being reported properly. I have used sc and p4est in the past, including the timing/stats functions. It appears that the timings for adapt, ghost, partition, and balance are not being accumulated.

I am working on the August 1st commit (62128c7) to avoid a compilation issue reported here (#1240). There are no functional differences between main and this commit for benchmarks/time_forest_partition.cxx.

Here's the output of the following:

>>> mpirun -np 8 ./t8_time_forest_partition -g -b -C 0.8 -x -0.4 -X -0.3 -l 4 -r 4 -L -o -T 0.025
[libsc] This is libsc 2.8.5.999
[p4est] This is p4est 2.8.6.999
[t8] This is t8 2.0.0
[t8] CXX                      /opt/homebrew/bin/mpicxx
[t8] CXXFLAGS                  -O3 -DNDEBUG
[t8] CC                       /opt/homebrew/bin/mpicc
[t8] CFLAGS                    -O3 -DNDEBUG
[t8] LDFLAGS                  
[t8] LIBS                     P4EST::P4EST SC::SC MPI::MPI_C
[t8] Using delta_t = 0.032000
[t8] Committed cmesh with 4 global trees.
[t8] Start adadpt 0.002101 -0.002101
[t8] Into t8_forest_adapt from 16384 total elements
[t8] Done t8_forest_adapt with 4558320 total elements
[t8] End adadpt 0.199893 0.197792
[t8] Enter  forest partition.
[t8] Start partition 0.199979 0.199979
[t8] End partition 0.208866 0.008887
[t8] Done forest partition.
[t8] Into t8_forest_balance with 4558320 global elements.
[t8] Computed maximum occurring level:	8
[t8] Into t8_forest_ghost with 569790 local elements.
[t8] Start ghost at 0.226103  -0.226103
[t8] End ghost at 0.292601  0.066498
[t8] Done t8_forest_ghost with 569790 local elements and 15485 ghost elements.
[t8] Profiling: 1
[t8] Start adadpt 0.292621 -0.292621
[t8] Into t8_forest_adapt from 4558320 total elements
[t8] Done t8_forest_adapt with 4573216 total elements
[t8] End adadpt 0.354983 0.062362
[t8] Enter  forest partition.
[t8] Start partition 0.355051 0.355051
[t8] End partition 0.361946 0.006895
[t8] Done forest partition.
[t8] Into t8_forest_ghost with 571652 local elements.
[t8] Start ghost at 0.364684  -0.364684
[t8] End ghost at 0.416118  0.051433
[t8] Done t8_forest_ghost with 571652 local elements and 15544 ghost elements.
[t8] Profiling: 1
[t8] Start adadpt 0.416141 -0.416141
[t8] Into t8_forest_adapt from 4573216 total elements
[t8] Done t8_forest_adapt with 4601440 total elements
[t8] End adadpt 0.498088 0.081947
[t8] Enter  forest partition.
[t8] Start partition 0.498677 0.498676
[t8] End partition 0.502745 0.004069
[t8] Done forest partition.
[t8] Into t8_forest_ghost with 575180 local elements.
[t8] Start ghost at 0.505659  -0.505659
[t8] End ghost at 0.557156  0.051497
[t8] Done t8_forest_ghost with 575180 local elements and 15873 ghost elements.
[t8] Profiling: 1
[t8] Start adadpt 0.557175 -0.557175
[t8] Into t8_forest_adapt from 4601440 total elements
[t8] Done t8_forest_adapt with 4648256 total elements
[t8] End adadpt 0.675904 0.118729
[t8] Enter  forest partition.
[t8] Start partition 0.676512 0.676512
[t8] End partition 0.686310 0.009798
[t8] Done forest partition.
[t8] Into t8_forest_ghost with 581032 local elements.
[t8] Start ghost at 0.688951  -0.688951
[t8] End ghost at 0.744801  0.055850
[t8] Done t8_forest_ghost with 581032 local elements and 16422 ghost elements.
[t8] Profiling: 1
[t8] Start adadpt 0.744819 -0.744819
[t8] Into t8_forest_adapt from 4648256 total elements
[t8] Done t8_forest_adapt with 4649376 total elements
[t8] End adadpt 0.863747 0.118927
[t8] Enter  forest partition.
[t8] Start partition 0.864383 0.864383
[t8] End partition 0.869593 0.005210
[t8] Done forest partition.
[t8] Into t8_forest_ghost with 581172 local elements.
[t8] Start ghost at 0.871266  -0.871266
[t8] End ghost at 0.923508  0.052242
[t8] Done t8_forest_ghost with 581172 local elements and 16446 ghost elements.
[t8] Profiling: 1
[t8] Start adadpt 0.923524 -0.923524
[t8] Into t8_forest_adapt from 4649376 total elements
[t8] Done t8_forest_adapt with 4649376 total elements
[t8] End adadpt 1.051267 0.127743
[t8] Done t8_forest_balance with 4649376 global elements.
[t8] Statistics for   forest balance: Adapt time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0646732 (0.003 = 4.64%)
[t8]    Minimum attained at rank       3: 0.062135
[t8]    Maximum attained at rank       1: 0.068688
[t8] Statistics for   forest balance: Adapt time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0821344 (0.000428 = 0.521%)
[t8]    Minimum attained at rank       3: 0.081806
[t8]    Maximum attained at rank       6: 0.083232
[t8] Statistics for   forest balance: Adapt time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.118677 (0.00042 = 0.354%)
[t8]    Minimum attained at rank       2: 0.118279
[t8]    Maximum attained at rank       6: 0.119743
[t8] Statistics for   forest balance: Adapt time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.119126 (0.00111 = 0.931%)
[t8]    Minimum attained at rank       2: 0.118523
[t8]    Maximum attained at rank       6: 0.122047
[t8] Statistics for   forest balance: Adapt time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.127708 (0.000561 = 0.439%)
[t8]    Minimum attained at rank       2: 0.127265
[t8]    Maximum attained at rank       6: 0.129152
[t8] Statistics for   forest balance: Total adapt time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.512319 (0.00478 = 0.933%)
[t8]    Minimum attained at rank       2: 0.508419
[t8]    Maximum attained at rank       6: 0.522614
[t8] Summary = [ 0.0646732 0.0821344 0.118677 0.119126 0.127708 0.512319 ];
[t8] Maximum = [ 0.068688 0.083232 0.119743 0.122047 0.129152 0.522614 ];
[t8] Statistics for   forest balance: Ghost time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0512449 (0.000435 = 0.848%)
[t8]    Minimum attained at rank       6: 0.050128
[t8]    Maximum attained at rank       3: 0.051583
[t8] Statistics for   forest balance: Ghost time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0515504 (0.000421 = 0.816%)
[t8]    Minimum attained at rank       6: 0.050481
[t8]    Maximum attained at rank       2: 0.051946
[t8] Statistics for   forest balance: Ghost time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0556621 (0.00111 = 2%)
[t8]    Minimum attained at rank       6: 0.052731
[t8]    Maximum attained at rank       2: 0.056257
[t8] Statistics for   forest balance: Ghost time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0522777 (0.000561 = 1.07%)
[t8]    Minimum attained at rank       6: 0.050832
[t8]    Maximum attained at rank       2: 0.05272
[t8] Statistics for   forest balance: Total ghost time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.210735 (0.00251 = 1.19%)
[t8]    Minimum attained at rank       6: 0.204172
[t8]    Maximum attained at rank       2: 0.212393
[t8] Summary = [ 0.0512449 0.0515504 0.0556621 0.0522777 0.210735 ];
[t8] Maximum = [ 0.051583 0.051946 0.056257 0.05272 0.212393 ];
[t8] Statistics for   forest balance: Partition time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.00686963 (0.00138 = 20%)
[t8]    Minimum attained at rank       3: 0.005146
[t8]    Maximum attained at rank       1: 0.009283
[t8] Statistics for   forest balance: Partition time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.00521925 (0.000772 = 14.8%)
[t8]    Minimum attained at rank       0: 0.004069
[t8]    Maximum attained at rank       2: 0.00628
[t8] Statistics for   forest balance: Partition time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.00729638 (0.00281 = 38.5%)
[t8]    Minimum attained at rank       7: 0.004249
[t8]    Maximum attained at rank       2: 0.011448
[t8] Statistics for   forest balance: Partition time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.00519713 (0.000547 = 10.5%)
[t8]    Minimum attained at rank       5: 0.004715
[t8]    Maximum attained at rank       2: 0.006559
[t8] Statistics for   forest balance: Total partition time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0245824 (0.00407 = 16.6%)
[t8]    Minimum attained at rank       5: 0.0211
[t8]    Maximum attained at rank       1: 0.031333
[t8] Summary = [ 0.00686963 0.00521925 0.00729638 0.00519713 0.0245824 ];
[t8] Maximum = [ 0.009283 0.00628 0.011448 0.006559 0.031333 ];
[t8] Into t8_forest_ghost with 581172 local elements.
[t8] Start ghost at 1.055855  -1.055855
[t8] End ghost at 1.108176  0.052320
[t8] Done t8_forest_ghost with 581172 local elements and 16446 ghost elements.
[t8] Printing stats for cmesh.
[t8] Statistics for   cmesh: Number of trees sent.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   cmesh: Number of ghosts sent.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   cmesh: Number of trees received.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   cmesh: Number of ghosts received.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   cmesh: Number of bytes sent.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   cmesh: Number of processes sent to.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   cmesh: First tree is shared.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           -8 (0 = 0%)
[t8]    Minimum attained at rank       0: -8
[t8]    Maximum attained at rank       0: -8
[t8] Statistics for   cmesh: Partition runtime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   cmesh: Commit runtime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           3.1e-05 (2.47e-05 = 79.7%)
[t8]    Minimum attained at rank       6: 4e-06
[t8]    Maximum attained at rank       0: 8e-05
[t8] Statistics for   cmesh: Number of geometry evaluations.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           651125 (7.44e+05 = 114%)
[t8]    Minimum attained at rank       0: 2304
[t8]    Maximum attained at rank       4: 1.81606e+06
[t8] Statistics for   cmesh: Accumulated geometry evaluation runtime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0189933 (0.0216 = 114%)
[t8]    Minimum attained at rank       6: 8e-05
[t8]    Maximum attained at rank       5: 0.052864
[t8] Summary = [ 0 0 0 0 0 0 -8 0 3.1e-05 651125 0.0189933 ];
[t8] Maximum = [ 0 0 0 0 0 0 -8 0 8e-05 1.81606e+06 0.052864 ];
[t8] Printing stats for forest.
[t8] Statistics for   forest: Number of elements sent.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           441347 (4.62e+05 = 105%)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       5: 1.13548e+06
[t8] Statistics for   forest: Number of elements received.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           441347 (2.23e+05 = 50.6%)
[t8]    Minimum attained at rank       4: 0
[t8]    Maximum attained at rank       1: 569790
[t8] Statistics for   forest: Number of bytes sent.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           1.05924e+07 (1.11e+07 = 105%)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       5: 2.72517e+07
[t8] Statistics for   forest: Number of processes sent to.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           1.375 (0.992 = 72.2%)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       4: 3
[t8] Statistics for   forest: Number of ghost elements sent.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           33532.8 (6.59e+03 = 19.6%)
[t8]    Minimum attained at rank       0: 18758
[t8]    Maximum attained at rank       6: 39696
[t8] Statistics for   forest: Number of ghost elements received.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           33532.8 (7.09e+03 = 21.1%)
[t8]    Minimum attained at rank       0: 16446
[t8]    Maximum attained at rank       4: 39682
[t8] Statistics for   forest: Number of processes we sent ghosts to/received from.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   forest: Adapt runtime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   forest: Partition runtime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0146836 (0.00525 = 35.8%)
[t8]    Minimum attained at rank       1: 0.008687
[t8]    Maximum attained at rank       3: 0.023082
[t8] Statistics for   forest: Commit runtime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.855875 (1.79e-06 = 0.000209%)
[t8]    Minimum attained at rank       3: 0.855872
[t8]    Maximum attained at rank       4: 0.855877
[t8] Statistics for   forest: Ghost runtime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.0523034 (0.00076 = 1.45%)
[t8]    Minimum attained at rank       6: 0.050317
[t8]    Maximum attained at rank       2: 0.052796
[t8] Statistics for   forest: Ghost waittime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   forest: Balance runtime.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.831332 (0.000132 = 0.0159%)
[t8]    Minimum attained at rank       1: 0.831124
[t8]    Maximum attained at rank       2: 0.831623
[t8] Statistics for   forest: Balance rounds.
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           5 (0 = 0%)
[t8]    Minimum attained at rank       0: 5
[t8]    Maximum attained at rank       0: 5
[t8] Summary = [ 441347 441347 1.05924e+07 1.375 33532.8 33532.8 0 0 0.0146836 0.855875 0.0523034 0 0.831332 5 ];
[t8] Maximum = [ 1.13548e+06 569790 2.72517e+07 3 39696 39682 0 0 0.023082 0.855877 0.052796 0 0.831623 5 ];
[t8] Statistics for   new
[t8]    Global number of values:      16
[t8]    Mean value (std. dev.):           0.00203125 (3.23e-05 = 1.59%)
[t8]    Minimum attained at rank       0: 0.002
[t8]    Maximum attained at rank       5: 0.002106
[t8] Statistics for   adapt
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   ghost
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   partition
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   balance
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0 (0)
[t8]    Minimum attained at rank       0: 0
[t8]    Maximum attained at rank       0: 0
[t8] Statistics for   total
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           1.10902 (0.000273 = 0.0246%)
[t8]    Minimum attained at rank       1: 1.10871
[t8]    Maximum attained at rank       5: 1.10947
[t8] Summary = [ 0.00203125 0 0 0 0 1.10902 ];
[t8] Maximum = [ 0.002106 0 0 0 0 1.10947 ];

Am I right in seeing that the stats are not being accumulated and reported? Or are they being reported in the forest balance: <Adapt, Ghost, Partition> time sections? What needs to change in order to accumulate/report the timing for these stages?

@DamynChipman
Author

Just confirming that PR #1242 fixed the compilation issue. I have rerun the same commands as above on main, with the same results. I want to make sure I get the right timing results for this benchmark to correctly represent and reproduce the scalability of t8code.

@DamynChipman
Author

Small Scale Results

Device: 2021 MacBook Pro (CPU: M1 Pro, RAM: 32GB)
Command:

mpirun -np $n ./t8_time_forest_partition -g -b -C 0.8 -x -0.4 -X -0.3 -l 4 -r 5 -L -o -T 0.025
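
| Section | n = 1 | n = 2 | n = 4 | n = 8 |
|---------|-------|-------|-------|-------|
| Adapt | 12.4669 | 7.41635 | 4.03027 | 3.37253 |
| Ghost | 0 | 2.26506 | 1.77662 | 1.53293 |
| Partition | 1.40002 | 0.753114 | 0.370952 | 0.283714 |
| Total | 18.8554 | 14.8982 | 10.3473 | 7.95105 |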
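
Since the problem parameters are the same for every rank count, these numbers can be read as a strong-scaling test. A minimal sketch of the usual speedup and parallel-efficiency calculation on the Total row follows; the function name is illustrative and the values are copied from the table above.

```python
def strong_scaling(ranks, times):
    """Return (ranks, speedup, efficiency) relative to the first entry."""
    rows = []
    for n, t in zip(ranks, times):
        speedup = times[0] / t
        efficiency = speedup / (n / ranks[0])
        rows.append((n, speedup, efficiency))
    return rows


if __name__ == "__main__":
    ranks = [1, 2, 4, 8]
    total_time = [18.8554, 14.8982, 10.3473, 7.95105]  # "Total" row above
    for n, speedup, efficiency in strong_scaling(ranks, total_time):
        print(f"n={n}: speedup {speedup:.2f}, efficiency {efficiency:.2%}")
```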

Comments

Running this on my laptop to confirm input parameters prior to submitting larger batch job on cluster.

Note that I got the timing results reported above from the [t8] Statistics for forest balance: Total <adapt, ghost, partition> time sections as indicated in the comments above. If this is not the proper time to be reporting, please let me know ASAP.
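
Collecting these values by hand gets tedious across many runs, so here is a sketch of how the mean values could be pulled out of a saved log. It assumes the log format shown in the output above ("Statistics for ..." followed by a "Mean value (std. dev.):" line) and says nothing about whether these sections are the right ones to report.

```python
import re

HEADER = re.compile(r"^\[t8\] Statistics for\s+(.*?)\s*$")
MEAN = re.compile(r"^\[t8\]\s+Mean value \(std\. dev\.\):\s+(\S+)")


def parse_mean_values(log_text: str) -> dict:
    """Map each statistics name to the list of mean values found for it."""
    results = {}
    current = None
    for line in log_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        mean = MEAN.match(line)
        if mean and current is not None:
            # Names such as "forest balance: Adapt time" repeat, so keep a list.
            results.setdefault(current, []).append(float(mean.group(1)))
            current = None
    return results


if __name__ == "__main__":
    sample = """\
[t8] Statistics for   forest balance: Total adapt time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.512319 (0.00478 = 0.933%)
[t8] Statistics for   forest balance: Total ghost time
[t8]    Global number of values:       8
[t8]    Mean value (std. dev.):           0.210735 (0.00251 = 1.19%)
"""
    for name, values in parse_mean_values(sample).items():
        print(name, values)
```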

@DamynChipman
Author

Should I be running into memory issues with this benchmark? This runs just fine up to 32 MPI ranks but beyond that the benchmark terminates with the following:

>>> cat output-n64.txt
[libsc] This is libsc 2.8.5.999
[p4est] This is p4est 2.8.6.999
[t8] This is t8 2.0.0
[t8] CXX                      /opt/cray/pe/mpich/8.1.28/ofi/nvidia/23.3/bin/mpicxx
[t8] CXXFLAGS                  -fast -O3 -DNDEBUG
[t8] CC                       /opt/cray/pe/mpich/8.1.28/ofi/nvidia/23.3/bin/mpicc
[t8] CFLAGS                    -fast -O3 -DNDEBUG
[t8] LDFLAGS                  
[t8] LIBS                     P4EST::P4EST SC::SC MPI::MPI_C
[t8] Using delta_t = 0.032000
[t8] Committed cmesh with 4 global trees.
[t8] Start adadpt 1282.875929 -1282.875929
[t8] Into t8_forest_adapt from 16384 total elements
[t8] Done t8_forest_adapt with 36264176 total elements
[t8] End adadpt 1283.927255 1.051326
[t8] Enter  forest partition.
[t8] Start partition 1283.932301 1283.932301
[libsc 8] Caught signal SEGV
[libsc 12] Caught signal SEGV
[libsc 20] Caught signal SEGV
[libsc 9] Caught signal SEGV
[libsc 22] Caught signal SEGV
[libsc 13] Caught signal SEGV
[libsc 23] Caught signal SEGV
[libsc 40] Caught signal SEGV
[libsc 31] Abort: Returned NULL from malloc
[libsc 31] Abort: /home/dchipman1/packages/t8code/sc/src/sc.c:398
[libsc 50] Caught signal SEGV
[libsc 43] Caught signal SEGV
[libsc 52] Caught signal SEGV
[libsc 49] Caught signal SEGV
[libsc 54] Caught signal SEGV
[libsc 53] Caught signal SEGV
[libsc 55] Caught signal SEGV
[libsc 8] Abort: Obtained 11 stack frames
[libsc 8] Stack 0: libsc.so.2.0.0(+0xda15) [0x154d9a2a1a15]
[libsc 8] Stack 1: libsc.so.2.0.0(+0xb69c) [0x154d9a29f69c]
[libsc 8] Stack 2: libc.so.6(+0x4adc0) [0x154d95a53dc0]
[libsc 8] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x154d9a48ec83]
[libsc 8] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x154d9a491863]
[libsc 8] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x154d9a491129]
[libsc 8] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x154d9a49e5fb]
[libsc 8] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x154d9a49e624]
[libsc 8] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 8] Stack 9: libc.so.6(__libc_start_main+0xef) [0x154d95a3e24d]
[libsc 8] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 51] Caught signal SEGV
[libsc 12] Abort: Obtained 11 stack frames
[libsc 12] Stack 0: libsc.so.2.0.0(+0xda15) [0x1511a2570a15]
[libsc 12] Stack 1: libsc.so.2.0.0(+0xb69c) [0x1511a256e69c]
[libsc 12] Stack 2: libc.so.6(+0x4adc0) [0x15119de53dc0]
[libsc 12] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x1511a275dc83]
[libsc 12] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x1511a2760863]
[libsc 12] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x1511a2760129]
[libsc 12] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x1511a276d5fb]
[libsc 12] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x1511a276d624]
[libsc 12] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 12] Stack 9: libc.so.6(__libc_start_main+0xef) [0x15119de3e24d]
[libsc 12] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 20] Abort: Obtained 11 stack frames
[libsc 20] Stack 0: libsc.so.2.0.0(+0xda15) [0x1550de5f3a15]
[libsc 20] Stack 1: libsc.so.2.0.0(+0xb69c) [0x1550de5f169c]
[libsc 20] Stack 2: libc.so.6(+0x4adc0) [0x1550d9e53dc0]
[libsc 20] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x1550de7e0c83]
[libsc 20] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x1550de7e3863]
[libsc 20] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x1550de7e3129]
[libsc 20] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x1550de7f05fb]
[libsc 20] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x1550de7f0624]
[libsc 20] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 20] Stack 9: libc.so.6(__libc_start_main+0xef) [0x1550d9e3e24d]
[libsc 20] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 22] Abort: Obtained 11 stack frames
[libsc 22] Stack 0: libsc.so.2.0.0(+0xda15) [0x14e968acfa15]
[libsc 22] Stack 1: libsc.so.2.0.0(+0xb69c) [0x14e968acd69c]
[libsc 22] Stack 2: libc.so.6(+0x4adc0) [0x14e964253dc0]
[libsc 22] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x14e968cbcc83]
[libsc 22] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x14e968cbf863]
[libsc 22] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x14e968cbf129]
[libsc 22] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x14e968ccc5fb]
[libsc 22] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x14e968ccc624]
[libsc 22] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 22] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14e96423e24d]
[libsc 22] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 40] Abort: Obtained 11 stack frames
[libsc 40] Stack 0: libsc.so.2.0.0(+0xda15) [0x147214034a15]
[libsc 40] Stack 1: libsc.so.2.0.0(+0xb69c) [0x14721403269c]
[libsc 40] Stack 2: libc.so.6(+0x4adc0) [0x14720f853dc0]
[libsc 40] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x147214221c83]
[libsc 40] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x147214224863]
[libsc 40] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x147214224129]
[libsc 40] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x1472142315fb]
[libsc 40] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x147214231624]
[libsc 40] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 40] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14720f83e24d]
[libsc 40] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 50] Abort: Obtained 11 stack frames
[libsc 50] Stack 0: libsc.so.2.0.0(+0xda15) [0x14d7bdcd4a15]
[libsc 50] Stack 1: libsc.so.2.0.0(+0xb69c) [0x14d7bdcd269c]
[libsc 50] Stack 2: libc.so.6(+0x4adc0) [0x14d7b9453dc0]
[libsc 50] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x14d7bdec1c83]
[libsc 50] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x14d7bdec4863]
[libsc 50] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x14d7bdec4129]
[libsc 50] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x14d7bded15fb]
[libsc 50] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x14d7bded1624]
[libsc 50] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 50] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14d7b943e24d]
[libsc 50] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 52] Abort: Obtained 11 stack frames
[libsc 52] Stack 0: libsc.so.2.0.0(+0xda15) [0x14e11f0e7a15]
[libsc 52] Stack 1: libsc.so.2.0.0(+0xb69c) [0x14e11f0e569c]
[libsc 52] Stack 2: libc.so.6(+0x4adc0) [0x14e11a853dc0]
[libsc 52] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x14e11f2d4c83]
[libsc 52] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x14e11f2d7863]
[libsc 52] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x14e11f2d7129]
[libsc 52] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x14e11f2e45fb]
[libsc 52] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x14e11f2e4624]
[libsc 52] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 52] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14e11a83e24d]
[libsc 52] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 54] Abort: Obtained 11 stack frames
[libsc 54] Stack 0: libsc.so.2.0.0(+0xda15) [0x15090ce9da15]
[libsc 54] Stack 1: libsc.so.2.0.0(+0xb69c) [0x15090ce9b69c]
[libsc 54] Stack 2: libc.so.6(+0x4adc0) [0x150908653dc0]
[libsc 54] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x15090d08ac83]
[libsc 54] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x15090d08d863]
[libsc 54] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x15090d08d129]
[libsc 54] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x15090d09a5fb]
[libsc 54] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x15090d09a624]
[libsc 54] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 54] Stack 9: libc.so.6(__libc_start_main+0xef) [0x15090863e24d]
[libsc 54] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 13] Abort: Obtained 11 stack frames
[libsc 13] Stack 0: libsc.so.2.0.0(+0xda15) [0x14699e0dea15]
[libsc 13] Stack 1: libsc.so.2.0.0(+0xb69c) [0x14699e0dc69c]
[libsc 13] Stack 2: libc.so.6(+0x4adc0) [0x146999853dc0]
[libsc 13] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x14699e2cbc83]
[libsc 13] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x14699e2ce863]
[libsc 13] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x14699e2ce129]
[libsc 13] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x14699e2db5fb]
[libsc 13] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x14699e2db624]
[libsc 13] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 13] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14699983e24d]
[libsc 13] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 49] Abort: Obtained 11 stack frames
[libsc 49] Stack 0: libsc.so.2.0.0(+0xda15) [0x1456674e7a15]
[libsc 49] Stack 1: libsc.so.2.0.0(+0xb69c) [0x1456674e569c]
[libsc 49] Stack 2: libc.so.6(+0x4adc0) [0x145662c53dc0]
[libsc 49] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x1456676d4c83]
[libsc 49] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x1456676d7863]
[libsc 49] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x1456676d7129]
[libsc 49] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x1456676e45fb]
[libsc 49] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x1456676e4624]
[libsc 49] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 49] Stack 9: libc.so.6(__libc_start_main+0xef) [0x145662c3e24d]
[libsc 49] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 51] Abort: Obtained 11 stack frames
[libsc 51] Stack 0: libsc.so.2.0.0(+0xda15) [0x1477798b8a15]
[libsc 51] Stack 1: libsc.so.2.0.0(+0xb69c) [0x1477798b669c]
[libsc 51] Stack 2: libc.so.6(+0x4adc0) [0x147775053dc0]
[libsc 51] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x147779aa5c83]
[libsc 51] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x147779aa8863]
[libsc 51] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x147779aa8129]
[libsc 51] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x147779ab55fb]
[libsc 51] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x147779ab5624]
[libsc 51] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 51] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14777503e24d]
[libsc 51] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 9] Abort: Obtained 11 stack frames
[libsc 9] Stack 0: libsc.so.2.0.0(+0xda15) [0x14fdae2e9a15]
[libsc 9] Stack 1: libsc.so.2.0.0(+0xb69c) [0x14fdae2e769c]
[libsc 9] Stack 2: libc.so.6(+0x4adc0) [0x14fda9a53dc0]
[libsc 9] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x14fdae4d6c83]
[libsc 9] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x14fdae4d9863]
[libsc 9] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x14fdae4d9129]
[libsc 9] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x14fdae4e65fb]
[libsc 9] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x14fdae4e6624]
[libsc 9] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 9] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14fda9a3e24d]
[libsc 9] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 23] Abort: Obtained 11 stack frames
[libsc 23] Stack 0: libsc.so.2.0.0(+0xda15) [0x1468a180ba15]
[libsc 23] Stack 1: libsc.so.2.0.0(+0xb69c) [0x1468a180969c]
[libsc 23] Stack 2: libc.so.6(+0x4adc0) [0x14689d053dc0]
[libsc 23] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x1468a19f8c83]
[libsc 23] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x1468a19fb863]
[libsc 23] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x1468a19fb129]
[libsc 23] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x1468a1a085fb]
[libsc 23] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x1468a1a08624]
[libsc 23] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 23] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14689d03e24d]
[libsc 23] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 31] Abort: Obtained 9 stack frames
[libsc 31] Stack 0: libsc.so.2.0.0(+0xda15) [0x1455fdf41a15]
[libsc 31] Stack 1: libsc.so.2.0.0(sc_calloc+0x180) [0x1455fdf3fbc0]
[libsc 31] Stack 2: libt8.so.2.0.0-982-gce8365c89(+0x625c0) [0x1455fe1315c0]
[libsc 31] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x1455fe131129]
[libsc 31] Stack 4: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x1455fe13e5fb]
[libsc 31] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x1455fe13e624]
[libsc 31] Stack 6: t8_time_forest_partition() [0x402a86]
[libsc 31] Stack 7: libc.so.6(__libc_start_main+0xef) [0x1455f983e24d]
[libsc 31] Stack 8: t8_time_forest_partition() [0x401dba]
[libsc 43] Abort: Obtained 11 stack frames
[libsc 43] Stack 0: libsc.so.2.0.0(+0xda15) [0x1495bf257a15]
[libsc 43] Stack 1: libsc.so.2.0.0(+0xb69c) [0x1495bf25569c]
[libsc 43] Stack 2: libc.so.6(+0x4adc0) [0x1495baa53dc0]
[libsc 43] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x1495bf444c83]
[libsc 43] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x1495bf447863]
[libsc 43] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x1495bf447129]
[libsc 43] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x1495bf4545fb]
[libsc 43] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x1495bf454624]
[libsc 43] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 43] Stack 9: libc.so.6(__libc_start_main+0xef) [0x1495baa3e24d]
[libsc 43] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 53] Abort: Obtained 11 stack frames
[libsc 53] Stack 0: libsc.so.2.0.0(+0xda15) [0x14f93c60ea15]
[libsc 53] Stack 1: libsc.so.2.0.0(+0xb69c) [0x14f93c60c69c]
[libsc 53] Stack 2: libc.so.6(+0x4adc0) [0x14f937e53dc0]
[libsc 53] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x14f93c7fbc83]
[libsc 53] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x14f93c7fe863]
[libsc 53] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x14f93c7fe129]
[libsc 53] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x14f93c80b5fb]
[libsc 53] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x14f93c80b624]
[libsc 53] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 53] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14f937e3e24d]
[libsc 53] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 55] Abort: Obtained 11 stack frames
[libsc 55] Stack 0: libsc.so.2.0.0(+0xda15) [0x149470c74a15]
[libsc 55] Stack 1: libsc.so.2.0.0(+0xb69c) [0x149470c7269c]
[libsc 55] Stack 2: libc.so.6(+0x4adc0) [0x14946c453dc0]
[libsc 55] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x149470e61c83]
[libsc 55] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x149470e64863]
[libsc 55] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x149470e64129]
[libsc 55] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x149470e715fb]
[libsc 55] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x149470e71624]
[libsc 55] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 55] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14946c43e24d]
[libsc 55] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 18] Caught signal SEGV
[libsc 18] Abort: Obtained 11 stack frames
[libsc 18] Stack 0: libsc.so.2.0.0(+0xda15) [0x148e60e83a15]
[libsc 18] Stack 1: libsc.so.2.0.0(+0xb69c) [0x148e60e8169c]
[libsc 18] Stack 2: libc.so.6(+0x4adc0) [0x148e5c653dc0]
[libsc 18] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x148e61070c83]
[libsc 18] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x148e61073863]
[libsc 18] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x148e61073129]
[libsc 18] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x148e610805fb]
[libsc 18] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x148e61080624]
[libsc 18] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 18] Stack 9: libc.so.6(__libc_start_main+0xef) [0x148e5c63e24d]
[libsc 18] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 19] Caught signal SEGV
[libsc 19] Abort: Obtained 11 stack frames
[libsc 19] Stack 0: libsc.so.2.0.0(+0xda15) [0x146707ad3a15]
[libsc 19] Stack 1: libsc.so.2.0.0(+0xb69c) [0x146707ad169c]
[libsc 19] Stack 2: libc.so.6(+0x4adc0) [0x146703253dc0]
[libsc 19] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x146707cc0c83]
[libsc 19] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x146707cc3863]
[libsc 19] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x146707cc3129]
[libsc 19] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x146707cd05fb]
[libsc 19] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x146707cd0624]
[libsc 19] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 19] Stack 9: libc.so.6(__libc_start_main+0xef) [0x14670323e24d]
[libsc 19] Stack 10: t8_time_forest_partition() [0x401dba]
[libsc 39] Caught signal SEGV
[libsc 39] Abort: Obtained 11 stack frames
[libsc 39] Stack 0: libsc.so.2.0.0(+0xda15) [0x149ef404ea15]
[libsc 39] Stack 1: libsc.so.2.0.0(+0xb69c) [0x149ef404c69c]
[libsc 39] Stack 2: libc.so.6(+0x4adc0) [0x149eef853dc0]
[libsc 39] Stack 3: libt8.so.2.0.0-982-gce8365c89(t8_element_array_get_size+0x3) [0x149ef423bc83]
[libsc 39] Stack 4: libt8.so.2.0.0-982-gce8365c89(+0x62863) [0x149ef423e863]
[libsc 39] Stack 5: libt8.so.2.0.0-982-gce8365c89(t8_forest_partition+0x369) [0x149ef423e129]
[libsc 39] Stack 6: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0x9fb) [0x149ef424b5fb]
[libsc 39] Stack 7: libt8.so.2.0.0-982-gce8365c89(t8_forest_commit+0xa24) [0x149ef424b624]
[libsc 39] Stack 8: t8_time_forest_partition() [0x402a86]
[libsc 39] Stack 9: libc.so.6(__libc_start_main+0xef) [0x149eef83e24d]
[libsc 39] Stack 10: t8_time_forest_partition() [0x401dba]

@Davknapp
Collaborator

Hey @DamynChipman,
can you provide the input parameters for this run so that we can reproduce it?
In general, you shouldn't have any problems running this example on a cluster; we have tested it with far more than 32 processes. That was a couple of years ago, though, so we will double-check, especially with your input parameters.

One part of your output might indicate that you just ran out of memory:

[libsc 31] Abort: Returned NULL from malloc
[libsc 31] Abort: /home/dchipman1/packages/t8code/sc/src/sc.c:398

Maybe your parameter combination produced a mesh that was too large for the machine to handle?

@jmark
Collaborator

jmark commented Oct 10, 2024

@DamynChipman I tried to run this benchmark example on my machine with 64 ranks. It runs just fine.

mpirun -n 64 ~/install/t8code/main/bin/t8_time_forest_partition -g -b -C 0.8 -x -0.4 -X -0.3 -l 4 -r 5 -L -o -T 0.025

I use MPICH 4.0.2 and gcc 12.1.0 on Ubuntu 22.04.1 with Linux kernel 6.8.0-45-generic.

@DamynChipman
Author

Large (ish) Scale Results

Machine: Falcon (Dual Intel Xeon 18 core nodes)
Command:

mpirun -n $N ./t8_time_forest_partition -g -b -C 0.8 -x -0.2 -X 0.2 -l 4 -r 4 -L -o -T 0.025 >> output_n$N.txt

Results:

| MPI ranks | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 |
|---|---|---|---|---|---|---|---|---|---|---|
| Adapt [sec] | 22.198 | 15.3656 | 4.88433 | 3.51135 | 2.74351 | 1.75711 | 1.25502 | 0.832045 | 0.963431 | 2.83504 |
| Ghost [sec] | 0 | 4.92692 | 2.57878 | 2.4864 | 1.56907 | 0.939954 | 0.627435 | 0.385915 | 0.289718 | 0.207101 |
| Partition [sec] | 3.50545 | 1.78216 | 0.728757 | 0.433819 | 0.198721 | 0.152427 | 0.0674145 | 0.0380946 | 0.0197075 | 0.00966559 |
| Total [sec] | 45.7854 | 35.9366 | 12.8784 | 10.5972 | 6.90435 | 4.21385 | 3.01467 | 2.16236 | 3.01856 | 4.97634 |
| Total Elements | 34723392 | 34723392 | 34723392 | 34723392 | 34723392 | 34723392 | 34723392 | 34723392 | 34723392 | 34723392 |
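
Since the element count is the same in every run, these timings can be read as a strong-scaling test. A minimal sketch of how speedup and parallel efficiency could be derived from the Total row is given below. The function name `strong_scaling` and the choice of the 1-rank run as the baseline are assumptions made for illustration; they are not part of t8code or of the benchmark output.

```python
# Total runtimes [sec] copied from the "Total [sec]" row of the results above.
ranks = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
total_sec = [45.7854, 35.9366, 12.8784, 10.5972, 6.90435,
             4.21385, 3.01467, 2.16236, 3.01856, 4.97634]


def strong_scaling(ranks, times):
    """Return (ranks, speedup, efficiency) tuples relative to the first run.

    Hypothetical helper: speedup = T_base / T_n, and
    efficiency = speedup / (n / n_base).
    """
    base_ranks, base_time = ranks[0], times[0]
    rows = []
    for n, t in zip(ranks, times):
        speedup = base_time / t
        efficiency = speedup / (n / base_ranks)
        rows.append((n, speedup, efficiency))
    return rows


rows = strong_scaling(ranks, total_sec)
for n, speedup, efficiency in rows:
    print(f"{n:4d} ranks: speedup {speedup:6.2f}, efficiency {efficiency:6.1%}")

# Rank count with the smallest total runtime in this data set.
best_ranks = min(zip(total_sec, ranks))[1]
print(f"fastest total time at {best_ranks} ranks")
```

On this data the total time is lowest at 128 ranks and rises again at 256 and 512, which matches the increase visible in the Adapt row.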

Comments

Turns out the issue was just the machine I was running on (it's a rather picky cluster, a decommissioned national lab computer that is currently being worked on). I was able to run up to 512 MPI ranks; runs beyond that failed due to the machine configuration. For the purposes of reproducing this benchmark, I am satisfied and impressed with the speed and memory footprint for over 30M elements!

Thank you for your guidance on reproducing this result!

@holke
Collaborator

holke commented Oct 24, 2024

Thank you for the update and the praise ;)
We are happy that we could resolve your questions.
